Introduction Indian Asians have a twofold higher risk of cardiovascular disease than Europeans, and this excess risk is not explained by conventional cardiovascular risk factors or known genetic variants. The genetic architecture of Indian Asians has not previously been described. We hypothesised that whole genome sequencing of Indian Asians may identify both common and rare variants specific to this population that contribute to their increased cardiovascular disease risk.
Methods We carried out whole genome sequencing (mean depth 28.4×) in eight men of Indian Asian origin participating in the London Life Sciences Population (LOLIPOP) study. Sequencing was carried out using paired-end and mate-pair libraries on an Illumina GA2 machine. Reads were aligned with BWA, and variants were called using GATK and SAMtools. Sensitivity for single nucleotide polymorphism (SNP) detection was assessed by comparison to whole genome data.
Results We identified 6 602 840 autosomal variants, 436 823 of which are novel SNPs. Of these, 50 585 appear to be common (present at least twice among the eight genomes, corresponding to a minor allele frequency >10%). We found 21 659 autosomal SNPs that were expected to affect protein-coding sequence, of which 2174 are novel. Among the coding SNPs identified, 145 are in genes linked to human diseases, such as obesity (FTO, UCP1), diabetes mellitus (CDKAL1, GCGR, HNF1B), lipid metabolism (APOB), hypertension (NOS2), and renal disease (NPHP4, PKD1). We also found 65 613 novel autosomal indels, of which 35 097 are present at least twice, and 2301 novel deletions >100 bp. We show that >50% of the novel genetic variants are not in high linkage disequilibrium (LD; r² ≥ 0.8) with tag SNPs and hence are not captured on available high-density microarrays.
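The arithmetic behind two of the figures above can be checked with a minimal sketch. It assumes that the eight sequenced men contribute 16 autosomal chromosomes in total and that the three novel variant classes (SNPs, indels, deletions >100 bp) do not overlap; the function and variable names are illustrative only and are not part of the study's pipeline.

```python
from fractions import Fraction

N_INDIVIDUALS = 8   # eight men sequenced (from the Methods)
PLOIDY = 2          # autosomes are diploid, so 16 chromosomes in total


def min_allele_frequency(min_copies, n_individuals=N_INDIVIDUALS, ploidy=PLOIDY):
    """Lowest sample allele frequency for an allele seen at least `min_copies` times."""
    return Fraction(min_copies, n_individuals * ploidy)


# "Present at least twice" means at least 2 of 16 chromosomes carry the allele.
common_threshold = min_allele_frequency(2)
print(f"minimum frequency when seen twice: {float(common_threshold):.3f}")

# Novel variant counts as reported in the Results (assumed non-overlapping).
novel_counts = {
    "novel_snps": 436_823,
    "novel_indels": 65_613,
    "novel_deletions_gt_100bp": 2_301,
}
total_novel = sum(novel_counts.values())
print(f"total novel variants: {total_novel}")
```

Under these assumptions, an allele seen twice has a sample frequency of 2/16 = 12.5%, which is consistent with the stated >10% threshold, and the three novel classes sum to just over 500 000 variants.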
Conclusions We identify more than 500 000 genetic variants that have not previously been reported in 1000 Genomes or dbSNP and are likely to be Indian Asian specific. The novel variants identified here are strong candidates for genetic factors underlying the increased risk of diabetes and cardiovascular disease among Indian Asians.
- Indian Asians
- whole genome sequencing
- population specific variants