© Kamla-Raj 2008 Int J Hum Genet, 8(1-2): 97-118 (2008)

Genetic Imprints of Pleistocene Origin of Indian Populations:
A Comprehensive Phylogeographic Sketch of Indian Y-Chromosomes
R. Trivedi1, Sanghamitra Sahoo1, 2, Anamika Singh1, G. Hima Bindu1, Jheelam Banerjee1,
Manuj Tandon1, Sonali Gaikwad1, Revathi Rajkumar1, T. Sitalaximi1, Richa Ashma1,
G.B.N. Chainy3 and V. K. Kashyap*1,2

1. National DNA Analysis Centre, Central Forensic Science Laboratory, 30, Gorachand Road,
Kolkata 700 014, West Bengal, India
2. National Institute of Biologicals, A-32, Sector 62, Institutional Area, Noida 201307,
Uttar Pradesh, India
3. PG Department of Biotechnology, Vani Vihar, Utkal University,
Bhubaneswar 751 004, Orissa, India
KEYWORDS Population genetics, people of India, linguistic groups, migration

ABSTRACT Paleoanthropological evidence indicates that modern humans reached South Asia in one of the first
dispersals out of Africa, which were later followed by migrations from different parts of the world. The variation of
20 microsatellite and 38 binary polymorphisms on the non-recombining part of the uniparental, haploid Y-chromosome
was examined in 1434 male individuals of 87 different populations of India to investigate various hypotheses of
migration and peopling of the South Asian subcontinent. This study revealed a total of 24 paternal lineages, of which
haplogroups H, R1a1, O2a and R2 accounted for approximately 70% of the Indian Y-chromosomes. The high NRY
diversity value (0.893) and coalescence age of approx. 45-50 KYA for H and C haplogroups signified an early
settlement of the subcontinent by modern humans. Haplogroup frequency and AMOVA results provide similar
evidence in support of a common Pleistocene origin of Indian populations, with partial influence of the Indo-European
gene pool on Indian society. The differential Y-chromosome and mtDNA pattern in the two Austric speakers of
India signaled that an earlier male-mediated exodus from South East Asia largely involved the Austro-Asiatic tribes,
while the Tibeto-Burman males migrated with females through two different routes: one from Burma, which most likely
brought the Naga-Kuki-Chin language and O3e Y-chromosomes, and the other from the Himalayas, which carried the
YAP lineages into northern regions of the subcontinent. Based on the distribution of Y-chromosome haplogroups (H, C, O2a,
and R2) and deep coalescing time depths for these paternal lineages, we propose that the present-day Dravidian-
speaking populations of South India are the descendants of the earliest Pleistocene settlers, while Austro-Asiatic speakers
came from SE Asia in a later migration event.

*Corresponding Author: Dr. V K Kashyap, Director, National Institute of Biologicals, A-32, Sector 62,
Institutional Area, NOIDA 201307, India.
Telephone: +91-120-2400027, Fax: +91-120-2403014
E-mail: [email protected]

INTRODUCTION

The origins of modern humans in South Asia have been obscure. Archeological and paleoanthropological
evidence is sparse and fragmentary. Human remains dating back to the Late Pleistocene provide limited but
conclusive evidence for early human occupation in the Indian subcontinent (Deraniyagala 1992; Kennedy 2000;
James et al. 2005). A number of artefacts of Middle and Upper Paleolithic cultures in the Narmada Valley
and the remains of Acheulian culture have been found extensively throughout South Asia. Mesolithic
microliths and evidence of Neolithic settlements found in diverse parts of the subcontinent also testify to
the occupation of India by early humans (Misra 2001). Most of the prevailing genetic records further
corroborate the hypothesis that Homo sapiens colonized South Asia as part of an early southern dispersal
from Africa (Quintana-Murci et al. 1999; Cann 2001; Macaulay et al. 2005). This paper examines the current
genetic diversity of Indian Y chromosomes in order to place the genetic origin(s) and time of settlement
of the earliest human populations in India.

The present-day populations of India belong to 4635 endogamous communities (Singh 1998) and speak as many
as 350 living languages (Ethnologue), which fall under four major supra-language families, i.e.,
Indo-European, Dravidian, Sino-Tibetan and Austric. The extensive diversity among varied groups
reported with 54 classical markers showed a typically north-south geographic division of
populations and placed Indians closer to European populations than to either East Asians or Africans in
the genetic distance trees (Cavalli-Sforza et al. 1994). A number of studies based on mtDNA, Y-chromosome
and other nuclear DNA markers have invariably supported these observations. Numerous surveys of genetic
variation have generally portrayed the differences between castes and tribes, and the extent of gene
flow among ranked caste clusters. Most of the studies conclude that the maternal gene pool of Indian
populations is proto-Asian in origin with limited west-Eurasian admixture. While the Y-chromosomes of the
caste populations were found to be more similar to Europeans than Asians, with greater west-Eurasian
admixture in castes of higher rank (Bamshad et al. 2001), recent studies provide congruent evidence against
any major influx of Indo-European speakers into the Indian gene pool and have ascertained a late
Pleistocene South Asian origin for the majority of Indian populations (Sahoo et al. 2006; Sengupta
2006). These new findings are consistent with archeobotanical evidence (Fuller 2003) and linguistic data
(Renfrew 1989), which suggest a recent common root for Elamite and Dravidic languages. It is hypothesized
that the same prehistoric gene pool of southern Asian Pleistocene coastal settlers from Africa provided
inocula for both Indian castes and tribes, and that subsequent diversification of the gene pools was
probably due to the genetic imprints laid down by later migrants, such as Huns, Greeks, Kushans,
Moghuls, and others (Kivisild et al. 2003). However, much speculation remains about which of the
population groups are amongst the earliest settlers of the Indian subcontinent. While the Austro-Asiatic
tribes have been presumed to be descendants of the early modern humans based on nucleotide diversity of
the mitochondrial M haplogroup (Roychoudhary et al. 2001; Basu et al. 2003), analysis of the Indian
Y chromosomes undertaken in this study depicts a different scenario.

In this study, we have assessed a total of 1434 unrelated male individuals belonging to 87 different
Indian populations, of which 936 Y-chromosomes have been previously analysed for 38 Y-SNP markers
(Sahoo et al. 2006). An additional 216 samples are included in the present analysis, while Y-chromosomal
haplogroup data for 282 additional samples from seven other Indian populations were collated from the
literature (Kivisild et al. 2003; Cordaux et al. 2004). The present study is based on simultaneous
analysis of 38 SNP and 20 STR markers on the Y-chromosome to provide the age estimates and describe
their phylogeographic distribution. Apart from determining the antiquity of various population groups in
South Asia, we also discuss the genetic structure and peopling of the subcontinent in light of present
molecular evidence.

SUBJECTS AND METHODS

Populations Analyzed

A total of 1152 unrelated male individuals belonging to 80 different populations were analyzed in the
present study. Samples include populations from various linguistic families (Indo-European,
Austro-Asiatic, Dravidian and Tibeto-Burman) and sixteen geographical areas of India. Blood samples were
collected with informed consent using a protocol approved by the ethical committee of CFSL, Kolkata. DNA
was extracted using standard protocols (Sambrook 1989) from peripheral blood lymphocytes. Information
concerning the geographic origin and the linguistic and socio-ethnic affiliation of each population is
given in Table 1. Additional data on 282 samples from seven Indian populations (Punjab, Konkanstha
Brahmin, Koya, Yerava, Mullukunan, Kuruchian, Koraga) and from 76 populations of Western Europe (51),
Russia (281), Middle East (102), Caucasus (122), Central Asia (584), Siberia (66), North East Asia (334),
South East Asia (552), Oceania (225), Pakistan (691) and Sri Lanka (39) were collated from the
literature and included in the genetic distance analysis.

Markers Analyzed

The 38 binary polymorphisms included in the present analysis have been previously described
(Sahoo et al. 2006). Analysis of 20 Y-STRs (twelve tetranucleotide repeats, DYS19, DYS385a/b,
DYS389I/II, DYS390, DYS391, DYS393, DYS460, H4, DYS437 and DYS439; three trinucleotide repeats, DYS392,
DYS388 and DYS426; two dinucleotide repeats, YCAIIa/b; two pentanucleotide repeat loci, DYS438 and
DYS447; and a hexanucleotide repeat, DYS448) was carried out in the same DNA samples by using an
in-house standardized protocol with primers described elsewhere (Butler et al. 2002). Y-STR haplotypes were
constructed in a sequential order of loci, keeping an ascending numerical order for the minimal
haplotype to facilitate Y-chromosome comparisons with other world populations.

Statistical Analyses

Several population genetic parameters, including mean haplogroup and haplotype diversity and their
standard errors, mean number of pairwise differences (MPD), pairwise FST values for haplogroups and
associated p values, were calculated using the ARLEQUIN ver. 2.0 software package (Schneider et al. 2000).
To test for differences in proportions, the χ² test for significance was employed. Apportionment of
genetic differences among various socio-ethnic, geographic and linguistic groups at different levels of
hierarchical subdivision (between individuals within populations, between populations within groups and
between groups of populations) was calculated using analysis of molecular variance (AMOVA) (Excoffier et
al. 1992). To examine the factor(s) responsible for genetic differentiation of Y-chromosomes, AMOVA was
done both on binary markers and on Y-STRs within the lineages. Significance levels of the genetic
variance components as well as ΦST values were estimated by using 10000 iterations.

The median-joining network algorithm for haplogroup-associated haplotypes (Bandelt et al. 1999; Forster
et al. 2000) was run using the software NETWORK version 4.1.0.8 (Life Sciences and Engineering
Technology Solutions Web site), with the epsilon value set to zero. For network calculation, seven Y-STR
loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392 and DYS393) were used, where the weight of each
locus was assigned according to its estimated variance: Y-STR loci with the highest variance were given
the lowest weights. To estimate the time to the most recent common ancestor (TMRCA), we calculated the
ages from STR variation within the corresponding haplogroup observed in the Indian populations using the
average square difference (ASD) method. We used the same seven Y-STRs as those used in the network
analysis, a generation time of 25 years and a mutation rate of 6.9 × 10⁻⁴ as described by
Zhivotovsky et al. (2004).

A neighbor-joining tree based on FST values of 87 Indian populations was used to illustrate the genetic
affinity between the studied groups using MEGA 3.0. The genetic relationship between populations of India
and other parts of the world was estimated based on pairwise genetic distances (FST values) calculated
from haplogroup frequencies. Multidimensional scaling (MDS) analysis of pairwise FST values was performed
using XL STAT pro 7.5 to decipher the genetic affinities of populations. The Indian populations were
pooled into their regional boundaries for comparison of genetic similarity with world populations, to
obtain a better resolution in the MDS plot.

RESULTS

A total of 1152 individuals belonging to 80 extant human populations from 16 geographical regions of
India were analysed with 38 Y-SNPs and 20 Y-STR markers to evaluate the possibility that Austro-Asiatic
speakers are the earliest settlers of the Indian subcontinent. Twenty-four different paternal lineages
were observed, of which haplogroups H-M69, R1a1-M17, R2-M124 and O2a-M95 together account for 69% of
the paternal diversity in South Asia. Another 20.9% of the genetic variation in Indian males is
described by haplogroups L-M11, J2-M172, O3e-M134, K2-M70, F-M89 and C-RPS4Y711, while the presence of
other haplogroups, R1b3-M269 and G-M201, could be attributed to recent admixture with Europeans.

Haplogroups and Extent of Y-Chromosome Diversity in Indians

The Y-SNPs used in this study were based on previous reports of polymorphisms in Eurasian and Oceanic
populations. Overall haplogroup diversity among Indians was relatively high when compared to European or
East Asian populations. Indian populations depicted diversity values from 0.133 to a high of 0.914, with
Austro-Asiatic and Tibeto-Burman tribes generally showing reduced diversity (Table 1). Twenty-five
Dravidian populations showed a higher mean haplogroup diversity (0.723 ± 0.083) compared to Indo-European
speakers (0.684 ± 0.079), represented by thirty endogamous groups. South Indian groups (Andhra Brahmins,
Kallar, Raju, Chenchu and Lambadi) displayed high lineage diversity values, while populations of North
India typically demonstrated lower mean haplogroup diversity
Table 1: Description of the Indian populations included in this study
Population Code State Region Language Social Status Hierarchy Haplogroup Diversity
1 ANDHRABRAHMIN ANB ANDHRA PRADESH South Dravidian CASTE Upper 0.8538 ± 0.0504
2 CHENCHU CHU ANDHRA PRADESH South Dravidian TRIBE 0.8474 ± 0.0437
3 KAMMA CHAUDHARY KMC ANDHRA PRADESH South Dravidian CASTE Lower 0.6433 ± 0.1078
4 KAPPU NAIDU KPN ANDHRA PRADESH South Dravidian CASTE Lower 0.5737 ± 0.1213
5 KOMATI KOM ANDHRA PRADESH South Dravidian CASTE Lower 0.5000 ± 0.1222
6 LAMBADI LMD ANDHRA PRADESH South Indo-European TRIBE 0.8789 ± 0.0432
7 NAIKPOD GOND NPG ANDHRA PRADESH South Dravidian TRIBE 0.7421 ± 0.0584
8 RAJU RAJ ANDHRA PRADESH South Dravidian CASTE Middle 0.8596 ± 0.0393
9 REDDY RDY ANDHRA PRADESH South Dravidian CASTE Middle 0.8022 ± 0.0687
10 YERUKULA YER ANDHRA PRADESH South Dravidian TRIBE 0.6316 ± 0.0875
11 ADI PASI ADI ARUNACHAL PRADESH North-East Tibeto-Burman TRIBE 0.3556 ± 0.1591
12 BIHARBRAHMIN BBH BIHAR East Indo-European CASTE Upper 0.4510 ± 0.1174
13 BHUMIHAR BHU BIHAR East Indo-European CASTE Middle 0.6368 ± 0.1151
14 RAJPUT BRJ BIHAR East Indo-European CASTE Upper 0.4394 ± 0.1581
15 KAYASTHA BKY BIHAR East Indo-European CASTE Middle 0.7363 ± 0.0748
16 YADAV YAV BIHAR East Indo-European CASTE Lower 0.6786 ± 0.1220
17 KURMI KUI BIHAR East Indo-European CASTE Lower 0.8590 ± 0.0633
18 BANIYA BAN BIHAR East Indo-European CASTE Lower 0.7273 ± 0.0679
19 GUJ PATEL PAT GUJRAT West Indo-European CASTE Middle 0.7778 ± 0.1100
20 HP RAJPUT HRJ HP North Indo-European CASTE Upper 0.9143 ± 0.0425
21 OROAN ORO JHARKHAND East Dravidian TRIBE 0.4091 ± 0.1333
22 HO HO JHARKHAND East Austro-Asiatic TRIBE 0.0000 ± 0.0000
23 BHUMIJ BHJ JHARKHAND East Austro-Asiatic TRIBE 0.4571 ± 0.1406
24 KHARIA KRA JHARKHAND East Austro-Asiatic TRIBE 0.5333 ± 0.1801
25 MUNDA MUN JHARKHAND East Austro-Asiatic TRIBE 0.1429 ± 0.1188
26 BIRHOR BIR JHARKHAND East Austro-Asiatic TRIBE 0.1333 ± 0.1123
27 SANTHAL SAN JHARKHAND East Austro-Asiatic TRIBE 0.0000 ± 0.0000
28 IYENGAR IYN KARNATAKA South Dravidian CASTE Upper 0.7485 ± 0.0610
29 LINGAYAT LYN KARNATAKA South Dravidian CASTE Upper 0.8333 ± 0.0691
30 GOWDA GOW KARNATAKA South Dravidian CASTE Lower 0.9524 ± 0.0955
31 BHOVI BHV KARNATAKA South Dravidian CASTE Lower 0.7524 ± 0.0918
32 CHRISTIAN CHR KARNATAKA South Dravidian CASTE Lower 0.8333 ± 0.0597
33 MUSLIM MUS KARNATAKA South Dravidian CASTE Lower 0.8333 ± 0.2224
34 KURUVA KUR KARNATAKA South Dravidian TRIBE 0.7436 ± 0.0909
35 DESASTH BRAHMIN DSB MAHARASTRA West Indo-European CASTE Upper 0.8421 ± 0.0657
36 CHITPAVAN BRAHMIN CHB MAHARASTRA West Indo-European CASTE Upper 0.9048 ± 0.0405
37 MARATHA MAT MAHARASTRA West Indo-European CASTE Middle 0.8000 ± 0.0681
38 DHANGAR DGR MAHARASTRA West Indo-European CASTE Lower 0.8167 ± 0.0571
39 PAWARA PWR MAHARASTRA West Indo-European TRIBE 0.7417 ± 0.1053
40 KATKARI KTK MAHARASTRA West Indo-European TRIBE 0.8246 ± 0.0648
41 MADIA GOND MGD MAHARASTRA West Dravidian TRIBE 0.6264 ± 0.1098
42 MAHADEO KOLI MKL MAHARASTRA West Indo-European TRIBE 0.7636 ± 0.0833
43 MARA MRA MIZORAM North-East Tibeto-Burman TRIBE 0.5333 ± 0.0515
44 HMAR HMR MIZORAM North-East Tibeto-Burman TRIBE 0.2789 ± 0.1235
45 LAI LAI MIZORAM North-East Tibeto-Burman TRIBE 0.5455 ± 0.0615
46 LUSEI LUS MIZORAM North-East Tibeto-Burman TRIBE 0.4094 ± 0.1002
47 KUKI KUK MIZORAM North-East Tibeto-Burman TRIBE 0.6818 ± 0.0910
48 MANIPURI MUSLIM MMS MIZORAM North-East Tibeto-Burman CASTE Lower 0.8333 ± 0.0980
49 ORIYABRAHMIN OBH ORISSA East Indo-European CASTE Upper 0.8043 ± 0.0697
50 KARAN KRN ORISSA East Indo-European CASTE Middle 0.6471 ± 0.0953
51 KHANDAYAT KDY ORISSA East Indo-European CASTE Middle 0.7564 ± 0.0974
52 GOPE GPE ORISSA East Indo-European CASTE Lower 0.8333 ± 0.0720
53 PAROJA PRJ ORISSA East Dravidian TRIBE 0.8667 ± 0.0483
54 JUANG JUN ORISSA East Austro-Asiatic TRIBE 0.0000 ± 0.0000
55 SAORA SAR ORISSA East Austro-Asiatic TRIBE 0.6784 ± 0.0884
56 NEPALI NEP SIKKIM North-East Tibeto-Burman/Indo-European CASTE Upper 0.9048 ± 0.1033
57 BHUTIA BHT SIKKIM North-East Tibeto-Burman TRIBE 0.5000 ± 0.2652
58 CHAKKLIAR CHK TAMIL NADU South Dravidian CASTE Lower 0.7912 ± 0.0673
59 KALLAR KAL TAMIL NADU South Dravidian CASTE Middle 0.8788 ± 0.0751
60 VANNIYAR VAN TAMIL NADU South Dravidian CASTE Middle 0.8333 ± 0.0597
61 PALLAR PAL TAMIL NADU South Dravidian CASTE Lower 0.7000 ± 0.0896
62 GOUNDER GOU TAMIL NADU South Dravidian CASTE Upper 0.7124 ± 0.0650
63 IRULAR IRU TAMIL NADU South Dravidian TRIBE 0.7576 ± 0.1221
64 KANYAKUBJ BRAHMIN KKB UTTAR PRADESH North Indo-European CASTE Upper 0.6909 ± 0.1276
65 UP JAT UPJ UTTAR PRADESH North Indo-European CASTE Upper 0.0000 ± 0.0000
66 UP THAKUR UPT UTTAR PRADESH North Indo-European CASTE Upper 0.5357 ± 0.1232
67 KHATRI KHT UTTAR PRADESH North Indo-European CASTE Middle 0.2857 ± 0.1964
68 BHOKSHA BKS UTTAR PRADESH North Tibeto-Burman/Indo-European TRIBE 0.6222 ± 0.1383
69 UP KURMI UPK UTTAR PRADESH North Indo-European CASTE Lower 0.0000 ± 0.0000
70 THARU THR UTTAR PRADESH North Tibeto-Burman/Indo-European TRIBE 0.8000 ± 0.1721
71 JAUNSARI JUS UTTAR PRADESH North Tibeto-Burman/Indo-European TRIBE 0.7333 ± 0.1552
72 MAHISHIYA MSY WEST BENGAL East Indo-European CASTE Middle 0.8684 ± 0.0489
73 NAMASUDRA NMS WEST BENGAL East Indo-European CASTE Lower 0.9000 ± 0.0355
74 BAURI BAU WEST BENGAL East Indo-European CASTE Lower 0.6526 ± 0.0648
75 MAHELI MHL WEST BENGAL East Austro-Asiatic TRIBE 0.8211 ± 0.0586
76 KARMALI KRM WEST BENGAL East Austro-Asiatic TRIBE 0.2924 ± 0.1274
77 KORA KOR WEST BENGAL East Indo-European TRIBE 0.7579 ± 0.0495
78 LODHA LOD WEST BENGAL East Austro-Asiatic TRIBE 0.8421 ± 0.0595
79 EZHAVA HINDU EZH KERALA South Dravidian CASTE Lower 0.8056 ± 0.0889
80 NAIR NAR KERALA South Dravidian CASTE Upper 0.0000 ± 0.0000

Table 2a: Comprehensive haplogroup frequency data among linguistic, geographic and social categories of India
Sample Size C D F* G H* H1 H2 J2* K* K2 L L1 M N O*
TOTAL INDIA 1152 0.014 0.004 0.030 0.001 0.069 0.159 0.002 0.051 0.038 0.031 0.045 0.010 0.000 0.000 0.003
Language
INDO-EUROPEAN 518 0.012 0.004 0.027 0.002 0.079 0.183 0.002 0.058 0.029 0.033 0.035 0.002 0.000 0.000 0.000
DRAVIDIAN 393 0.020 0.000 0.048 0.000 0.089 0.209 0.003 0.056 0.041 0.043 0.084 0.028 0.000 0.000 0.000
AUSTRO-ASIATIC 140 0.014 0.000 0.014 0.000 0.021 0.043 0.000 0.050 0.043 0.014 0.007 0.000 0.000 0.000 0.007
TIBETO-BURMAN 101 0.000 0.030 0.000 0.000 0.010 0.000 0.000 0.000 0.069 0.000 0.000 0.000 0.000 0.000 0.020
Geography
NORTH 180 0.000 0.006 0.011 0.006 0.106 0.139 0.000 0.078 0.000 0.000 0.017 0.000 0.000 0.000 0.000
WEST 135 0.037 0.000 0.007 0.000 0.081 0.356 0.007 0.081 0.000 0.000 0.096 0.000 0.000 0.000 0.000
EAST 357 0.011 0.000 0.048 0.000 0.056 0.106 0.000 0.036 0.053 0.048 0.020 0.003 0.000 0.000 0.003
NORTH-EAST 108 0.000 0.037 0.000 0.000 0.009 0.000 0.000 0.000 0.083 0.000 0.000 0.000 0.000 0.000 0.019
SOUTH 372 0.019 0.000 0.040 0.000 0.078 0.194 0.003 0.056 0.043 0.051 0.078 0.030 0.000 0.000 0.000
Social Hierarchy
UPPER CASTE 211 0.009 0.005 0.019 0.000 0.043 0.185 0.005 0.100 0.024 0.000 0.095 0.019 0.000 0.000 0.000
MIDDLE CASTE 175 0.006 0.000 0.051 0.000 0.040 0.171 0.000 0.097 0.040 0.017 0.034 0.023 0.000 0.000 0.000
LOWER CASTE 261 0.008 0.000 0.046 0.000 0.107 0.169 0.000 0.031 0.050 0.046 0.054 0.000 0.000 0.000 0.000
TRIBES 505 0.022 0.008 0.020 0.002 0.071 0.139 0.002 0.026 0.038 0.042 0.024 0.008 0.000 0.000 0.006
Sample Size O2a O2a1 O3 O3e P* R* R1 R1a R1a1 R1b3 R2
TOTAL INDIA 1152 0.149 0.001 0.001 0.026 0.027 0.010 0.011 0.002 0.175 0.005 0.135
Language
INDO-EUROPEAN 518 0.010 0.000 0.002 0.006 0.039 0.021 0.008 0.004 0.297 0.012 0.137
DRAVIDIAN 393 0.023 0.000 0.000 0.000 0.008 0.000 0.023 0.000 0.117 0.000 0.209
AUSTRO-ASIATIC 140 0.729 0.000 0.000 0.000 0.036 0.000 0.000 0.000 0.007 0.000 0.014
TIBETO-BURMAN 101 0.554 0.010 0.000 0.267 0.030 0.000 0.000 0.000 0.010 0.000 0.000
Geography
NORTH 180 0.000 0.000 0.006 0.017 0.000 0.011 0.000 0.006 0.483 0.006 0.111
WEST 135 0.000 0.000 0.000 0.000 0.030 0.022 0.000 0.000 0.193 0.000 0.089
EAST 357 0.325 0.000 0.000 0.000 0.045 0.014 0.006 0.003 0.104 0.000 0.120
NORTH-EAST 108 0.519 0.009 0.000 0.250 0.046 0.000 0.009 0.000 0.019 0.000 0.000
SOUTH 372 0.000 0.000 0.000 0.000 0.016 0.003 0.027 0.000 0.134 0.013 0.215
Social Hierarchy
UPPER CASTE 211 0.000 0.000 0.000 0.000 0.019 0.014 0.005 0.005 0.360 0.005 0.090
MIDDLE CASTE 175 0.000 0.000 0.000 0.000 0.029 0.011 0.029 0.000 0.263 0.000 0.189
LOWER CASTE 261 0.004 0.000 0.000 0.004 0.023 0.004 0.023 0.000 0.157 0.000 0.276
TRIBES 505 0.339 0.002 0.002 0.057 0.032 0.010 0.002 0.002 0.077 0.010 0.061
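
As a quick cross-check of how the headline shares relate to Table 2a, the following sketch sums the relevant frequencies from the TOTAL INDIA row. Treating haplogroup H as the sum of the H*, H1 and H2 columns is an assumption made here for illustration, and the variable names are hypothetical.

```python
# Frequencies copied from the TOTAL INDIA row of Table 2a.
total_india = {
    "H*": 0.069, "H1": 0.159, "H2": 0.002,
    "R1a1": 0.175, "O2a": 0.149, "R2": 0.135,
}

# Assumption: haplogroup H is taken as the sum of its listed sub-clades.
h_total = sum(total_india[k] for k in ("H*", "H1", "H2"))

# Combined share of the four most frequent lineages.
major_share = h_total + sum(total_india[k] for k in ("R1a1", "O2a", "R2"))

print(f"H total: {h_total:.3f}")
print(f"H + R1a1 + O2a + R2: {major_share:.3f}")
```

Under that assumption the totals come to about 0.230 for H and 0.689 for the four lineages combined, in line with the 23% and 69% quoted in the text.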
Table 2b: Comparative haplogroup frequency data in socio-ethnic groups of India
Tribe Dravidian Caste Indo-European Caste
AA DR TB IE Upper Middle Lower Upper Middle Lower Caste_DR Caste _IE
No. of populations 11 8 7 8 5 4 10 9 8 8 19 25
Sample size. 179 126 92 108 72 58 137 132 117 115
Total (in %)
C 1.1 5.6 1.9 0.7 1.5 0.9 0.9 0.4 1.1
D 3.3 0.9
F* 2.2 3.2 1.9 2.8 10.3 5.1 1.5 2.6 4.3 5.6 2.7
G 0.9
H* 2.8 12.7 13.9 5.6 5.2 8.8 3.8 3.4 13.0 7.1 6.6
H1 7.3 22.2 26.9 31.9 10.3 18.2 12.1 20.5 16.5 20.2 16.2
H2 0.9 1.4 0.4
J2* 3.9 3.2 1.9 8.3 15.5 2.2 11.4 6.8 4.3 6.7 7.7
K* 3.4 7.1 4.3 4.2 1.7 2.2 5.1 6.1 2.6 3.6
K2 2.2 11.1 2.8 5.2 10.4 1.1 3.3
L 0.6 4.8 4.6 15.3 5.2 9.5 6.8 2.6 0.9 10.1 3.6
L1 3.2 4.2 6.9 0.8 2.6 0.3
O* 0.6 2.2
O2a 57.0 7.1 59.8 4.6
O2a1 1.1
O3 0.9
O3e 28.3 2.8
P* 5.6 0.8 4.6 1.7 0.7 1.5 3.4 1.7 0.7 2.2
R* 1.7 1.9 2.3 1.7 0.9 1.6
R1 0.9 5.2 4.4 1.7 3.4 0.5
R1a 0.6 0.8 0.3
R1a1 0.6 11.9 1.1 20.4 15.3 10.3 10.2 48.5 34.2 23.5 11.6 36.0
R1b3 4.6 0.8 0.3
R2 10.6 7.1 2.8 11.1 22.4 38.0 8.3 17.1 17.4 27.3 14.0

Table 3: Y-Chromosome microsatellite diversity within the twelve major lineages found in India
Lineage  No. of males  No. of haplotypes  Lineage diversity  MPD           Av. gene diversity  Haplotype sharing (within)  Haplotype sharing (between)
P        29            27                 0.995 ± 0.010      14.19 ± 6.55  0.709 ± 0.364       2                           None
F        24            24                 1.000 ± 0.012      13.91 ± 6.47  0.695 ± 0.360       None                        None
H        221           206                0.999 ± 0.000      12.66 ± 5.73  0.633 ± 0.317       10                          2
O3       26            24                 0.993 ± 0.012      7.37 ± 3.56   0.368 ± 0.198       2                           None
O2a      162           144                0.997 ± 0.001      11.55 ± 5.26  0.577 ± 0.291       10                          None
L        55            53                 0.998 ± 0.003      13.13 ± 6.00  0.691 ± 0.350       2                           None
K        36            36                 1.000 ± 0.006      12.93 ± 5.96  0.718 ± 0.368       None                        None
K2       36            36                 1.000 ± 0.006      14.18 ± 6.51  0.709 ± 0.361       None                        None
R1a1     191           188                0.999 ± 0.000      12.05 ± 5.47  0.602 ± 0.302       3                           None
C        16            16                 1.000 ± 0.022      13.45 ± 6.38  0.707 ± 0.376       None                        None
J2       52            51                 0.999 ± 0.004      14.04 ± 6.40  0.702 ± 0.355       1                           None
R2       136           131                0.999 ± 0.001      13.04 ± 5.90  0.652 ± 0.327       2                           1
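
The lineage diversity values in Table 3 can be illustrated with a minimal sketch, assuming they follow the usual unbiased gene diversity estimator h = n/(n-1) × (1 - Σ p_i²); the paper states only that diversity was computed in ARLEQUIN. The function name and the sample composition below are hypothetical: the table gives 29 males and 27 haplotypes for lineage P, and the sketch assumes that two haplotypes are each carried by two males.

```python
from collections import Counter

def gene_diversity(haplotypes):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical sample shaped like lineage P in Table 3: 29 males, 27 haplotypes.
# Assumption: 25 unique haplotypes plus two haplotypes carried by two males each.
sample_p = [f"hap{i}" for i in range(25)] + ["shared_a"] * 2 + ["shared_b"] * 2
h_p = gene_diversity(sample_p)
print(round(h_p, 3))
```

With this assumed composition the estimator gives about 0.995, the value tabulated for lineage P; a lineage in which every male carries a distinct haplotype gives exactly 1.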

(0.569 ± 0.104). The distribution of the lineages and Y-STR diversity within the haplogroups are
described in detail.

Haplogroup H-M69

The largest share of males analyzed from different geographic regions of India (23%) carried the M69C
haplotype, which is additionally defined by the M52C mutation. The distribution of haplogroup H showed a
north-south gradient (24.4% to 27.4%); geographically, however, its total frequency was highest (44.4%)
in populations of western India (Table 2a). Among the 23% carrying the H lineage in their Y-chromosomes,
most were representatives of south India (27.4%), speaking Dravidian languages (30.0%). In socio-ethnic
groups, the frequency was 27.6% in lower caste groups, while in the tribal groups it accounted for 21.2%
of their paternal variation. However, the pattern of distribution did not vary statistically between the
Dravidian and Indo-European speaking tribes or caste clusters (Table 2b). The 206 distinct 20-Y-STR
haplotype profiles deciphered out of 221 individuals carried a mean pairwise difference of 12.66 (Table
3), where none of the haplotypes were shared between groups. In the median-joining network analysis with
7 Y-STRs associated with the M69C/M52C lineage branch, the majority of Y-chromosome STR haplotypes are
connected by one- or two-step mutation events (Fig. 2a). Kora, an Indo-European speaking tribal group
from eastern India, branched out of the network with more than three mutation steps. The Y-STR based
coalescence time of haplogroup H1 chromosomes was estimated to be ~43,556 years (Table 4).

Haplogroup R1a1-M17

Haplogroup R1a1-M17 characterizes 17.5% of the Indians. The Indo-European speakers demonstrated a
significantly higher proportion of this lineage as compared to populations belonging to the Dravidian
linguistic family (29.7% vs. 11.7%; χ²=7.82, p<0.05) (Table 2a). With the exception of Lodha, Nepali and
Bhutia, all other Austro-Asiatic and Tibeto-Burman speakers lack this haplogroup in their Y-chromosomes.
While the Indo-European and Dravidian caste groups depict significant variation (χ²=12.5, p<0.01), the
tribal groups are more akin. The distribution of the M17 lineage also showed a decreasing geographic
cline along the latitude; its frequency was highest (approx. 50%) among the populations of Bihar and
Uttar Pradesh, where almost 60% of the Upper and Middle caste groups harbored R1a1 Y-chromosomes. Out of
the 191 males that carried the R1a1 haplogroup, 188 unique 20-Y-STR haplotypes were observed (h=0.999).
While no haplotype was shared between populations, intra-population variation was observed within Jat,
Bhoksha and Yerukula, and the mean pairwise difference between all the Y-STRs was found to be high
(12.05) (Table 3). The median-joining network analysis, however, revealed that populations of
neighbouring areas shared a few of the haplotypes (Fig. 2b). Passarino et al. (2002) reported two
region-specific allele patterns associated with M17 among Europeans: DYS19=15 and YCAIIa,b=19,21 were
specific to the R1a1s in Western Europe, while Eastern European R1a1s typically harbored allele 16 for
DYS19 and 19,23 for YCAIIa,b. In our dataset, although alleles 15 and 16 at DYS19 were the two most
common alleles, with a significant difference in their frequency (χ²=4.66, p<0.05), they did not reveal
any specific geographical, socio-ethnic or linguistic pattern in their distribution. The TMRCA of those
individuals harboring the R1a1 Y-chromosome is estimated at around 32 KYA (Table 4).

Table 4: Y-Chromosome haplogroup variances and TMRCA estimated on seven Y-STR loci
Locus-wise variance R1a1 H C O2a R2
DYS19 0.549 0.497 1.183 0.293 0.770
DYS389I 0.793 0.760 0.783 0.383 0.649
DYS389II 0.960 0.872 1.400 0.620 1.211
DYS390 1.173 2.246 1.983 2.314 1.375
DYS391 1.081 2.533 1.762 0.791 0.966
DYS392 1.009 0.708 1.662 1.620 1.749
DYS393 0.710 0.921 0.917 0.995 1.051
Average 0.896 1.220 1.384 1.002 1.110
Age estimates in years 32,015.31 43,556.12 49,438.78 35,795.92 39,647.96
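
The age estimates in Table 4 can be retraced with a short sketch, assuming the simplified relation age = (mean locus-wise variance / mutation rate) × generation time. The function and constant names are hypothetical. Note that the tabulated ages are reproduced with a rate of 7.0 × 10⁻⁴, whereas Methods quotes 6.9 × 10⁻⁴; the value used below is therefore an assumption chosen to match the table.

```python
# Locus-wise variances copied from Table 4 (seven Y-STR loci per haplogroup).
variances = {
    "R1a1": [0.549, 0.793, 0.960, 1.173, 1.081, 1.009, 0.710],
    "H":    [0.497, 0.760, 0.872, 2.246, 2.533, 0.708, 0.921],
    "C":    [1.183, 0.783, 1.400, 1.983, 1.762, 1.662, 0.917],
    "O2a":  [0.293, 0.383, 0.620, 2.314, 0.791, 1.620, 0.995],
    "R2":   [0.770, 0.649, 1.211, 1.375, 0.966, 1.749, 1.051],
}

GENERATION_YEARS = 25
# Assumption: 7.0e-4 reproduces the tabulated ages; Methods quotes 6.9e-4.
MUTATION_RATE = 7.0e-4

def tmrca_years(locus_variances, rate=MUTATION_RATE, gen=GENERATION_YEARS):
    """Mean variance across loci, divided by the rate, scaled to years."""
    mean_var = sum(locus_variances) / len(locus_variances)
    return mean_var / rate * gen

ages = {hg: tmrca_years(v) for hg, v in variances.items()}
for hg, age in ages.items():
    print(f"{hg}: {age:,.2f} years")
```

With the 6.9 × 10⁻⁴ rate stated in Methods, each age would come out roughly 1.4% older (for example, about 32,480 years for R1a1 instead of 32,015).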
Fig. 1. Y-Chromosome haplogroups and their frequency distribution in different regional populations of India

Haplogroup O2a-M95

Haplogroup M95, which forms the major South-East Asian male lineage (Su et al. 2000; Karafet et al. 2001),
accounts for 15% of the Y-chromosome variation in India. It is, however, localized to the eastern part
of the subcontinent, restricted to the Austro-Asiatic speakers (72.9%) and Tibeto-Burman speaking
tribes (56.4%) of NE India (Table 2a). Although this haplogroup was also detected in Indo-European and
Dravidian speakers (3.3% in total), its presence in them could be sufficiently attributed to admixture
from Austro-Asiatic speaking neighbors living in close vicinity. While this lineage is completely fixed
in Juang, Ho and Santhal, the frequency in Tibeto-Burman tribes varied from 25% in Kuki to 80% in Hmar.
Surprisingly, however, none of the Himalayish branch of Tibeto-Burman speakers (Nepali, Bhutia, Tharu,
Jaunsari and Bhoksha) harbor this haplogroup (Fig. 1). Although the Y-chromosomes were rather similar
(FST=0.03, p<0.05), none of the Y-STRs were shared between groups (Table 3). A comparison of Indian M95
Y-STR haplotypes with populations of SE Asia, including Java, Borneo, Taiwan and Malay, revealed that the
Austro-Asiatic speakers of the Indian subcontinent showed closer affinity to the SE Asians than their
Tibeto-Burman speaking neighbors (FCT=0.43 vs 4.15, respectively) (data not shown). To further
investigate the relationships between O2a Y-chromosomes in the Austro-Asiatic and Tibeto-Burman
speakers, a median-joining network of 27 discrete haplotypes of 151 individuals was constructed (Fig.
2c). This network exhibited two distinct clusters of haplotypes with considerable haplotype sharing
between the two linguistic families; however, the Austro-Asiatic speakers depicted more diverse
haplotypes compared to the Tibeto-Burmans. The TMRCA of all M95T chromosomes was estimated
to be ~35,795 years (Table 4).

Haplogroup R2-M124

Our analysis revealed that haplogroup R2 characterizes 13.5% of the Indian Y-chromosomes
and its frequency among Dravidian speakers was comparable to that of haplogroup H (20.9%) and
significantly different from Indo-European and Austro-Asiatic speakers (χ²= 16.2, d=3, p<0.05).
While the distribution across various geographic regions was almost uniform, significant
differentiation was observed along the social groups (χ²= 18.7, d=3, p<0.05); a decreasing
gradient was discernible as one moved up the caste hierarchy (Table 2a). Although tribes
contributed only 7.4% of the total R2 lineage, it was proportionately distributed between the
Austro-Asiatic and Dravidian tribes (Table 2b). Extensive analysis of its distribution between
north and south Indian populations showed that while there was marginal difference among middle
and lower caste groups of north India (17.1 and 17.4%, respectively), a clear gradient was
observed among south Indians, where the frequency declined by more than one-half from lower to
upper caste groups. Analysis of 20 Y-STRs within the R2 lineage revealed that three haplotypes
were shared; one between Kamma Chaudhary and Kappu Naidu, both lower caste Dravidian speakers
from Andhra Pradesh, and two within Karmali and Pallar populations. Network analysis (Fig. 2d)
depicted that a large number of haplotypes were shared between populations of south India,
while the populations of eastern India harbored more discrete Y-STR haplotypes. The TMRCA for
M124T was estimated to be ~39,647 years (Table 4).

Haplogroup L-M11

The overall frequency of haplogroup L-M11 in the Indian populations was estimated to be
5.6%, while sporadic occurrence of this lineage has earlier been described among Indo-European
speakers of Caucasus, Middle East, Europe and a maximum of 4.3% in Central Asia (Semino et al.
2000; Wells et al. 2001). Dravidian speaking populations harbored a significantly higher
percentage of L haplogroup compared to the Indo-European speakers, 11.2 and 3.7% respectively
(χ²= 3.77, d=1, p=0.05). While frequencies were rather comparable in the lower caste groups of
north and south, middle and upper caste populations of south India demonstrated relatively
higher frequencies than northern caste groups (Table 2a). The L-network and high MPD (13.13)
revealed results in congruence with AMOVA, suggesting that no clear geographic, linguistic or
social pattern could be discerned among the Y-STR haplotypes (data not shown).

Haplogroup J2-M172

Haplogroup J2-M172 is the major lineage of Middle East/Mediterranean and its frequency
decreases into Europe. Among the studied Indian populations, M172G exhibited a total frequency
of 5.1%, where it was uniformly distributed among the three major linguistic families. Except
for the Tibeto-Burman speaking tribes of northeast India, where this lineage was totally
absent, no specific cline could be deciphered among the other seventy-three mainland
populations. In the social categories, upper and middle caste populations harbor a
significantly higher percentage of J2 lineage (~10%) as compared to ~3% in lower caste
populations and tribal groups (χ²= 7.74, d=3, p=0.05). While the distribution was proportionate
among upper caste groups of north and south India, the difference was more discrete (6.8% vs
15.5%) among middle caste groups of these regions (Table 2b). The Near Eastern populations
harboring M172 Y-chromosomes are characterized by a very high frequency of DYS388 alleles with
≥15 repeats, while more than 70% of the males examined under this study displayed alleles with
repeat motifs ≤14. Network analysis showed a large number of divergent haplotypes even with
7 Y-STRs and only two reticulations. Lodha, the only Austro-Asiatic tribe that harbored J2
lineage, also displayed very diverse haplotype profiles. The genetic structure estimated with
AMOVA showed that although the extent of Y-STR differentiation in the populations carrying this
lineage was approximately 3%, geography, language or position in the social hierarchy could not
statistically delineate the studied Indian populations. Because this marker is associated with
the spread of agriculture, we further estimated variance in the agricultural groups and observed
a marginal difference of 1% in Y-chromosomes of landowner and labourer communities harboring
the J2 lineage (data not shown).
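
The χ² comparisons reported in these sections can be illustrated with a short sketch. It assumes a Pearson χ² test on a contingency table of haplogroup carriers versus non-carriers and reads "d" as the degrees of freedom; the function names and the counts are hypothetical and are not taken from the study. The p-value routine uses the closed forms of the χ² upper tail, which are available only for 1 and 3 degrees of freedom.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_p_value(stat, df):
    """Upper-tail p-value; this sketch handles only df = 1 and df = 3."""
    half_root = math.sqrt(stat / 2.0)
    if df == 1:
        return math.erfc(half_root)
    if df == 3:
        tail = math.sqrt(2.0 * stat / math.pi) * math.exp(-stat / 2.0)
        return math.erfc(half_root) + tail
    raise ValueError("only df = 1 and df = 3 are handled in this sketch")


# Hypothetical counts (carriers, non-carriers) in two groups; NOT study data.
hypothetical_table = [[10, 20], [20, 10]]
stat = chi_square_statistic(hypothetical_table)
print(round(stat, 3), round(chi_square_p_value(stat, df=1), 4))

# Statistics reported in the text, converted to p-values the same way.
print(round(chi_square_p_value(3.77, 1), 3))   # haplogroup L, d=1
print(round(chi_square_p_value(7.74, 3), 3))   # haplogroup J2, d=3
```

Both reported statistics fall just below the conventional 5% critical values (3.84 for d=1 and 7.81 for d=3), which is consistent with their being quoted as p=0.05 rather than p<0.05.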
Haplogroup O3-M122 and O3e-M134

The M122 haplogroup and its sub-lineage M134 are among the predominant and widespread
lineages of East Asia (Su et al. 2000; Karafet et al. 2001; Shi et al. 2005). In the Indian
males, it was detectable at frequencies less than 3% and was largely restricted to the
Tibeto-Burman speakers of North-East. It was sporadically present among tribal groups of north
India, particularly Tharu (Fig. 1), probably due to recent admixture with neighboring
Tibeto-Burman speakers of Nepal and China. A clear delineation along the language family was
observed in its distribution; it was completely absent among the Tani speakers (Adi Pasi),
while the Naga-Kuki-Chin branch of Tibeto-Burman speakers contributed the entire 26.7% of O3
lineage. The mean pairwise difference between Y-STRs and the lineage diversity was low at 7.37
and 0.993, respectively, compared to its sister clade O2a.

Haplogroup K2-M70

Haplogroup K2 occurs on an M9G background and is reported to occur in populations of Near
East and Europe (Underhill et al. 2000). In our study, it was found only in the eastern and
southern regions of the country, adding to an overall frequency of 3.1%. Although it was present
in the three major linguistic families, the statistical difference in its distribution was
insignificant. However, its distribution depicted an inverse relation as one moved up the social
ladder, with the upper caste populations completely lacking the M70 lineage in their
Y-chromosomes (Table 2a). This lineage was predominant amongst the lower caste groups of east
(Bauri) and tribal groups of south, particularly Yerukula, contributing approx. 60% of the total
K2 chromosomes. Within the lineage, 36 distinct Y-STR haplotypes, with a very high mean pairwise
difference (14.18) between them, depicted the absence of population structure due to language,
geography or ethnicity.

Haplogroup C-M130 (RPS4Y711)

The RPS4Y711T forms the second major cluster in Asia and Australo-Melanesia and has
reportedly spread into North America (Wells et al. 2001; Underhill et al. 2000; Karafet et al.
1999; Underhill et al. 2001). In the present study, this lineage was found spread all along the
coastal belt in populations of Maharashtra, Tamil Nadu, Andhra Pradesh, Orissa and West Bengal,
at an average frequency of 1.4%, and noticeably absent in the populations of North and
North-East. Although this lineage was present in high frequency in tribes compared to caste
groups, its distribution in them was not statistically significant (χ²= 2.25, d=1, p>0.05). We
observed 16 discrete Y-STR haplotypes, with mean pairwise difference between haplotypes of 13.45
(Table 3). Although the total number of individuals carrying RPS4Y711T was too small (n=16) to
make accurate evolutionary inferences about its origin within South Asia, the TMRCA of Indian
RPS4Y711T individuals was estimated to be ~49,438 years (Table 4).

Other Haplogroups Observed in Indians

Haplogroup F, which is the major and most paraphyletic subcluster of M168 lineages, was
ubiquitous in its distribution along geographic, linguistic and socio-ethnic boundaries of India
and was observed in approximately 3% of the studied males (Table 2a). Haplogroup D, a
monophyletic branch of M168 lineage, defined by an Alu insertion and M174C mutation, on the
other hand, was restricted to Bhutia and Tharu tribal groups. Its presence among them is most
likely due to gene flow from Tibet, where this haplogroup has earlier been reported. Major
haplogroups K*, P*, R*, R1, R1a contribute approximately 2-3% of the total Indian
Y-chromosomes and there was no difference in their distribution pattern among castes or tribes
or among different geographic regions. Although we detected a few European-specific haplogroups
G and R1b3 in Indians, none of our studied samples showed the presence of haplogroup K3-M147,
N-M231 or I-M170, which are the other highly predominant haplogroups of Europe.

Genetic structure of the Indian populations

Analysis of molecular variance revealed that the extent of genetic differentiation was high
among Indians; percent variation among different groups added up to 27.11%, suggesting that the
gene pool of Indian males was highly structured. To identify factor(s) responsible for this
compartmentalization of Y-chromosomes, population

Fig. 2a. Median-Joining network of H haplogroup individuals, based on seven Y-STR haplotypes.
Circles represent haplotypes and have an area proportional to frequency. Colour represents the four
geographic regions of India (Red: South; Blue: North; Green: West; Yellow: East)

Fig. 2b. Median-Joining network of R1a1 haplogroup individuals, based on seven Y-STR haplotypes.
Circles represent haplotypes and have an area proportional to frequency. Colour represents the four
geographic regions of India (Red: South; Blue: North; Green: West; Yellow: East)

Fig. 2c. Median-Joining network of O2a haplogroup individuals, based on seven Y-STR haplotypes.
Circles represent haplotypes and have an area proportional to frequency. Colour represents the Austro-
Asiatic (Black) and Tibeto-Burman (White) linguistic families of India

Fig. 2d. Median-Joining network of R2 haplogroup individuals, based on seven Y-STR haplotypes. Circles
represent haplotypes and have an area proportional to frequency. Colour represents the four geographic
regions of India (Red: South; Blue: North; Green: West; Yellow: East)

genetic structure was analyzed on haplogroup frequency data in a hierarchical mode: within
populations, among populations and among groups of populations, pooled according to geography,
linguistic family and their socio-ethnic position (Table 5). The amount of genetic variation
among five major geographical regions was less than the percent due to variation among
populations within regions (ΦCT= 0.096 and ΦSC= 0.211, respectively). Further analysis revealed
that almost 14.57% of among group variation was due to regional boundaries which also defined
their linguistic affinities (language sub-families). The increase in "among group variation" and
"among populations within groups" was non-significant when only four major linguistic families
were used as a criterion for grouping Indian populations. High FCT and FSC values suggest that
although significant structuring occurs within the populations of India, they could not be
partitioned either geographically or linguistically. Apportionment of populations into two broad
socio-ethnic groups, castes and tribes, depicted only 8% difference between them, which further
decreased to 6.63% when the caste populations were further resolved into upper, middle and lower
groups. Among themselves, the caste populations were not very different, harboring only 1.6%
variation among them.

Genetic Relationships among the Indians and with World Populations

To investigate the extent of genetic relationships among Indian populations, pairwise FST
distances were estimated on the Y-haplogroup frequencies. On the whole, populations clustered
according to their Y-chromosome lineages. Two distinct clusters of Indo-European and Dravidian
speakers were discernible in the NJ tree on 87 Indian populations, where except for a few
deviations most of the populations clustered within their linguistic family. Austro-Asiatic and
Tibeto-Burman speakers harboring O2a Y-chromosome lineage described a separate cluster, while
Tharu grouped with Mara, Lai and Kuki carrying O3e lineage, and formed a branch distant from
other Tibeto-Burman tribes (Fig. 3). MDS plot also substantiated the genetic proximity of
Austro-Asiatic and Tibeto-Burman speakers to the populations of South East Asia. Most of the
other Indian populations were closer to the Indo-European speakers of Central Asia and Eastern
Europe (Russia and Siberia) but distant from populations of Western Europe, while populations
of Middle East and Caucasus region formed a separate cluster in the MDS plot. Populations of
Uttar Pradesh, Bihar and Punjab were moderately distant from other Indo-European speakers, while
those of Pakistan remained between Indians, Central Asia and Russia (Fig. 4).

DISCUSSION

This comprehensive study of Y-chromosome diversity within India aims to identify
evolutionary events (founder effects, gene flow and

Table 5: Genetic Differentiation in Indians at different levels of hierarchy based on Y-SNP Data

                            No. of    Within Population    Among Population     Among Groups
                            Groups                         Within Groups
Grouping                              %        FST*        %        FSC*        %        FCT*
Total                                 72.89    0.271       27.11
Geography                   5         71.22    0.287       19.09    0.211       9.69     0.096
Regional                    14        71.95    0.28        13.48    0.157       14.57    0.145
Language                    4         69.16    0.308       15.53    0.183       15.31    0.153
Language (a)                4         69.88    0.301       17.17    0.197       12.96    0.129
Social: CS vs TR$           2         69.79    0.302       21.73    0.237       8.48     0.084
Social (b)                  4         71.49    0.285       21.89    0.234       6.63     0.066
Social (c)                  4         82.29    0.177       16.21    0.164       1.51     0.015
Castes: UP vs MD vs LW#     3         83.73    0.162       14.65    0.148       1.61     0.016

*All values are statistically significant at p<0.05
$ CS: Castes; TR: Tribes
# UP: Upper castes; MD: Middle castes; LW: Lower castes
a: Includes Karmali and Maheli under Austro-Asiatic; Tharu under Tibeto-Burman language family
b: Includes Upper, Middle, Lower Castes and Tribes
c: Includes Upper, Middle, Lower Castes and excludes Austro-Asiatic and Tibeto-Burman Tribes
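
The percentages and fixation indices in Table 5 are linked by the standard AMOVA relations (Excoffier et al. 1992). A minimal sketch follows, assuming the three percentages of a row are the variance components expressed as shares of the total; the function name is illustrative and not part of any software used in the study.

```python
def fixation_indices(within_pct, among_pops_within_groups_pct, among_groups_pct):
    """Convert AMOVA percentages of variation into (FST, FSC, FCT)."""
    total = within_pct + among_pops_within_groups_pct + among_groups_pct
    # FCT: share of the total variance that lies among groups.
    f_ct = among_groups_pct / total
    # FSC: among-population variance relative to the variance inside groups.
    f_sc = among_pops_within_groups_pct / (among_pops_within_groups_pct + within_pct)
    # FST: everything that is not within-population variance.
    f_st = (among_groups_pct + among_pops_within_groups_pct) / total
    return f_st, f_sc, f_ct


# Percentages from the "Geography" row of Table 5.
f_st, f_sc, f_ct = fixation_indices(71.22, 19.09, 9.69)
print(f"FST={f_st:.3f} FSC={f_sc:.3f} FCT={f_ct:.3f}")

# The three indices satisfy (1 - FST) = (1 - FSC) * (1 - FCT).
print(abs((1 - f_st) - (1 - f_sc) * (1 - f_ct)) < 1e-12)
```

Applied to the Geography row, the computed indices agree with the tabulated values (0.287, 0.211 and 0.096) to within about 0.001, the difference being attributable to rounding of the published percentages.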

Fig. 3. Genetic relationship among populations of India based on FST distances
estimated on Y-Haplogroup frequencies (cluster labels in the tree: Indo-Europeans;
Tibeto-Burmans; Dravidians; Austro-Asiatics/Tibeto-Burmans)
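
A pairwise FST distance of the kind underlying Fig. 3 can be sketched from haplogroup frequencies alone. The sketch below assumes a simple Nei-type estimator with equal population weights, FST = (HT - HS) / HT, where HS is the mean within-population gene diversity and HT the diversity of the pooled frequencies. This is a simplification for illustration, not necessarily the estimator used in the study, and the frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Gene diversity 1 - sum(p_i^2) for a dict of haplogroup frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def pairwise_fst(freqs_a, freqs_b):
    """Nei-type FST between two populations, weighting both equally."""
    haplogroups = set(freqs_a) | set(freqs_b)
    pooled = {
        h: (freqs_a.get(h, 0.0) + freqs_b.get(h, 0.0)) / 2.0 for h in haplogroups
    }
    h_total = gene_diversity(pooled)
    h_within = (gene_diversity(freqs_a) + gene_diversity(freqs_b)) / 2.0
    if h_total == 0.0:
        # Both populations are fixed for the same haplogroup.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical haplogroup frequencies; NOT values from Table 2.
population_x = {"H": 0.5, "R2": 0.5}
population_y = {"O2a": 1.0}

print(round(pairwise_fst(population_x, population_y), 3))
print(pairwise_fst(population_x, population_x))
```

Under this estimator, two populations with identical frequencies are at distance 0, and two populations fixed for different haplogroups are at distance 1, which is why tribes fixed for O2a separate sharply from populations carrying a mixture of lineages.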

Fig. 4. Genetic relationship between populations of India and world estimated from
Y-Chromosome haplogroup frequencies represented in MDS plot

genetic drift) and factors (geographical, linguistic and cultural barriers) that might have
produced a high degree (27%) of genetic differentiation among the Indian patrilines. Here we
also evaluate some of the suggested theories of occupation of the Indian subcontinent by modern
humans and population histories, in the light of current molecular genetic evidence.

Phylogeography of Indian Y-Chromosomes

India is a relict area, which is likely to have served as an incubator during the early
dispersal of modern humans out of East Africa (Quintana-Murci et al. 1999; Cann 2001) and a
treasure-house of ancient population genetic signatures in its gene pool. This is reflected in
the 24 different haplogroups which were observed in the present Y-chromosome analysis of 1434
Indian males. Overall haplogroup diversity among Indian populations was relatively high (0.893)
in contrast to other European or East Asian populations, but was closer to that of Central Asia.
This pattern of high NRY diversity (Y-SNP and Y-STR) indicates an early settlement of the Indian
subcontinent by anatomically modern humans. Four haplogroups (H=23%, R1a1=17.5%, O2a=15% and
R2=13.5%) form the major paternal lineages of Indians and together account for ~70% of their
Y-chromosomes. Being largely restricted to the Indian subcontinent, haplogroup H is assumed to
be associated with the eastward expansion of M89 Y-chromosomes from the Levantine corridor,
which also carried the two late Pleistocene mt DNA haplogroups, U2 and U7, into India. Although
the M69 Y-chromosomes are particularly predominant among the Dravidian speakers of south India,
their fairly uniform distribution across different regions and socio-ethnic groups of India
suggests deep time depth for these lineage clusters.

Based on the predominance of M17 lineage among diverse linguistic families (Indo-European,
Altaic, Uralic and Caucasian) and geographic regions (Central Asia, Europe, Caucasus, Middle
East) (Wells et al. 2001; Underhill et al. 2000; Karafet et al. 1999; Rosser et al. 2000; Nebel
et al. 2000), it has been associated with the Kurgan culture, domestication of horses and spread
of Indo-European languages, all of which supposedly
originated in southern Russia/Ukraine and subsequently extended to Europe and Central Asia
around 3000 B.C. (Wells et al. 2001). Its presence in India has been linked to the "Aryan"
migration and subsequent spread of Indo-European languages, appearance of iron and Painted Grey
Ware culture in North West frontier (Cavalli-Sforza et al. 1994). However, antiquity and
geographic origin of this lineage still remain contentious. Our study reveals that this lineage
is present in a significant proportion among the Indian Indo-European speakers, and is
proportionately high among upper caste groups (Table 2a). The dispersion of this lineage into
the southern tribal groups (Kivisild et al. 2003; Cordaux et al. 2004) and the fact that it is
proportionately distributed between the Dravidian and Indo-European tribal groups provide
significant evidence against any major influx of Indo-European speakers that could have
drastically changed the Indian male gene pool (Sahoo et al. 2006; Sengupta et al. 2006).

The high average STR variance (0.896) and TMRCA support a rapid population growth and
expansion of M17 Y-chromosomes, which contributed M17 lineages both to Central Asian nomads and
South Asian tribes much before the Indo-European introgression into India. Another sub-lineage
of M173, R1b3-M269, is present at an appreciable frequency of 14.5% in Turkey (Cinnioglu et al.
2004) and at considerable frequency in Europe (Cruciani et al. 2002), while it is detected at
relatively low frequency (1.9%) in India, substantiating a recent and limited admixture with
west Europeans.

The observed high frequency of R2 Y-chromosomes in Indians, which is equivalent to that of
haplogroup H among Dravidian speakers, corroborates previous reports suggesting its Indian
origin (Cordaux et al. 2004). The deep coalescence time for R2 lineages, dating back to Late
Pleistocene, supports its indigenous origin. Outside India, it is found in Iran and Central Asia
(3.3%) and among Roma Gypsies of Europe, known to have historical evidence of their migration
from India (Wells et al. 2001). Within India, while it is predominant in both eastern and
southern regions, its distribution pattern is rather patchy in east (Sahoo et al. 2006). It is
most likely that genetic drift or bottleneck has reduced the paternal diversity of Karmali,
which contributes 28% of the eastern R2 lineages. This population, although considered to be
Austro-Asiatic speaking, does not present any evidence of O2a Y-chromosome lineage, portraying
a distinctly different history.

On an average, the patterns of NRY haplogroup variation of Indians reflect that populations
of the subcontinent are not very distinct from each other and probably share a few common
paternal ancestors. The lineage diversity was small for Austro-Asiatic and Tibeto-Burman
speakers and most of them harbored a single lineage, indicating a founder paternal source for
these endogamous groups, which are confined to the eastern and north-eastern regions of India.
Haplogroup diversities were rather high for populations of south India (average of 0.740),
giving concordant evidence of a relatively early settlement, growth and expansion of populations
living in southern India.

Traces of Ancient Migration of Modern Humans

Recent studies provide substantial evidence in favor of the southern route hypothesis for
the dispersal of modern humans ~60-75 kya from the horn of Africa along the tropical coast of
Indian Ocean to reach insular South East Asia and Oceania (Cann 2001; Stringer 2000). A strong
Y-chromosome support to this model is the distribution of haplogroup C lineages in Asia
(Kivisild et al. 2003), Australo-Melanesia and North America (Karafet et al. 1999). Although
present in low frequencies in the Indian subcontinent (Bamshad et al. 2001; Sengupta et al.
2006; Kivisild et al. 2003; Cordaux 2004; Wells 2006; Ramana et al. 2001), it is largely
distributed along the coastal regions, with a few patchy occurrences in Punjab. However, the
persistence of M130 lineages mostly among the south Indians, Pakistan (Qamar et al. 2002) and
Sri Lanka (Kivisild et al. 2003) provides indirect evidence in support of the southern route of
migration by early modern humans. We have previously suggested that lack of haplogroup C
sub-lineages (M217, M38 and M8) is indicative of the indigenous origin of most Indian
populations and argues against the theory of Aryan migration from Central Asia (Sahoo et al.
2006). The present analysis showed none of the deletions (DYS390.1 or DYS390.3) associated with
Australian or Polynesian C* chromosomes, contesting the claims of a link between India and
Australian aboriginals (Redd et al. 2002). The age estimate of approximately 49 KYA in Indian
samples indicates a probable Indian origin of this lineage. However, until further analysis
and age estimates in other world populations are known, it cannot be conclusively proven if
RPS4YT mutation arose in India or arrived with the earliest migrants after it arose somewhere in
west Asia, from where it was finally lost or diluted during the Upper Paleolithic expansion of
modern humans (Underhill et al. 2001).

Genesis of Caste Structure and Influence of Migrations on the Indian Gene Pool

The main feature of Indian society is that it is highly structured by social factors such
as caste system, in which birth determines the position in the society, mode of subsistence
(occupation), and choice of marriage partners. However, genesis of caste system in India is
ambiguous, since many of the caste groups are known to have tribal origins (Kosambi 1964).
Further, the migrations of Greeks, Huns, Arabs, Chinese, Turks, Persians, Portuguese and others
have made understanding the nature of population structure more complex. mt DNA analysis from
different geographic regions and social status showed that maternal haplogroups in India are
derived from a limited number of founder lineages of M and N clades, supporting a common
proto-Asian ancestry with limited gene flow from later migrants (Kivisild et al. 2003; Basu et
al. 2003). Our study reveals that there is virtually no genetic difference in the Y-chromosomes
between the caste groups and tribes (Table 5). Whatever minor difference is present is largely
due to haplogroup O2a, contributed exclusively by the Austro-Asiatic and Tibeto-Burman tribes.
Our present analysis (AMOVA and haplogroup frequency distribution in populations excluding the
Austro-Asiatic tribes of Jharkhand and Orissa and Tibeto-Burman tribes from Northeast) provides
congruent evidence in support of the hypothesis that populations in India largely derive their
gene pool from the common Pleistocene settlers. High frequency of J2 and R1a1 lineages mirrors
a greater influence of Indo-European migrants on upper caste populations of Gangetic plains
compared to the peninsular southern regions. However, these skewed frequencies also suggest that
the indigenous populations received limited external gene flow from Europe, Central and West
Asia. This is also supported by mt DNA haplogroups that depict Indian-specific lineages with a
limited contribution from both west and east Eurasian populations (Metspalu et al. 2004). A
similar trend of J2 and R1a1 among caste populations would probably provide a simplistic
assumption that agriculture was brought along with caste system by the Indo-European speakers as
a result of demic diffusion of early farmers from southwestern Iran, Fertile Crescent and
Anatolia (Quintana-Murci et al. 2001; Cordaux et al. 2004). However, the absence among Indians
of other Neolithic markers of early farmers, M35 and M201, that are prevalent in Europe,
Anatolia, South Caucasus and Iran (Semino et al. 2000; Underhill et al. 2001), in addition to
the frequency of M172 in southern and western India and its persistence in south India and
tribal groups (Table 2a), questions the validity of this hypothesis. Agriculture in India
probably arose as two independent events; one that was a consequence of earliest migration that
brought the Dravidian speakers and another much later through spread of rice cultivators from SE
Asia (Fuller 2003; Diamond et al. 2003).

Insights into Origin of Austro-Asiatic and Tibeto-Burman speakers

The origin of two language families, Austro-Asiatic and Tibeto-Burman, in India is of
particular interest and has received considerable attention (Basu et al. 2003; Cordaux et al.
2004). In the present study, analysis of eleven Austro-Asiatic and seven Tibeto-Burman tribes
from the eastern and north-eastern region of India establishes that the male gene pool of these
groups is distinctly different from other mainland tribal populations (Table 2a). An overall low
Y-STR haplotype diversity and complete fixation of O2a in some of the aboriginal tribes (Ho,
Santhal, Juang, Birhor and Munda) suggest that these tribes probably experienced a major
demographic event, such as a common founder effect followed by a bottleneck that greatly reduced
the Y-chromosome diversity in the Austro-Asiatic tribes of eastern India (Table 1). In contrast
to Austro-Asiatic tribes, the Tibeto-Burman tribes harbor both O2a and O3e lineages in their
Y-chromosomes. Interestingly, the two branches of the Tibeto-Burman language, Himalayish and
Naga-Kuki-Chin, could be distinctly identified from their Y-chromosomes. The former depicts
influence of Tibetan gene pool, marked by the presence of Haplogroup D lineages in Bhutia and
Tharu (Sahoo et al. 2006), while the other linguistic branch harbors O3e lineages. The
predominance of O haplogroup and its sub-
lineages in populations of East Asia suggests a SE Asian origin of Indian Austro-Asiatic and
Tibeto-Burman speakers. We hypothesize that the Tibeto-Burman speakers came as a number of
migratory events, while the Austro-Asiatic tribes probably arrived in India as a single event.
That the two groups probably migrated into India at different time periods is evident from the
absence of O3e lineages among Austro-Asiatic speakers, which probably are the earliest
immigrants of the two. Presence of an Austro-Asiatic speaking tribe, Khasi, among the
Tibeto-Burman speaking neighbors in the northeast corroborates this assumption. While the
Tibeto-Burman speakers brought in a number of East Asian maternal lineages (A, B5b, F1b, M8c,
M8z) (Metspalu et al. 2004), absence of these lineages in Austro-Asiatic tribes (Thangaraj et
al. 2005; Sahoo 2006a) portrays two different scenarios: either the earlier exodus from South
East Asia was probably a major male-mediated migration into India, or the female gene pool of
the migrating East Asians is completely lost among the Austro-Asiatic tribes. Additional
confirmation to this hypothesis is provided by evidence of agricultural expansions from their
homelands in China, at different times and over different geographic ranges. Austro-Asiatics are
presumed to have spread west and south from southern China into the Indian subcontinent and
Malay Peninsula and brought rice cultivation with them (Higham 2003; Bellwood 2004). The genetic
evidence revealed in this study is consistent with anthropological records (Guha 1935), which
suggest that Sino-Tibetans dispersed from the Yellow River and came into India through two
different routes; one from Burma probably brought the Naga-Kuki-Chin language and O3e
Y-chromosomes and the other from Himalayas, which carried the YAP lineages into northern regions
of the subcontinent.

Age of Human Occupation in India - Austro-Asiatic or Dravidians as First Settlers?

Based on socio-cultural and linguistic evidence (Thapar 1995; Pattanayak 1998) and results
based on mt DNA HVSI nucleotide diversity and highest frequencies of mitochondrial M haplogroup
(Roychoudhary et al. 2001; Basu et al. 2003), it was asserted that Austro-Asiatic tribes are the
earliest settlers in India. The present comprehensive Y-chromosome analysis, which includes
populations of all linguistic and socio-ethnic affiliations, however, suggests people of south
India as the original settlers of the subcontinent. The total lineage diversity and distribution
of Indian-specific Y-chromosome haplogroups (H, L, C, R1a1 and R2) in different geographical and
socio-linguistic layers of the Indian populations provide substantial support in favor of this
hypothesis. This theory also gathers adequate evidence from presence of the coastal marker,
RPS4Y, in the south Indian tribes, who probably represent remnants of the modern human migration
out of Africa that took the southern route to Australia. Any possibility that Austro-Asiatic
speakers could have dispersed from India is also eliminated based on the differential
distribution of O2a Y-chromosomes in southern China and India and the complete absence of
East-Asian specific mt DNA lineages in Austro-Asiatic and Dravidian speakers of India. mt DNA
haplogroups of Indian Austro-Asiatic speakers are instead probably a sub-group of their
Dravidian neighbors (unpublished data, Kashyap et al.). Recent archeological and linguistic
evidence corroborates a Neolithic expansion of Austro-Asiatic languages from Yangtze River basin
(Higham 2003) and our present study supports an east-west clinal expansion of Austro-Asiatic
males from South East Asia, which was not associated with any female gene flow. Further, deeper
coalescence age for the Y-chromosome haplogroups C, H, R2 compared to O2a is consistent with the
hypothesis that Austro-Asiatic speakers cannot be considered as the earliest settlers of South
Asia.

CONCLUSIONS

We find that genetic variation in India is characterized by a high Y-chromosome diversity,
which is reflected by a greater correspondence with linguistic groups of India. Our results
demonstrate India as a hotspot both as an important source and recipient of major Y-chromosome
lineages of the world. Haplogroup distribution and AMOVA results provide tandem evidence in
support of a common Pleistocene origin of Indian populations, which was subsequently followed by
migrations of Austro-Asiatic speaking tribal males from SE Asia. The Tibeto-Burman populations
were later migrants who took two different routes and carried both male and female lineages
specific to East Asia. Based on deep coalescence age estimates of H, R2 and C Y-
chromosome lineages, their diversity and distribution pattern, our data suggest an early
Pleistocene settlement of South Asia by Dravidian speaking south Indian populations; the
Austro-Asiatic speakers migrated much later from SE Asia and probably contributed only paternal
lineages while amalgamating with the aboriginal populations of the region.

ACKNOWLEDGEMENTS

We express our appreciation to all the original donors who made this study possible. This
study was made possible through facilities provided at CFSL, Kolkata. We acknowledge all
researchers whose valuable data were used for this study. SS, AS, JB, MT, SG, RR and RA are
grateful to the Directorate of Forensic Sciences, MHA for the Senior Research Fellowship. GHB
and TS are recipients of Senior Research Fellowship from CSIR, India. This research was
supported by a financial grant to CFSL, Kolkata under the Xth Five Year Plan of the Govt. of
India.

Electronic-Database Information

URLs for the data mentioned in this article are as follows:
XL STAT pro 7.5, http://www.xlstat.com
Network 4.1, http://www.fluxus-engineering.com
http://www.ethnologue.com

REFERENCES

Bamshad M, Kivisild T, Watkins WS, Dixon ME, Ricker CE, Rao BB, Naidu JM, Prasad BV, Reddy PG,
Rasanayagam A, Papiha SS, Villems R, Redd AJ, Hammer MF, Nguyen SV, Carroll ML, Batzer MA,
Jorde LB 2001. Genetic evidence on the origins of Indian caste populations. Genome Res, 11:
994-1004.
Bandelt HJ, Forster P, Rohl A 1999. Median-joining networks for inferring intraspecific
phylogenies. Mol Biol Evol, 16: 37-48.
Basu A, Mukherjee N, Roy S, Sengupta S, Banerjee S, Chakraborty M, Dey B, Roy M, Roy B,
Bhattacharyya NP, Roychoudhury S, Majumder PP 2003. Ethnic India: A genomic view, with special
reference to peopling and structure. Genome Res, 13: 2277-2290.
Bellwood P 2004. Tracking the spreads of farming beyond the fertile Crescent: Europe and Asia.
In: First Farmers:
Cann RL 2001. Genetic clues to the dispersal of human populations: Retracing the past from the
present. Science, 291: 1742-1748.
Cavalli-Sforza LL, Menozzi P, Piazza A 1994. The History and Geography of Human Genes.
Princeton University Press, Princeton. pp. 208-213.
Cinnioglu C, King R, Kivisild T, Kalfoglu E, Atasoy S, Cavalleri GL, Lillie AS, Roseman CC, Lin
AA, Prince K, Oefner PJ, Shen P, Semino O, Cavalli-Sforza LL, Underhill PA 2004. Excavating
Y-chromosome haplotype strata in Anatolia. Hum Genet, 114: 127-148.
Cordaux R, Aunger R, Bentley G, Nasidze I, Sirajuddin SM, Stoneking M 2004. Independent origins
of Indian caste and tribal paternal lineages. Curr Biol, 14: 231-235.
Cordaux R, Deepa E, Vishwanathan H, Stoneking M 2004. Genetic evidence for the demic diffusion
of agriculture to India. Science, 304: 1125.
Cordaux R, Weiss G, Saha N, Stoneking M 2004. The northeast Indian passageway: a barrier or
corridor for human migrations? Mol Biol Evol, 21: 1525-1533.
Cruciani F, Santolamazza P, Shen P, Macaulay V, Moral P, Olckers A, Modiano D, Holmes S,
Destro-Bisol G, Coia V, Wallace DC, Oefner PJ, Torroni A, Cavalli-Sforza LL, Scozzari R,
Underhill PA 2002. A back migration from Asia to sub-Saharan Africa is supported by
high-resolution analysis of human Y-chromosome haplotypes. Am J Hum Genet, 70: 1197-1214.
Deraniyagala SU 1992. The Prehistory of Sri Lanka: An Ecological Perspective. Colombo:
Department of The Archeological Survey, Government of Sri Lanka.
Diamond J, Bellwood P 2003. Farmers and their languages: The first expansions. Science, 300:
597-602.
Excoffier L, Smouse PE, Quattro JM 1992. Analysis of molecular variance inferred from metric
distances among DNA haplotypes: application to human mitochondrial DNA restriction data.
Genetics, 131: 479-491.
Forster P, Rohl A, Lunnemann P, Brinkmann C, Zerjal T, Tyler-Smith C, Brinkmann B 2000. A short
tandem repeat-based phylogeny for the human Y chromosome. Am J Hum Genet, 67: 182-196.
Fuller D 2003. An agricultural perspective on Dravidian historical linguistics: archaeological
crop packages, livestock and Dravidian crop vocabulary. In: P Bellwood, C Renfrew (Eds.):
Examining the Farming/Language Dispersal Hypothesis. McDonald Institute for Archaeological
Research, Cambridge. pp. 191-213.
Guha BS 1935. The racial affinities of the people of India. In: Census of India, 1931, Part
III-Ethnographical.
Higham C 2003. Languages and farming dispersals: Austro-Asiatic languages and rice cultivation.
In: P Bellwood, C Renfrew (Eds.): Examining the Farming/Language Dispersal Hypothesis. McDonald
Institute for Archaeological Research, Cambridge.
James HVA, Petraglia MD 2005. Modern Human origins
The Origins of Agricultural Societies. pp 87 and the evolution of behavior in the later Pleistocene
Butler JM, Schoske R, Vallone PM, Kline MC, Redd AJ, Record of South Asia. Curr Anthropol, 46 Supp: S3-
Hammer MF 2002. A novel multiplex for simul- S27
taneous amplification of 20 Y chromosome STR Karafet T, Xu L, Du R, Wang W, Feng S, Wells RS, Redd
markers. Forensic Sci Int, 129: 10-24. AJ, Zegura SL, Hammer MF 2001. Paternal
Population History of East Asia: Sources, Patterns and Microevolutionary Processes. Am J Hum Genet, 69: 615-628.
Karafet TM, Zegura SL, Posukh O, Osipova L, Bergen A, Long J, Goldman D, Klitz W, Harihara S, de Knijff P, Wiebe V, Griffiths RC, Templeton AR, Hammer MF 1999. Ancestral Asian source(s) of new world Y-chromosome founder haplotypes. Am J Hum Genet, 64: 817-831.
Kennedy K 2000. God, Apes and Fossil Men: Paleoanthropology in South Asia. Ann Arbor: University of Michigan Press.
Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K, Parik J, Metspalu E, Adojaan M, Tolk HV, Stepanov V, Golge M, Usanga E, Papiha SS, Cinnioglu C, King R, Cavalli-Sforza L, Underhill PA, Villems R 2003. The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. Am J Hum Genet, 72: 313-332.
Kosambi DD 1964. The Culture and Civilization of Ancient India in Historical Outline, New Delhi: Vikas Publishing House Pvt. Ltd.
Macaulay V, Hill C, Achilli A, Rengo C, Clarke D, Meehan W, Blackburn J, Semino O, Scozzari R, Cruciani F, Taha A, Shaari NK, Raja JM, Ismail P, Zainuddin Z, Goodwin W, Bulbeck D, Bandelt HJ, Oppenheimer S, Torroni A, Richards M 2005. Single, rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes. Science, 308: 1034-1036.
Metspalu M, Kivisild T, Metspalu E, Parik J, Hudjashov G, Kaldma K, Serk P, Karmin M, Behar DM, Gilbert MT, Endicott P, Mastana S, Papiha SS, Skorecki K, Torroni A, Villems R 2004. Most of the extant mtDNA boundaries in south and southwest Asia were likely shaped during the initial settlement of Eurasia by anatomically modern humans. BMC Genet, 5: 26.
Misra VN 2001. Prehistoric human colonization of India. J Biosci, 26: 491-531.
Nebel A, Filon D, Weiss DA, Weale M, Faerman M, Oppenheim A, Thomas MG 2000. High-resolution Y chromosome haplotypes of Israeli and Palestinian Arabs reveal geographic substructure and substantial overlap with haplotypes of Jews. Hum Genet, 107: 630-641.
Passarino G, Cavalleri GL, Lin AA, Cavalli-Sforza LL, Borresen-Dale AL, Underhill PA 2002. Different genetic components in the Norwegian population revealed by the analysis of mtDNA and Y chromosome polymorphisms. Eur J Hum Genet, 10: 521-529.
Pattanayak DP 1998. The language heritage of India. In: Balasubramanian and NA Rao (Eds.): The Indian Human Heritage. Hyderabad. pp: 95-99.
Qamar R, Ayub Q, Mohyuddin A, Helgason A, Mazhar K, Mansoor A, Zerjal T, Tyler-Smith C, Mehdi SQ 2002. Y-chromosomal DNA variation in Pakistan. Am J Hum Genet, 70: 1107-1124.
Quintana-Murci L, Krausz C, Zerjal T, Sayar SH, Hammer MF, Mehdi SQ, Ayub Q, Qamar R, Mohyuddin A, Radhakrishna U, Jobling MA, Tyler-Smith C, McElreavey K 2001. Y-chromosome lineages trace diffusion of people and languages in southwestern Asia. Am J Hum Genet, 68: 537-542.
Quintana-Murci L, Semino O, Bandelt HJ, Passarino G, McElreavey K, Santachiara-Benerecetti AS 1999. Genetic evidence of an early exit of Homo sapiens sapiens from Africa through eastern Africa. Nat Genet, 23: 437-441.
Ramana GV, Su B, Jin L, Singh L, Wang N, Underhill P, Chakraborty R 2001. Y-chromosome SNP haplotypes suggest evidence of gene flow among caste, tribe, and the migrant Siddi populations of Andhra Pradesh, South India. Eur J Hum Genet, 9: 695-700.
Redd AJ, Roberts-Thomson J, Karafet T, Bamshad M, Jorde LB, Naidu JM, Walsh B, Hammer MF 2002. Gene flow from the Indian subcontinent to Australia: Evidence from the Y chromosome. Curr Biol, 12: 673-677.
Renfrew C 1989. The origins of Indo-European languages. Sci Am, 261: 82-90.
Rosser ZH, Zerjal T, Hurles ME, Adojaan M, Alavantic D, Amorim A, Amos W, et al. 2000. Y-chromosomal diversity in Europe is clinal and influenced primarily by geography, rather than by language. Am J Hum Genet, 67: 1526-1543.
Roychoudhary S, Roy S, Basu A, Banerjee R, Vishwanathan H, Usha Rani MV, Sil SK, Mitra M, Majumder PP 2001. Genomic structures and population histories of linguistically distinct tribal groups of India. Hum Genet, 109: 339-350.
Sahoo S, Kashyap VK 2006. Phylogeography of mitochondrial DNA and Y-Chromosome haplogroups reveal asymmetric gene flow in populations of Eastern India. Am J Phys Anthropol, (in press).
Sahoo S, Singh A, Himabindu G, Banerjee J, Sitalaximi T, Gaikwad S, Trivedi R, Endicott P, Kivisild T, Metspalu M, Villems R, Kashyap VK 2006. A prehistory of Indian Y chromosomes: evaluating demic diffusion scenarios. Proc Natl Acad Sci USA, 103: 843-848.
Sambrook J, Fritsch EF, Maniatis T 1989. Molecular Cloning. A Laboratory Manual. 2nd Ed. CSHL Press, Cold Spring Harbor, NY.
Schneider S, Roessli D, Excoffier L 2000. ARLEQUIN ver 2.0.a software for Population Genetics Data Analysis. Geneva: Genetics and Biometry Laboratory, University of Geneva.
Semino O, Passarino G, Oefner PJ, Lin AA, Arbuzova S, Beckman LE, De Benedictis G, Francalacci P, Kouvatsi A, Limborska S, Marcikiae M, Mika A, Mika B, Primorac D, Santachiara-Benerecetti AS, Cavalli-Sforza LL, Underhill PA 2000. The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective. Science, 290: 1155-1159.
Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow CE, Lin AA, Mitra M, Sil SK, Ramesh A, Usha Rani MV, Thakur CM, Cavalli-Sforza LL, Majumder PP, Underhill PA 2006. Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of central Asian pastoralists. Am J Hum Genet, 78: 202-221.
Shi H, Dong YL, Wen B, Xiao CJ, Underhill PA, Shen PD, Chakraborty R, Jin L, Su B 2005. Y-chromosome evidence of southern origin of the East Asian-specific haplogroup O3-M122. Am J Hum Genet, 77: 408-419.
Singh KS 1998. India’s Communities. National Series. People of India. New Delhi: Oxford University Press.
Stringer C 2000. Coasting out of Africa. Nature, 405: 24-25.
Su B, Xiao C, Deka R, Seielstad MT, Kangwanpong D, Xiao J, Lu D, Underhill P, Cavalli-Sforza L, Chakraborty R, Jin L 2000. Y chromosome haplotypes reveal prehistorical migrations to the Himalayas. Hum Genet, 107: 582-590.
Thangaraj K, Sridhar V, Kivisild T, Reddy AG, Chaubey G, Singh VK, Kaur S, Agarawal P, Rai A, Gupta J, Mallick CB, Kumar N, Velavan TP, Suganthan R, Udaykumar D, Kumar R, Mishra R, Khan A, Annapurna C, Singh L 2005. Different population histories of the Mundari- and Mon-Khmer-speaking Austro-Asiatic tribes inferred from the mtDNA 9-bp deletion/insertion polymorphism in Indian populations. Hum Genet, 116: 507-517.
Thapar R 1995. The first millennium B.C. in the northern India. In: R. Thapar (Ed.): Recent Perspective of Early Indian History. Bombay. pp. 80-141.
Underhill P, Passarino G, Lin AA, Shen P, Mirazón Lahr M, Foley RA, Oefner PJ, Cavalli-Sforza LL 2001. The phylogeography of the Y chromosome binary haplotypes and the origins of modern human populations. Ann Hum Genet, 65: 43-62.
Underhill PA, Shen P, Lin AA, Jin L, Passarino G, Yang WH, Kauffman E, Bonne-Tamir B, Bertranpetit J, Francalacci P, Ibrahim M, Jenkins T, Kidd JR, Mehdi SQ, Seielstad MT, Wells RS, Piazza A, Davis RW, Feldman MW, Cavalli-Sforza LL, Oefner PJ 2000. Y chromosome sequence variation and the history of human populations. Nat Genet, 26: 358-361.
Wells RS, Yuldasheva N, Ruzibakiev R, Underhill P, Evseeva I, Blue-Smith J, Jin L, et al. 2001. The Eurasian heartland: A continental perspective on Y-chromosome diversity. Proc Natl Acad Sci USA, 98: 10244-10249.
Zhivotovsky LA, Underhill PA, Cinnioglu C, Kayser M, Morar B, Kivisild T, Scozzari R, Cruciani F, Destro-Bisol G, Spedini G, Chambers G, Herrera RJ, Yong KK, Gresham D, Tournev I, Feldman MW, Kalaydjieva L 2004. The Effective Mutation Rate at Y Chromosome Short Tandem Repeats, with Application to Human Population-Divergence Time. Am J Hum Genet, 74: 50-61.
