Razib Khan One-stop-shopping for all of my content

August 31, 2011

“Aryan invasion”

Filed under: Genetics,Indo-Aryans — Razib Khan @ 9:15 am

Estimating a date of mixture of ancestral South Asian populations:

Linguistic and genetic studies have shown that most Indian groups have ancestry from two genetically divergent populations, Ancestral North Indians (ANI) and Ancestral South Indians (ASI). However, the date of mixture still remains unknown. We analyze genome-wide data from about 60 South Asian groups using a newly developed method that utilizes information related to admixture linkage disequilibrium to estimate mixture dates. Our analyses suggest that major ANI-ASI mixture occurred in the ancestors of both northern and southern Indians 1,200-3,500 years ago, overlapping the time when Indo-European languages first began to be spoken in the subcontinent. These results suggest that this formative period of Indian history was accompanied by mixtures between two highly diverged populations, although our results do not rule out other, older ANI-ASI admixture events. A cultural shift subsequently led to widespread endogamy, which decreased the rate of additional population mixtures.

I will put a modest amount of money on the proposition that there were at least two admixture events, and that their LD-based methods are picking up the second, Indo-European one. If it was just one admixture event, then you have to accept the proposition that South Indian tribals are at least ~30% Indo-European in ancestry. Not impossible, but seems unlikely.
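To make the dating logic concrete, here is a minimal sketch of how an admixture LD curve translates into a date. It assumes the textbook single-pulse model, in which admixture LD between two markers decays as exp(-n·d), with d the genetic distance in Morgans and n the number of generations since mixture. This is not the authors' actual method or code; the function names, the noise-free toy curve, and the 29-year generation time are all my assumptions for illustration.

```python
import math


def admixture_ld(distance_morgans, generations, amplitude=1.0):
    """Expected admixture LD at a given genetic distance (single-pulse model)."""
    return amplitude * math.exp(-generations * distance_morgans)


def estimate_generations(distances, ld_values):
    """Fit log(LD) against distance by least squares.

    Under the single-pulse model the slope is -n, so the negated slope
    is the estimated number of generations since admixture.
    """
    logs = [math.log(v) for v in ld_values]
    count = len(distances)
    mean_d = sum(distances) / count
    mean_l = sum(logs) / count
    cov = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    var = sum((d - mean_d) ** 2 for d in distances)
    return -cov / var


# Hypothetical, noise-free curve for a pulse 100 generations ago,
# sampled from 0.1 cM to 5 cM (expressed in Morgans).
distances = [0.001 * k for k in range(1, 51)]
curve = [admixture_ld(d, 100) for d in distances]

fitted = estimate_generations(distances, curve)

YEARS_PER_GENERATION = 29  # assumed value, not taken from the abstract
years = fitted * YEARS_PER_GENERATION
print(f"{fitted:.1f} generations, roughly {years:.0f} years")
```

An older pulse would add a second exponential term that decays faster with distance, so a single-exponential fit of this kind cannot by itself tell one admixture event from two.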

May 3, 2011

A solution to the problem of ANI origins

Filed under: Culture,Indo-Aryans — Razib Khan @ 8:39 pm

Dienekes Pontikos has a post up, A solution to the problem of Indo-Aryan origins (part 2). He argues that Indo-Aryan origins are from the trans-Caucasian region, roughly. In other words, a demographic pulse from Anatolia. The argument is made with a PCA plot which shows how South Asian populations can be modeled as a two-way admixture between Dalits and northern West Asians.

He may be right. But what I am sure he is seeing is the signal of Ancestral North Indians, ANI. If ANI were Indo-Aryan then Colin Renfrew and Peter Bellwood’s thesis that Indo-European languages spread through the expansion of farmers from Anatolia would explain this. But, as Dienekes and Zack have both observed, the West Eurasian signal in South Asians does separate out at some point into distinct components. One of those minor components is clearly a European-affiliated one. It is elevated in Jatts in particular. I suspect that this is probably tied to the late Indo-Aryan migration. The ANI themselves may have spoken Dravidian, or an unknown language.

Anyway, my confidence in these models is weak at best.

December 17, 2010

South Asians too are sons of the farmers?

Filed under: Aryans,Genetic History,Genetics,Genomics,India,India Genetics,Indo-Aryans — Razib Khan @ 2:58 pm

I mentioned a few days ago that a friend was trying to get together some data to analyze the genetic variation of South Asians. By a strange coincidence Dienekes just published a more detailed analysis of South Asians…and uncovered something very interesting, though not that surprising. Some technical preliminaries:

A note of caution: The reduced marker set (~30k) means that a lot of noise is added in the admixture estimates. In particular, many individuals are likely to get low-level admixture from population sources that can be attributed to noise. But, as we will see, the small marker set does not really affect either the power of the GALORE approach, or of ADMIXTURE to infer meaningful clusters.

In addition to the various online sources of public data Dienekes got about a dozen South Asians. I was one of those South Asians, DOD075. In many ways I’m a rather standard issue South Asian, similar to Gujaratis, except that I have a substantial ‘East Asian’ component. More concretely, between 1/6 and 1/7 of my ancestry seems to be of eastern origin, far higher than the norm among South Asians. The rest of my ancestry was mostly South Asian specific, with a minor, but significant ‘West Asian’ component common across northern India.

Rerunning with more data and different samples, Dienekes came out with a different set of ancestral components. Of particular interest to me, he broke the East Asian component down into East Asian proper and Southeast Asian. Below are a selection of populations with ancestral components + me. I’ve also renamed a few components. North Kannadi = Dravidian and Irula = Indian tribal. Indian = Generic Indian. Looking at the Fst it seems that Indian endogamy and population bottlenecks have had an effect…look at the North Kannadi distance from everyone else.


Remember that in the previous analysis I was very similar to a Gujarati, except with an East Asian element. My supposition that my ancestry has some connection to Burma seems to be supported by these results. Looking at my balanced ratio between East Asian and Southeast Asian, that is what one might expect from someone of a Burman ethnicity. I am not saying that I have recent Burman ancestry per se. Rather, Ahom, Mizo, Chakma, and a range of tribal populations from the liminal zone between South and Southeast Asia may suffice. The main other option is that I have a great deal of Munda ancestry. Not implausible in light of the likelihood that Munda brought rice agriculture to northeast South Asia, and pre-date Indo-Aryans, and possibly Dravidians, in Bengal. How would I distinguish these possibilities? I’ve ordered 23andMe kits for both my parents. The most likely candidate for recent Southeast Asian ancestry is my paternal grandfather. If the admixture event was recent, if I have a recent ancestor(s) of “hill tribe” origin, I would expect to see more linked regions of East/Southeast Asian origin than if the admixture was ancient (and so distributed more equitably across DNA strands due to recombination).
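The reasoning about linked regions can be put in rough numbers. The sketch below assumes a deliberately simple model: after g generations, recombination has chopped the introgressed ancestry into tracts whose lengths are roughly exponential with a mean of 100/g centimorgans. It ignores the ancestry proportion, chromosome ends, and the fact that the approximation is crude for very small g. The function names and the two example dates are hypothetical.

```python
import math


def mean_tract_length_cm(generations):
    """Approximate mean ancestry tract length in centimorgans (100 / g)."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 100.0 / generations


def fraction_of_tracts_longer_than(length_cm, generations):
    """Share of tracts longer than length_cm under an exponential tract model."""
    return math.exp(-length_cm / mean_tract_length_cm(generations))


# Two hypothetical scenarios: a recent ancestor versus ancient admixture.
for label, g in [("recent (3 generations)", 3), ("ancient (100 generations)", 100)]:
    mean_len = mean_tract_length_cm(g)
    long_share = fraction_of_tracts_longer_than(10.0, g)
    print(f"{label}: mean tract {mean_len:.1f} cM, "
          f"share of tracts over 10 cM = {long_share:.5f}")
```

Under these assumptions, a recent ancestor leaves a handful of long East/Southeast Asian segments, while ancient admixture leaves many short ones scattered across the genome, which is the contrast I would look for once my parents' results are in.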

But the bigger point of Dienekes’ post is what he terms “Dagestani” ancestry across much of Eurasia. I’ll quote him:

The most exciting thing, however, is the fact that the origins of a part of the West Asian component of my previous analyses can be partially located: it is the purple component centered in Dagestan, i.e., among Northeast Caucasian speakers such as Lezgins, and the Dargins who inhabit Urkarah.

Readers of this blog may remember the surprising appearance of this Lezgin-specific component in the Balkans (but not Greeks) a few weeks ago. Now it has turned up as a substantial component in India as well.

Back then, I speculated that this component may derive from a prehistoric population that was spread in (but not limited to) the northern arc of the Black Sea from the Balkans to the Caucasus. Even in this analysis, you can see that both Romanians and Hungarians have some of it, and so do Lithuanians and Belorussians, while Tuscans (like the Greeks of my previous experiment) do not.

Hence, this component stretches from at least the Baltic to India, but is largely absent in southern Europe. I will go out on a limb and propose that this component is representative of a non-Indo-European component in the ancestors of the Indo-Iranians.

Paul Conroy observes that on this finer-grained analysis I don’t have any “West Asian” at all. What had previously been West Asian turns out to have been, in my case, a compound of Dagestani + European. I can’t say that I’m that surprised by this. Years ago I noticed that HGDP STRUCTURE analyses were always giving suggestive signs of a connection between West-Central Eurasia and South Asia.

Who were the Indo-Iranians? I lean toward the proposition that they do derive from the Andronovo culture of the Eurasian steppe. This would date the entrance and expansion of Indo-Aryans in northern India to 3-4,000 years ago. I also contend that the dominant element of ancestry among modern South Asians is not Indo-Aryan. Rather, it is an ancient stabilized hybrid of pre-agricultural societies in the Indus valley and Neolithic farmers who originated from what is today western Iran and eastern Anatolia. Therefore, I posit that the “Aryanization” of the Indian subcontinent is properly modeled as the same process which led to the emergence of an Anatolian and Rumelian Turkish identity: a small elite population which forces an identity shift among the majority.

Back to farming:

As I’ve remarked in the past, Eurasia can be broadly seen as the playground of three major groups of people: the Caucasoids of the West, the Mongoloids of the East, and a southern group of people which is most strongly represented in South Asia, but whose presence can be detected in Southeast Asia as well, although in the latter case it has been marginalized and/or absorbed by the arrival of Mongoloids.

This southern group of people has sometimes been called “Australoid” because of its perceived resemblance to Australo-Melanesians. Indeed, in my K=5 mega-analysis an affinity between Papuans/Melanesians and people of South and Southeast Asia is apparent. These “Australoids” are very old populations, probably stemming from the early Out-of-Africa coastal dispersal route, and we shouldn’t be tricked by their phenotypic similarity into thinking that different groups of them are particularly close genetically. Just as “black Africans” are not the same, neither are the “Australoids” and mixed-”Australoids” at the shores of the Indian Ocean.

It is probably the invention of agriculture that is responsible for their marginalization. In Africa, the Pygmies and Bushmen have been absorbed or pushed aside by the demographic Bantu juggernaut, with a few other language groups also hitching a ride on the agriculture/pastoralism economy. In West Eurasia, where agriculture was invented earliest, pre-agricultural populations left no traces. In East Eurasia, the agriculturalists could not expand to the far north where many relic populations exist, but they could (and did) move to the south where they assimilated or drove away pre-existing populations, leaving a few of them, like the Taiwanese Atayal, as partial remnants of the older population stratum.

The Irula are South Indian tribals, so they are the closest one can get to South Asian autochthons, and yet even they presumably have a large minor component of “Ancestral North Indian.” The tribal groups in Reconstructing Indian Population History all exhibited proportions on the order of ~40% ANI. It seems that agriculture “stalled” in the Indus valley and the highlands to the west for thousands of years in South Asia. During this period of stalling I believe that the farmers absorbed a great deal of genetic material from the indigenous hunter-gatherers, and so produced a “distinctive” Indian genetic profile. More West Eurasian than not, but with a very large dollop of the ancient substrate of southern Eurasia which had a distant, but closer, affinity with that of East Asia. Once social and cultural forces allowed for the rapid expansion of farmers there was a wave of advance from the Indus valley east and south. In the east the proto-Indians would have encountered drifting Mundari-speaking groups who practiced rice agriculture, which they also adopted. In the south the proto-Indians would have encountered more hunter-gatherers. Many of the tribal people in India are today facultative hunter-gatherers, herders, and extensive farmers. I believe that these marginal proto-Indian groups assimilated hunter-gatherers more easily than would have otherwise been the case because some of the proto-Indians reverted to a hunter-gatherer lifestyle in the agriculturally unsuitable highlands of the Deccan and Chota Nagpur. The social boundaries in the uplands of South India were such that the line between hunter-gatherer and farmer was more fluid than elsewhere, explaining the former’s greater genetic impact through intermarriage and assimilation.

This sort of general dynamic probably applies to Indo-Europeans. There is no reason why the original Indo-European tribes could not have been compounds who picked up different ancestral components in their peregrinations. Compare the various Turkic peoples: Anatolian Turks, Chuvash, and Yakut. All of them have affinities with nearby peoples, despite having a common Turkic culture and genetic component. One notable trend in Europe is that while the French have a minor, but significant West Asian component, the Basque have none of it. Dienekes’ sample is small, but it looks as if Scandinavians have more of this than the Finns. This West Asian component may not have been the dominant one among the Indo-Europeans, but I suspect it was a significant one. If the original speakers of proto-Indo-European did not have it, they likely absorbed it early on, just as the West Asians absorbed a native South Asian element in the Indus valley.

Finally, as a general rule of thumb, I would now suggest that the primary way in which hunter-gatherer genes can persist is through an ecological stall on the part of farmers. During the stall gene flow naturally occurs, probably through exchange of females (coercive or not), or the integration of hunter-gatherer males into war-bands or as slaves. Over time the farmers on the frontier have changed genetically, so that when they start expanding rapidly due to a technological or cultural innovation, they share more with the hunter-gatherers whom they supersede than they otherwise would have.

March 27, 2010

The science of human history as written by Herodotus

The following passage is from the epilogue of The Real Eve: Modern Man’s Journey Out of Africa by Stephen Oppenheimer:

In this book I have offered a synthesis of genetic and other evidence. Everything points to a single southern exodus from Eritrea to the Yemen, and to all the non-African male and female gene lines having arisen from their respective single out-of-Africa founder lines in South Asia (or at least near the southern exit). I regard the genetic logic for this synthesis as a solid foundation, and I have based the rest of my reconstruction of the human diaspora upon it. Obviously, the ‘choice’ of starting point (mine or theirs) determined all the subsequent routes our ancestors and cousins took. Tracing the onward trails is only possible as a result of marked specificity in regional distribution of the genetic branches. The geographic clarity of both male and female gene trees is a big departure from the fuzzy inter-regional picture shown by older genetic studies. The degree of segregation of lines into different countries and continents is in itself good evidence that once they got to their chosen new homes, the pioneers generally stayed put, at least until the Last Glacial maximum forced some of them to move. This conservative aspect of our genetic prehistory also provides a partial explanation for the fact that when we look at a person, we can usually tell, to the continent, where their immediate ancestors came from, and underlies differences that some of us still call ‘race.’

Oppenheimer wrote the above in the early aughts, as his book was published in 2003. Much of this is generally in line with the ‘orthodoxy’ of the day. I believe that Oppenheimer’s assertion that there was one southern migration out of Africa by anatomically modern humans has gained some advantage over the alternative model of two routes, northern and southern, over the past ten years (Spencer Wells’ The Journey of Man sketches out the two wave model). Other assertions and assumptions have not stood the test of time. In particular, I would contend that generally the ‘conservative aspect of our genetic prehistory’ can no longer be taken for granted. Specifically, it seems likely now that much occurred after the Ice Age and during the Neolithic.

The false inferences of the early aughts were due to two primary problems. First, they relied heavily on the powerful new techniques of extraction and analysis of uniparental lineages: the male and female direct lines of descent. Concretely, mtDNA and the nonrecombinant portion of the Y chromosome. The lack of recombination allows for relatively easy reconstruction of phylogenies assuming a coalescent model. Second, the inferences attempt to make connections between the patterns of variation in modern populations, and what one may infer about the past from those patterns. Obviously constructing a phylogeny, or plotting haplogroup frequencies as a function of geography, is rather straightforward science. But using these results to generate inferences of the past is often more of an art than a science, and implicit assumptions lurk behind the causal chains. Consider for example the utilization of modern Anatolian (i.e., Turkish) genetic variation as a reference for the expansion into Europe of Neolithic farmers from the Near East. This of course presumes that modern Anatolians are a good proxy for ancient Anatolians. There are various suggestive reasons for why this is a plausible assumption, but assemble enough plausible assumptions, and rely on their joint likelihood, and you construct a very rickety machinery of possibility.

In early 2007 I began to have serious doubts about the orthodoxy of genetic conservatism. The primary trigger was the story of the Etruscans. Here is the crux of the issue: there are two models for the origins of the Etruscans, first, that they were the pre-Indo-European autochthons of Italy, or second, that they were migrants from the eastern Mediterranean, in particular Anatolia. The second may seem an outlandish hypothesis, but there were several tendrils of evidence to support it. But perhaps the ‘support’ which weighed most against it is the fact that the Anatolian model has an ancient source, the Greek historian Herodotus. I should perhaps put historian in quotes as well, because Herodotus is often viewed more as a repeater of myths, and derided by some as the ‘father of lies’ (in this he stands in sharp contrast to contemporary perceptions of the ‘modern’ Thucydides, though revisionists have begun to challenge this narrative). In contrast, the model that Etruscans are indigenous to Italy, and that their ‘exotic’ foreign traits were simply acquired through trade and cultural diffusion, dovetailed well with the post-World War II ‘pots not peoples’ paradigm, which held that cultural change was ubiquitous, while at the same time populations were immobile. It was boring, prosaic, and conservative, and so an ideal null hypothesis.

But here it turns out that Herodotus was right, and archaeologists were wrong. Genetic analysis of modern Tuscans from isolated villages shows that some are surprisingly closely related to extant eastern Mediterranean lineages. Genetic analysis of Tuscan cattle showed that they were surprisingly closely related to extant eastern Mediterranean lineages of cattle. Finally, extraction of ancient Etruscan DNA showed that they were closely related to extant eastern Mediterranean lineages. The overlap was often with Anatolia, and combined with fragmentary linguistic and archaeological data, the evidence clearly points to an exogenous origin for the Etruscans. The boring null hypothesis was wrong. After these genetic stories gained prominence I went and reread recent archaeological texts on the Etruscans, and there were many models which showed exactly how Etruscan cultural uniqueness derived back to prehistoric Italy. It seems in hindsight that the prior assumption served as an interpretative filter, and people saw patterns that they were primed to see based on what they ‘knew’ to be the history of prehistoric and early Iron Age Tuscany.

Of course to refute the primacy of Oppenheimer’s conservative model of genetics one has to offer more examples than that of the Etruscans, and in particular, examples which are of greater scope and weight. I believe those examples exist. In the early aughts, based on the mtDNA evidence, the likelihood was that South Asian genetic variation is by and large a product of changes wrought upon the basic elements extant in the region around the end of the last Ice Age. The Y chromosomal data was more confused, though it did imply a closer relationship to groups in western Eurasia. But based on the mtDNA Oppenheimer posited a model whereby India was the mother of all non-Africans, that is, all non-African lineages derived from roots within the Indian subcontinent before the Last Glacial Maximum. This is at sharp variance with colonialist narratives of an Aryan invasion of the subcontinent, and the subjugation of the natives by quasi-European overlords, who are the ancestors of the modern upper castes. The charged ideological import of this model is transparently obvious.

Unfortunately the reality is likely more complex. I suspect that some form of Oppenheimer’s model is correct, insofar as South Asia was likely an important way station for modern humans as they left Africa, and pushed into other regions of Eurasia, on to Australasia and the New World. This interpretation does gain support from mtDNA, the direct maternal lineage. But a new analysis of South Asian genetic variation using a substantial proportion of the autosomal genome implies in fact that South Asians are possibly descendants of an ancient hybridization event between a native population with deep roots in the subcontinent, and a quasi-European population which was exogenous to the subcontinent.* Genetically the quasi-European population is quite close to northern Europeans, similar to the genetic distance between modern Finns and Italians, not trivial, but far closer than that between modern South Asians and Europeans. Was this the ancient Aryan invasion? I remain skeptical of this particular detail for various reasons, as I suspect that the history of the Indian subcontinent is in fact even more complex than has been assumed before (I think it is more likely that the quasi-Europeans came before the Indo-Aryans, who arrived late, and had a stronger cultural than genetic influence).

Finally, there is another region of the world where it seems likely that the old orthodoxies of genetic conservatism will be overthrown. That region is Europe. The scientific orthodoxy of deep time continuity is strong enough that it has percolated into the public consciousness; the leader of the British National Party even referred to the deep roots of white British in demarcating who he believed the ‘indigenous people’ of the Isles were. But newer data is more supportive of the hypothesis that in fact Neolithic farmers who arrived from elsewhere are the likely ancestors of most Europeans, not the hunter-gatherers who remained after the Ice Age. Extraction of ancient DNA has yielded a set of results which simply are not explicable assuming the older models of genetic continuity, which were based on inferences made from modern population variation. If I had to hazard a guess, I would have some, though not high, confidence in the following story. First, the indigenous hunter-gatherers are assimilated or marginalized by waves of Neolithic farmers pushing out from the eastern Mediterranean. The demographic expansion does not necessarily sweep outward along a southeast-northwest axis, rather, it follows the Mediterranean and Atlantic fringes, as well as along river systems in the interior. Its impact is weakest in the northeast of Europe, where Middle Eastern crops are least suitable, and the natives have the most time to absorb the cultural toolkit of the newcomers so as to resist their advance. Second, and far later, there was another wave pushing out from the region of the Ukraine to the Volga, likely the ancestors of the Indo-Europeans. Tentatively I would contend that these were the carriers of the Kurgan culture, and also brought the allele for lactase persistence.
Again, for ecological reasons the populations of the northeast Baltic and into the forests of northern Russia were most insulated from this push (and non-Indo-European languages persisted in Iberia down to Roman times, and specifically in the Basque-country down to modern times, though I suspect this is a function of distance). So modern European populations may be assumed to be tri-hybrid, first a synthesis of Middle Eastern farmers overlain upon the Paleolithic substrate, and second a synthesis of Indo-Europeans from the east overlain upon pre-Indo-European substrate. Unlike the case of India I suspect teasing out these patterns in modern populations is more difficult because the genetic distance between the three ancestral populations is far smaller than between the indigenous peoples of India before the quasi-Europeans arrived.

This leaves much of the world untouched by my speculations, but I believe showing that the genetically conservative null hypothesis is now in serious doubt in South Asia and Europe is sufficient to knock it from being a necessarily default assumption through which we must filter our interpretations. I do not believe that the reordering of human variation and the welter of population movement after the Ice Age was equivalent in effect to the Out of Africa migration, but I do believe that it was important enough to make the world of 2000 BCE very different from that of 15000 BCE in regards to genetic variation. In some cases, such as Central Asia from the Caspian to the Taklamakan the world of 2000 CE is fundamentally different from the world of 0 CE.

I will then end with a prediction, one in which I do not have much confidence, but which may no longer be wrong on the face of it with these new data in mind. Here is a passage from page 7 of Jared Diamond’s Guns, Germs, and Steel:

Initially, archaeologists considered the possibility that the colonization of Australia/New Guinea was achieved accidentally by just a few people swept to sea while fishing on a raft near an Indonesian island. In an extreme scenario the first settlers are pictured as having consisted of a single pregnant young woman carrying a male fetus…..

Let me stipulate that Diamond seems skeptical of the extreme model, but it illustrates the consensus that Australian Aboriginal populations are descended from the first settlers. That is, the modern populations of indigenous Australians are the direct descendants of those who swept Out of Africa along the fringe of the Indian ocean, through Southeast Asia, and arrived in Australia (more specifically, Sahul), on the order of 40 to 60 thousand years ago. From what genetic data I have seen this may be true. But I do not know of any extractions of ancient DNA, and it seems to me that the analysis of the phylogenetics of Australian Aboriginals is relatively sketchy. Therefore, I will suggest that within the last 10,000 years there has been a major new migration of people into Australia, and the modern range of genetic variation of Australian Aboriginals is significantly different from that of the populations of the Ice Age. I suggest this primarily because the dingo arrived within the last 10,000 years, more likely as recently as 4,000 years ago. With the expansion of the utility of ancient DNA extraction and analysis this question may be answered in the near future. I would still bet I’m wrong with the hypothesis I just offered, but I’m far less sure than I would have been 2 years ago.

Note: This post emerged from a conversation I had with Kevin Zelnio and Dave Munger.

* I say ‘quasi-European’ because the population may have origins outside of the boundaries of modern Europe at the Urals. Perhaps in western Siberia. Additionally, the idea of ‘Europe’ is relatively new, and exhibits little ancient cultural coherency.

