Razib Khan One-stop-shopping for all of my content

May 16, 2018

Migration at the roof of West Asia

Filed under: Historical Population Genetics,History,Indo-Europeans — Razib Khan @ 10:16 pm

The figure to the left is from The genetic prehistory of the Greater Caucasus. If you are a regular reader of this weblog, or Eurogenes, you can figure out what’s going on, and keep track of the terminology. But in 2018 I think we’re getting to the end of the line in making sense of “admixture graphs” in relation to West Eurasian population structure. The models are just getting too complicated to keep everything straight, and the assumption of distinct populations subject to pulse admixture may not necessarily hold.

To get a sense of what I’m talking about, the above preprint focuses on populations in and around the Caucasus region. One of the major reasons that this is important is that the Caucasus was and is to some extent a continental hinge, connecting Eastern Europe and the Pontic steppe, to the Near East. The Arab Muslims pushed north of the Caucasus, and came into conflict with the Khazars, while Cimmerians and Scythians moved south from the Pontic steppe.

The elephant in the room is the relevance to the “Indo-European controversy.” Colin Renfrew long ago posited that the Indo-European languages derive from West Asian farmers who expanded into Europe as early as ~9,000 years ago. A rival theory is that Indo-Europeans spread out of the Pontic steppe ~4,000 years ago. In 2015 two major papers suggested that the steppe was a major source of Indo-European expansion. Case closed? This preprint suggests perhaps not.

But we’ll get to that later. What do the results here show? The prose is a little hard to tease apart, but the major finding seems to be that in antiquity, or at least the period they’re focusing on, much of the gene flow ran from the south (Near East) to the north (through the Caucasus, and out to the north slope). To some extent, we already knew this: the Yamna people of the Pontic steppe have “southern” ancestry from the Near East that earlier East European/Pontic people do not. In this preprint, the authors show that groups such as the Maykop of the north slope of the Caucasus carry Y haplogroups such as G2, and not the R1 lineages commonly found in the steppe. David W. suggests that this confirms that Near Eastern gene flow into the steppe was female-mediated. This is plausible, but I would caution that Y chromosomes alone can be deceptive, due to the power of particular patrilineages. We’ll probably rely on the X chromosome to make a final judgment.

The plot below shows many of the relationships as a function of location and time. The green component is modal among “Iranian farmers,” the orange among “Anatolian farmers,” and the blue among “Western hunter-gatherers.”

A major aspect of this preprint is that it has to work hard to differentiate two Anatolian farmer-like signals: the first, from Anatolian farmers proper, and the second from the descendants of European farmers, who themselves are a mix of Anatolian farmers with a minority ancestry among the hunter-gatherers. The answers would probably be totally unintelligible if not for archaeology. It’s clear that the steppe people had contact with both European and Near Eastern farmers and that later East European groups that succeeded the Yamna were subject to reflux from Central Europe, and received European farmer ancestry.

Another curious nugget in their results is the early detection of both Ancestral North Eurasian (ANE) ancestry and some East Eurasian gene flow (related to Han Chinese). One of their individuals carries the East Eurasian variant of EDAR, which in Europe today is only found in Finns, though it was found in reasonable frequencies among the Motala hunter-gatherers of Scandinavia. Additionally, Fu et al. 2016 found that the ancestors of Mesolithic hunter-gatherers received some gene flow from Eastern Eurasians as well (also in the supplements of Lazaridis et al. 2016).

The authors admit that there is probably population structure among ANE and undiscovered groups of East Eurasians who were traversing the Inner Asian landscape. I think this is all suggestive of some long-distance contacts, though the intensity and magnitude increased a lot with high-density societies and the mobility of pastoralism.

Much of the genetic mixing in the Near East, and to some extent in the trans-Caucasian region, seems to date to the 4th millennium BC. This is technically prehistory, but it is also the Uruk period. This was a phase of Mesopotamian cultural expansion between 4000 and 3100 BC which resulted in replicas of Uruk-style settlements as far away as Syria and southeastern Anatolia. There is even evidence of Uruk-related migration to the North Caucasus.

The Uruk culture experienced an abrupt collapse. Uruk settlements outside of the core zone of Mesopotamia disappear.

It’s the final paragraph that warrants discussion:

The insight that the Caucasus mountains served not only as a corridor for the spread of CHG/Neolithic Iranian ancestry but also for later gene-flow from the south also has a bearing on the postulated homelands of Proto-Indo-European (PIE) languages and documented gene-flows that could have carried a consecutive spread of both across West Eurasia…Perceiving the Caucasus as an occasional bridge rather than a strict border during the Eneolithic and Bronze Age opens up the possibility of a homeland of PIE south of the Caucasus, which itself provides a parsimonious explanation for an early branching off of Anatolian languages. Geographically this would also work for Armenian and Greek, for which genetic data also supports an eastern influence from Anatolia or the southern Caucasus. A potential offshoot of the Indo-Iranian branch to the east is possible, but the latest ancient DNA results from South Asia also lend weight to an LMBA spread via the steppe belt…The spread of some or all of the proto-Indo-European branches would have been possible via the North Caucasus and Pontic region and from there, along with pastoralist expansions, to the heart of Europe. This scenario finds support from the well attested and now widely documented ‘steppe ancestry’ in European populations, the postulate of increasingly patrilinear societies in the wake of these expansions (exemplified by R1a/R1b), as attested in the latest study on the Bell Beaker phenomenon….

But instead of tackling this let’s focus on the paper that came out of the Willerslev group, The first horse herders and the impact of early Bronze Age steppe expansions into Asia. This is a final manuscript in Science. That means it was probably written before The Genomic Formation of South and Central Asia. When it comes to South Asia, the results from the two publications are consonant. There is no conflict.*

More interesting are the results in West Asia, and the linguistic supplement. First, the authors note that tablets now indicate an Indo-Aryan presence in Syria ~1750 BC. Second, Assyrian merchants record Indo-European Hittite, or Nesili (the people of Nesa), as early as ~2500 BC.

As suggested in earlier work Hittite remains don’t suggest steppe influence. David W. says:

The apparent lack of steppe ancestry in five Hittite-era, perhaps Indo-European-speaking, Anatolians was interpreted in Damagaard et al. 2018 as a major discovery with profound implications for the origin of the Anatolian branch of Indo-European languages.

But I disagree with this assessment, simply because none of these Hittite-era individuals are from royal Hittite, or Nes, burials. Hence, there’s a very good chance that they were Hattians, who were not of Indo-European origin, even if they spoke the Indo-European Hittite language because it was imposed on them.

The main point I’d bring up here is that in other areas steppe ancestry has spread deeply and widely into the population, including non-Indo-European ones. It is certainly possible that the sample is not representative enough to pick up the genuinely Hittite elite, but I probably lean toward the likelihood that the steppe signal won’t be found. It seems that the Anatolian languages were already diversified by ~2000 BC, and perhaps earlier. Linguists have long suggested that they are the outgroup to other Indo-European languages, though this could just be a function of their isolation among highly settled and socially complex populations.

Two alternative models present themselves for these results. First, the Anatolian Indo-European languages expanded through elite diffusion, part of the same general migrations that emerged out of the Yamna culture ~3000 BC. The lack of a steppe signal may be due to sampling bias, as David W. suggested, or, more likely in my opinion, simple dilution of the signal. Second, the steppe migrations were one part of a broader palette of population movements and cultural diffusions, and the Anatolian Indo-Europeans are basal to the efflorescence of the steppe-derived branches.

The evidence of the explosion of Indo-Aryans in the years after 2000 BC in West and South Asia, as well as the expansion of Iranians across vast swaths of Inner Asia during the same period, suggests to me that Indo-Iranians are most definitely part of the steppe pulse. The connection to the Sintashta charioteers presents itself, and connections to the Uralic languages indicate incubation in the trans-Volga region.

In West Asia, the Indo-Aryans crashed themselves against the most advanced civilizations of their time. Like the Bulgars, and unlike the Hittites, the Indo-Aryan Mitanni were totally absorbed by their non-Indo-European Hurrian substrate. Indo-Aryan linguistic influence was preserved in their names, their gods, and in particular words relating to chariots. And yet in 2017’s Continuity and Admixture in the Last Five Millennia of Levantine History from Ancient Canaanite and Present-Day Lebanese Genome Sequences, the authors observe:

We next tested a model of the present-day Lebanese as a mixture of Sidon_BA and any other ancient Eurasian population using qpAdm. We found that the Lebanese can be best modeled as Sidon_BA 93% ± 1.6% and a Steppe Bronze Age population 7% ± 1.6% (Figure 3C; Table S6). To estimate the time when the Steppe ancestry penetrated the Levant, we used, as above, LD-based inference and set the Lebanese as admixed test population with Natufians, Levant_N, Sidon_BA, Steppe_EMBA, and Steppe_MLBA as reference populations. We found support (p = 0.00017) for a mixture between Sidon_BA and Steppe_EMBA which has occurred around 2,950 ± 790 ya (Figure S13B).

This needs to be explored further. The admixture could have come from many sources. I am curious about the frequency of R1a1a-Z93 among modern-day Syrians and Lebanese.
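
For orientation, the LD-based date quoted above can be turned into calendar years. This is only a rough sketch: it assumes that “ya” is counted back from 2017 (the paper’s publication year) and treats the ± value as a simple symmetric interval, neither of which is stated in the passage. The function names are my own, purely for illustration.

```python
# Rough conversion of an admixture date in "years ago" (ya) to calendar years.
# Assumptions (not from the paper): "ya" is counted back from 2017, the year
# of publication, and the +/- value is treated as a simple symmetric interval.

REFERENCE_YEAR = 2017  # assumed "present"


def to_calendar_year(years_ago, reference_year=REFERENCE_YEAR):
    """Return a signed year; negative means BC (ignoring the lack of a year 0)."""
    return reference_year - years_ago


def label(year):
    """Format a signed year as a BC/AD string."""
    return f"{abs(year)} BC" if year < 0 else f"AD {year}"


estimate, error = 2950, 790  # values quoted from the paper

oldest = to_calendar_year(estimate + error)
central = to_calendar_year(estimate)
youngest = to_calendar_year(estimate - error)

print(f"central: {label(central)}; interval: {label(oldest)} to {label(youngest)}")
```

Under these assumptions the estimate centers on roughly 930 BC, with an interval running from about 1720 BC to about 140 BC. That is a wide window, which is consistent with the point that the admixture could have come from many sources.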

For me these arguments can only be resolved with a deeper understanding of linguistic evolution. The close relationship of Indo-Aryan and Iranian languages is obvious to any speaker of either of these languages (I can speak some Bengali). A divergence in the range of 4 to 5 thousand years before the present seems most likely to me. But the relationship of the other Indo-European languages is much less clear.

One of the arguments in Peter Bellwood’s First Farmers is that the Indo-European languages exhibit a “rake-like” topology with the exception of Indo-Iranian, which forms a clear clade. To him and others in his camp, this argues for deep divergences very early in time.

It is hard to deny that the steppe migrations between 4 and 5 thousand years ago had something to do with the distribution of modern Indo-European languages. But, it is harder to falsify the model that there were earlier Indo-European migrations, perhaps out of the Near East, that preceded these. Only a deeper understanding of linguistic evolution, and multidisciplinary analysis of regional substrates will generate the clarity we need.

* I’m going to skip the Botai angle in this post.

November 12, 2017

Near Prehistory in Northern Europe was an Indo-European world

Filed under: Indo-European,Indo-Europeans — Razib Khan @ 9:01 pm

The Picts were the topic of discussion this week on In Our Time. They are a mysterious yet intriguing people because we don’t know much about them in their own words, but they are one of the roots of modern Scottish identity. When I first encountered the Picts decades ago there was some debate as to whether they were a pre-Indo-European people or not. Today that seems to not be a hypothesis people entertain. Rather, the Picts were simply the least Romanized of the Brythonic Celtic people of Britain.

Today because of the genetic data I think we can be rather confident that by the time of the Roman Empire there were no non-Indo-Europeans left in Northern Europe. The Beaker people in Britain and Ireland seem to have overwhelmingly replaced the native population of farmers, whose ancestors had predominantly arrived from the eastern Mediterranean thousands of years ago (via the Atlantic littoral or Central Europe). Across Northern Europe, in general, the replacement of the previous populations was substantial, though not total.

In Southern Europe, the arrival of Indo-Europeans was more fitful, and the persistence of Basque attests to the fact that non-Indo-European languages were spoken down to historical times (if Etruscan is considered native to the Italian peninsula, that’s another example, though this is hotly debated and I lean toward the exogenous model). The pre-Latin language of Sardinia was almost certainly not Indo-European, while Greek has a high proportion of non-Indo-European words in its lexicon.

June 24, 2017

Indian genetics, part n of many

Filed under: ancient india,History,Indo-Europeans — Razib Khan @ 2:59 pm

I put up a close-to-definitive piece for me in relation to South Asian historical population genetics. At least until new research is published. I did leave out some stuff about my own vague thoughts…but I think the takeover of Hattian and Hurrian cultures by the Nesha (Hittites) and Haryannu (Mitanni) has something to teach us….

June 19, 2017

Indian genetics, the never-ending argument

Filed under: Genetics,India,Indian Genetics,Indo-Europeans,science — Razib Khan @ 10:44 pm

I am at this point somewhat fatigued by Indian population genetics. The real results are going to be ancient DNA, and I’m waiting on that. But people keep asking me about an article in Swarajya, Genetics Might Be Settling The Aryan Migration Debate, But Not How Left-Liberals Believe.

First, the article attacks me as being racist. This is not true. The reality is that the people who attack me on the Left would probably attack magazines like Swarajya as highly “problematic” and “Islamophobic.” They would label Hindu nationalism as a Nazi derivative ideology. People should be careful about the sort of allies they make; if you dance with snakes they will bite you in the end. Much of the media lies about me, and the Left constantly attacks me. I’m OK with that because I do believe that the day will come when all the ledgers will be balanced. The Far Left is an enemy of civilization of all stripes. I welcome being labeled an enemy of barbarians. My small readership, which is of diverse ideologies and professions, is aware of who I am and what I am, and that is sufficient. Either truth or power will be the ultimate arbiter of justice.

With that out of the way, there is one thing about the piece that I think is important to highlight:

To my surprise, it turned out that that Joseph had contacted Chaubey and sought his opinion for his article. Chaubey further told me he was shocked by the drift of the article that appeared eventually, and was extremely disappointed at the spin Joseph had placed on his work, and that his opinions seemed to have been selectively omitted by Joseph – a fact he let Joseph know immediately after the article was published, but to no avail.

Indeed, this itself would suggest there are very eminent geneticists who do not regard it as settled that the R1a may have entered the subcontinent from outside. Chaubey himself is one such, and is not very pleased that Joseph has not accurately presented the divergent views of scholars on the question, choosing, instead to present it as done and dusted.

I do wish Tony Joseph had quoted Gyaneshwer Chaubey’s response, and I’d like to know his opinions. Science benefits from skepticism. Unfortunately though the equivocation of science is not optimal for journalism, so oftentimes things are presented in a more stark and clear manner than perhaps is warranted. I’ve been in this position myself, when journalists are just looking for a quote that aligns with their own views. It’s frustrating.

There are many aspects of the Swarajya piece I could point out as somewhat weak. For example:

The genetic data at present resolution shows that the R1a branch present in India is a cousin clade of branches present in Europe, Central Asia, Middle East and the Caucasus; it had a common ancestry with these regions which is more than 6000 years old, but to argue that the Indian R1a branch has resulted from a migration from Central Asia, it should be derived from the Central Asian branch, which is not the case, as Chaubey pointed out.

The Srubna culture, the Scythians, and the people of the Altai today, all bear the “Indian” branch of R1a. First, these substantially post-date 6000 years ago. I think that is likely due to the fact that South Asian R1a1a-Z93 and that of the Srubna descend from a common ancestor. But in any case, the nature of the phylogeny of Z93 indicates rapid expansion and very little phylogenetic distance between the branches. Something happened 4-5,000 years ago. One could imagine simultaneous expansions in India and Central Asia/Eastern Europe. Or, one could imagine an expansion from a common ancestor around that time. The latter seems more parsimonious.

Additionally, while South Asians share ancestry with people in West Asia and Eastern Europe, these groups do not have distinctive South Asian (Ancestral South Indian) ancestry. This should weight our probabilities as to the direction of migration.

Second, I read some of the papers linked to in the article, such as “Shared and Unique Components of Human Population Structure and Genome-Wide Signals of Positive Selection in South Asia” and “Y-chromosomal sequences of diverse Indian populations and the ancestry of the Andamanese.” The first paper has good data, but I’ve always been confused by the interpretations. For example:

A few studies on mtDNA and Y-chromosome variation have interpreted their results in favor of the hypothesis [70–72], whereas others have found no genetic evidence to support it [3,6,73,74]. However, any nonmarginal migration from Central Asia to South Asia should have also introduced readily apparent signals of East Asian ancestry into India (see Figure 2B). Because this ancestry component is absent from the region, we have to conclude that if such a dispersal event nevertheless took place, it occurred before the East Asian ancestry component reached Central Asia. The demographic history of Central Asia is, however, complex, and although it has been shown that demic diffusion coupled with influx of Turkic speakers during historical times has shaped the genetic makeup of Uzbeks [75] (see also the double share of k7 yellow component in Uzbeks as compared to Turkmens and Tajiks in Figure 2B), it is not clear what was the extent of East Asian ancestry in Central Asian populations prior to these events.

Actually the historical and ancient DNA evidence both point to the fact that East Asian ancestry arrived in the last two thousand years. The spread of the first Gokturk Empire, and then the documented shift in the centuries around 1000 A.D. from Iranian to Turkic in what was Turan, signals the shift toward an East Asian genetic influx. Alexander the Great and other Greeks ventured into Central Asia. The people were described as Iranian looking (when Europeans encountered Turkic people like Khazars they did note their distinctive physical appearance).

We have ancient DNA from the Altai, and those individuals initially seemed overwhelmingly West Eurasian. Now that we have Scythian ancient DNA we see that they mixed with East Asians only on the far east of their range.

The second paper is very confused (or confusing):

The time divergence between Indian and European Y-chromosomes, based on the closest neighbour analysis, shows two different distinctive divergence times for J2 and R1a, suggesting that the European ancestry in India is much older (>10 kya) than what would be expected from a recent migration of Indo-European populations into India (~4 to 5 kya). Also the proportions suggest the effect might be less strong than generally assumed for the Indo-European migration. Interestingly, the ANI ancestry was recently suggested to be a mix of ancestries from early farmers of western Iran and people of the Bronze Age Eurasian steppe (Lazaridis et al. 2016). Our results agree with this suggestion. In addition, we also show that the divergence time of this ancestry is different, suggesting a different time to enter India.

Lazaridis et al. accept a mass migration from the steppe. In fact, the migration is of such a magnitude that even I’m skeptical. Also, there couldn’t have been a European migration to South Asia during the Pleistocene because Europeans as we understand them genetically did not exist then!!!

I assume that many of the dates of coalescence are sensitive to parameter conditions. Additionally, they admit limitations to their sampling.

Ultimately the final story will be more complex than we can imagine. R1a is too widespread to be explained by a simple Indo-Aryan migration in my opinion. But we can’t get to these genuine conundrums if we keep having to rebut ideologically motivated salvos.

Related: Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts. I wish David would be a touch more equivocal. But I have to admit, if the model fits, at some point you have to quit.

April 4, 2017

How a Eurasian “band of brothers” shaped the world

Filed under: Anthroplogy,Corded Ware,History,Indo-Europeans — Razib Khan @ 1:10 pm


When I was eight years old I saw a map which genuinely confused me. I had opened up a deluxe dictionary at my elementary school and saw a map of the world’s language families, and noticed that there was a group of dialects which spanned the Bay of Bengal to the North Sea. In fact, according to this map the language I had first learned to speak, Bengali, was in the same language family as English.

This was hard to wrap my mind around, but there it was in front of me. Further research at the public library confirmed this fact. And, upon further reflection it was obvious to me there were similarities…I had been learning French at school, and English, Bengali, and French, all exhibited similarities in the first ten numbers. English and French I understood in terms of a natural relationship, but Bengali?

My personal and professional interests have never been in domains where I would explore the topic first hand, but the origins of Indo-European languages have always been a hobby. I read books such as The Horse, the Wheel, and Language and In Search of the Indo-Europeans when I could. When taking in excellent works such as Empires of the Silk Road the Indo-European thread was always something I kept in mind.

But the above works take a more old-fashioned Eurasian heartland “marauders from the steppe” viewpoint. Starting about 15 years ago I began to look into a different framework: Indo-Europeans as farmers. For me it begins with the 2002 paper, Mapping the Origins and Expansion of the Indo-European Language Family, which finds that “the inferred timing and root location of the Indo-European language trees fit with an agricultural expansion from Anatolia beginning 8000 to 9500 years ago” (this is the last paper I can remember reading in paper format). The model is elaborated by Peter Bellwood in works such as First Farmers, though he applies it to most language families.

But its origins go back decades, with the archaeologist Colin Renfrew. Rather than dramatic explosions from the steppe, Renfrew and colleagues suggest that the demographic expansion enabled by agriculture as a mode of production allowed for groups like Indo-Europeans to rapidly swamp their neighbors and enter into a process known as a wave of advance. There wasn’t an organized movement. Rather, farming enabled the growth of population to such an extent that it was almost an undirected thermodynamic law that the original farmers would radiate outward, away from zones at the Malthusian carrying capacity and out toward virgin land.

It was a parsimonious theory, and phylogenetic techniques seem to have supported it. But then came ancient DNA to overturn the apple-cart. I won’t rehash what you probably already know, but will point to the two most relevant papers, Massive migration from the steppe was a source for Indo-European languages in Europe and Population genomics of Bronze Age Eurasia. Basically there was massive population turnover during the early Bronze Age. The genetic data aligned well with predictions you’d make from the old “marauders from the steppe” model, not the demic diffusion of farmers who were subject to high endogenous population growth over time.

Of course the Anatolian model proponents have an answer. There is a thesis whereby the steppe pastoralists derive from Anatolians, and so the European population turnover was of one Indo-European group by another. This is possible, but to my knowledge this model was never foregrounded by Anatolianists before. Rather, it strikes me as a way to “save” their framework.

So far much of the battle has been between archaeologists, who tend to favor gradualism, and often even cultural diffusion as opposed to migration, and historical linguists and arriviste geneticists, who tend toward a more classical migration-from-the-steppe perspective.

A new paper in Antiquity, Re-theorising mobility and the formation of culture and language among the Corded Ware Culture in Europe, takes a sledgehammer to the Anatolian hypothesis with an archaeology-first tack. They don’t pull punches:

…the Anatolian hypothesis must be considered largely falsified. Those Indo-European languages that later came to dominate in western Eurasia were those originating in the migrations from the Russian steppe during the third millennium BC.

Why would they say this? There is a major paper coming out:

These local processes of social integration between intruding Yamnaya/Corded Ware populations and remnant Neolithic populations can be applied to language dispersal. We should expect that the transformation from Proto-Indo-European to Pre-Proto Germanic would reveal the same kind of hybridisation between an earlier Neolithic language of the Funnel Beaker Culture, and the incoming Proto-Indo-European language. This is precisely what recent linguistic research has been able to demonstrate (Kroonen & Iversen in press). In their study on the formation of Proto-Germanic in Northern Europe, Kroonen and Iversen document a bundle of linguistic terms of non-Indo-European origin linked to agriculture that were adopted by Indo-European-speaking groups who were not fully fledged farmers.

They also contend that the Neolithic language was roughly the same throughout the zone of Indo-European expansion. From what those who would know about these sorts of things have told me this is plausible, because the Neolithic farmers spread so rapidly from a small founder culture, and exhibited broad Europe-wide similarities for a thousand years. Curiously, the chart shows that Germanic languages may have been influenced by a hunter-gatherer language, which the others were not. I suspect this may have to do with the relatively late persistence of hunter-gatherers in some maritime environments facing the Baltic and North Sea.

The paper, which is open access, needs to be read in full. Here are some important points:

  • Burial type seems to be a more robust form of indicator of dominant cultural identity
  • Corded Ware males practiced exogamy
  • Corded Ware males traveled long distances
  • Corded Ware culture was initially exclusively pastoralist
  • There is a great deal of circumstantial, and some genetic, evidence that Corded Ware communities were characterized by having women who were clearly from the Neolithic farming population
  • There was intergroup violence as a function of culture
  • The Corded Ware and Neolithic populations persisted near each other geographically, though the Neolithic groups seem to have retreated to uplands
  • The Corded Ware engaged in a wholesale pattern of landscape sculpting, burning down forests to produce pasture

Neolithic Y lineages, such as G2, are far rarer in Northern Europe today than R1a and R1b (in contrast, the hunter-gatherer I seems to have gone through an expansion just like R1a and R1b). We already have a model for what went on here, the Iberian settlement of the New World. Among mestizo populations there are huge skews of mtDNA and Y, with the former almost all Amerindian (with some African) and the latter almost all European (with some African).

The Corded Ware are the ancestors of the German peoples who we see emerge into the light of history during antiquity. What these data are telling us is that the Germans are the product of a massive period of biological and cultural amalgamation and synthesis between indigenous groups and intrusive populations from the steppe. The archaeological data indicate that the intrusion was male mediated. The “battle axe” culture probably lived up to its name. And they weren’t likely exceptional….

September 10, 2012

The West Asian mix

Filed under: Genetics,Indo-Europeans — Razib Khan @ 7:42 am

IE-speaking West Europeans are West Asian-admixed relative to Non-IE speaking Basques. Dienekes explicitly confirms what seems obvious using ADMIXTURE. When I get a chance I’m going to see if this difference is evident when comparing some South Indian (non-Brahmin) samples I have against Gujaratis. For what it’s worth I am told that ADMIXTOOLS will be out this week.

August 16, 2012

Rise of the planet of the Indo-Europeans

Filed under: Anthroplogy,History,Indo-Europeans — Razib Khan @ 9:00 am

In response to my post below a friend emailed me the above sentence. As I suggest below it sounds crazy, and I don’t know if I believe it. But here’s an abstract from the Reich lab from June:

Estimating a date of mixture of ancestral South Asian populations

Linguistic and genetic studies have demonstrated that almost all groups in South Asia today descend from a mixture of two highly divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners and Europeans, and Ancestral South Indians (ASI) not related to any populations outside the Indian subcontinent. ANI and ASI have been estimated to have diverged from a common ancestor as much as 60,000 years ago, but the date of the ANI-ASI mixture is unknown. Here we analyze data from about 60 South Asian groups to estimate that major ANI-ASI mixture occurred 1,200-4,000 years ago. Some mixture may also be older—beyond the time we can query using admixture linkage disequilibrium—since it is universal throughout the subcontinent: present in every group speaking Indo-European or Dravidian languages, in all caste levels, and in primitive tribes. After the ANI-ASI mixture that occurred within ...

July 15, 2012

Continuing the search for Indo-Europeans

Filed under: History,Indo-Europeans — Razib Khan @ 1:53 pm

Dienekes P. is often rather laconic in commentary on the papers he links to, but of late he has “come out of his shell.” He has two posts which are important “weekend reading”:

- Population strata in the West Siberian plain (Baraba forest steppe)

- Hints of East/Central Asian admixture in Northern Europe

I freely admit that much of the conjecture here is above my pay-grade in terms of evaluation. But I do think it’s important to think through. My “gut” tends to lean toward a “revenge of the Mesolithic” scenario promoted by some of Dienekes’ critics, but I don’t have a strong position.

July 3, 2012

Has Dienekes Pontikos found the signature of the Indo-Europeans?

Filed under: Anthropology,History,Indo-Europeans,Linguistics — Razib Khan @ 12:35 pm

I don’t know the answer to the question posed in the title above, and I’m moderately skeptical that he has. But I wanted to give him full credit in the public record if researchers confirm his findings in the next few years. You can read the full post at his weblog, but basically he found that a West Asian modal element in a north British (Orkney) individual and a Lithuanian individual seems to be negatively correlated with a Northwest European modal element, and positively correlated with Near Eastern and South Asian components, on a genomic level across different models in ADMIXTURE (e.g., does “South Asian” at K = 5 tend to match “West Asian” at K = 8?).
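To make the idea of components being “correlated on a genomic level” concrete, here is a minimal sketch in Python. It is not Dienekes’ actual procedure, which the post does not spell out. It assumes, hypothetically, that one individual’s genome has been cut into windows and that each window carries an estimated fraction for each component. The numbers are made up purely for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up per-window fractions for one individual (illustration only).
west_asian_k8 = [0.0, 0.1, 0.4, 0.0, 0.3]    # "West Asian" component at K = 8
south_asian_k5 = [0.0, 0.2, 0.5, 0.1, 0.3]   # "South Asian" component at K = 5
nw_european_k8 = [1.0, 0.9, 0.6, 1.0, 0.7]   # "Northwest European" at K = 8

r_same_signal = pearson(west_asian_k8, south_asian_k5)  # positive for this toy data
r_opposed = pearson(west_asian_k8, nw_european_k8)      # negative for this toy data
```

A positive value for the first pair and a negative value for the second is the kind of pattern described above. Whether such a pattern reflects real shared ancestry or an artifact of the clustering is exactly the open question.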

Two major concerns:


- I don’t have a good intuition for this method. Could this be an artifact of the algorithm?

- When you have a hypothesis in mind you can unconsciously seek out confirmatory points. As you can see in the comments below Dienekes and his interlocutors have given this issue much thought. Frankly, I found it difficult to follow a lot of the dialogue, and I follow this topic more than most.

It seems that at this point someone should do follow up analyses ...

July 1, 2012

The mystery of the origin of the Indo-Europeans may be solved within the next 2 years

Filed under: Anthropology,Indo-Europeans — Razib Khan @ 12:27 pm

Dienekes has a post up, The Bronze Age Indo-European invasion of Europe. The crux of his argument is as such:

But there is another component present in modern Europe, the West_Asian which is conspicuous in its absence in all the ancient samples so far. This component reaches its highest occurrence in the highlands of West Asia, from Anatolia and the Caucasus all the way to the Indian subcontinent. It is well represented in modern Europeans, reaching its minima in the Iberian peninsula….

Thanks to the public release of genetic data Dienekes has developed his theories in part out of his own analyses of said data. Though I’ve run fewer analyses, with smaller data sets, some of the same patterns jump out at me. In particular, there is a component which is modal in northern West Asia (e.g., the trans-Caucasian region) which seems to drop mysteriously between the French generally and French Basques, and the Basque vs. non-Basque Spanish samples. There are also similar, though not necessarily easy to map across the two regions, disjunctions in South Asia between geographically close Indian groups.


Ultimately model-based clustering algorithms and PCA are going to get us only so far. Remember that the clusters generated ...

December 16, 2011

How to reconstruct the Indo-Europeans

As must be obvious, I think now that the spread of Indo-European languages had some demographic impact. It wasn’t analogous to the spread of English to Jamaica, or the existence of French as an official language in Congo-Brazzaville. Because of this, I now believe it is possible in the near future that scientists will reconstruct the genome of the original Indo-Europeans. How?

1) Find the intersection of genetic segments on the chromosomal level which share identity-by-descent between widely separated Indo-European groups. For example, Greeks, Swedes, and Punjabis.

2) Check to see which of these intersecting elements are not found, at least at an appreciable frequency, in nearby non-Indo-European groups. For example, Basques, Finns, and non-Brahmin South Indian Dravidian speakers.

My current supposition is that proportionally this component won’t be preponderant in most places, but, it will be significant. By reconstructing an Indo-European genome we may actually have the ability to ascertain the population’s urheimat, as we can compare its genetic distance to extant populations.
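The two steps above amount to a set intersection followed by a frequency filter. Here is a toy sketch in Python under strong simplifying assumptions, all of them mine: segments have already been called as shared identical-by-descent, each is identified by hypothetical (chromosome, start, end) coordinates that match exactly across groups, and “appreciable frequency” is an arbitrary 5% cutoff. Real IBD segments overlap only partially, so an actual analysis would need interval logic rather than exact matches.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: (chromosome, start_bp, end_bp).
Segment = Tuple[str, int, int]


def candidate_ie_segments(
    ie_shared: Dict[str, Set[Segment]],
    non_ie_freq: Dict[str, Dict[Segment, float]],
    max_freq: float = 0.05,  # assumed cutoff for "appreciable frequency"
) -> Set[Segment]:
    """Segments present in every IE group and rare in every non-IE group."""
    if not ie_shared:
        return set()
    # Step 1: intersection of segments across all Indo-European groups.
    common = set.intersection(*ie_shared.values())
    # Step 2: drop segments that reach the cutoff in any non-IE group.
    return {
        seg
        for seg in common
        if all(freqs.get(seg, 0.0) < max_freq for freqs in non_ie_freq.values())
    }


# Made-up segments and frequencies, for illustration only.
seg_a: Segment = ("chr1", 1_000_000, 1_500_000)
seg_b: Segment = ("chr2", 4_000_000, 4_800_000)
seg_c: Segment = ("chr5", 9_000_000, 9_200_000)

ie_groups = {
    "Greeks": {seg_a, seg_b, seg_c},
    "Swedes": {seg_a, seg_b},
    "Punjabis": {seg_a, seg_b},
}
non_ie_groups = {
    "Basques": {seg_b: 0.30},  # seg_b is common here, so it is excluded
    "Finns": {seg_a: 0.01},    # seg_a is present but below the cutoff
}

candidates = candidate_ie_segments(ie_groups, non_ie_groups)
```

In this toy data only `seg_a` survives both steps: `seg_c` is missing from two of the Indo-European groups, and `seg_b` is common among the Basques.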

June 27, 2011

First Farmers Facing the Ocean

The image above is adapted from the 2010 paper A Predominantly Neolithic Origin for European Paternal Lineages, and it shows the frequencies of Y chromosomal haplogroup R1b1b2 across Europe. As you can see, as you approach the Atlantic the frequency converges upon ~100%. Interestingly, the fraction of R1b1b2 is highest among populations such as the Basque and the Welsh. This was taken by some researchers in the late 1990s and early 2000s as evidence that the Welsh adopted a Celtic language, prior to which they spoke a dialect distantly related to Basque. Additionally, the assumption was that the Basques were the ur-Europeans: descendants of the Paleolithic populations of the continent both biologically and culturally, so that the peculiar aspects of the Basque language were attributed by some to its ancient Stone Age origins.

As indicated by the title, the above paper overturned such assumptions, and rather implied that the origin of the R1b1b2 haplogroup was in the Near East, and associated with the expansion of Middle Eastern farmers from the eastern Mediterranean toward western Europe ~10,000 years ago. Instead of the high frequency of R1b1b2 being a confident peg for the ...

March 27, 2010

The science of human history as written by Herodotus

The following passage is from the epilogue of The Real Eve: Modern Man’s Journey Out of Africa by Stephen Oppenheimer:

In this book I have offered a synthesis of genetic and other evidence. Everything points to a single southern exodus from Eritrea to the Yemen, and to all the non-African male and female gene lines having arisen from their respective single out-of-Africa founder lines in South Asia (or at least near the southern exit). I regard the genetic logic for this synthesis as a solid foundation, and I have based the rest of my reconstruction of the human diaspora upon it. Obviously, the ‘choice’ of starting point (mine or theirs) determined all the subsequent routes our ancestors and cousins took. Tracing the onward trails is only possible as a result of marked specificity in regional distribution of the genetic branches. The geographic clarity of both male and female gene trees is a big departure from the fuzzy inter-regional picture shown by older genetic studies. The degree of segregation of lines into different countries and continents is in itself good evidence that once they got to their chosen new homes, the pioneers generally stayed put, at least until the Last Glacial maximum forced some of them to move. This conservative aspect of our genetic prehistory also provides a partial explanation for the fact that when we look at a person, we can usually tell, to the continent, where their immediate ancestors came from, and underlies differences that some of us still call ‘race.’

Oppenheimer wrote the above in the early aughts, as his book was published in 2003. Much of this is generally in line with the ‘orthodoxy’ of the day. I believe that Oppenheimer’s assertion that there was one southern migration out of Africa by anatomically modern humans has gained some advantage over the alternative model of two routes, northern and southern, over the past ten years (Spencer Wells’ The Journey of Man sketches out the two wave model). Other assertions and assumptions have not stood the test of time. In particular, I would contend that generally the ‘conservative aspect of our genetic prehistory’ can no longer be taken for granted. Specifically, it seems likely now that much occurred after the Ice Age and during the Neolithic.


The false inferences of the early aughts were due to two primary problems. First, they relied heavily on the powerful new techniques of extraction and analysis of uniparental lineages: the male and female direct lines of descent. Concretely, mtDNA and the nonrecombinant portion of the Y chromosome. The lack of recombination allows for relatively easy reconstruction of phylogenies assuming a coalescent model. Second, the inferences attempt to make connections between the patterns of variation in modern populations, and what one may infer about the past from those patterns. Obviously constructing a phylogeny, or plotting haplogroup frequencies as a function of geography, is rather straightforward science. But using these results to generate inferences of the past is often more of an art than a science, and implicit assumptions lurk behind the causal chains. Consider for example the utilization of modern Anatolian (i.e., Turkish) genetic variation as a reference for the expansion into Europe of Neolithic farmers from the Near East. This of course presumes that modern Anatolians are a good proxy for ancient Anatolians. There are various suggestive reasons for why this is a plausible assumption, but assemble enough plausible assumptions, and rely on their joint likelihood, and you construct a very rickety machinery of possibility.

In early 2007 I began to have serious doubts about the orthodoxy of genetic conservatism. The primary trigger was the story of the Etruscans. Here is the crux of the issue: there are two models for the origins of the Etruscans, first, that they were the pre-Indo-European autochthons of Italy, or second, that they were migrants from the eastern Mediterranean, in particular Anatolia. The second may seem an outlandish hypothesis, but there were several tendrils of evidence to support it. But perhaps the ‘support’ which weighed most against it is the fact that the Anatolian model has an ancient source, the Greek historian Herodotus. I should perhaps put historian in quotes as well, because Herodotus is often viewed more as a repeater of myths, and derided by some as the ‘father of lies’ (in this he stands in sharp contrast to contemporary perceptions of the ‘modern’ Thucydides, though revisionists have begun to challenge this narrative). In contrast, the model that Etruscans are indigenous to Italy, and that their ‘exotic’ foreign traits were simply acquired through trade and cultural diffusion, dovetailed well with the post-World War II ‘pots not peoples’ paradigm: that cultural change was ubiquitous, while at the same time populations were immobile. It was boring, prosaic, and conservative, and so an ideal null hypothesis.

But here it turns out that Herodotus was right, and archaeologists were wrong. Genetic analysis of modern Tuscans from isolated villages shows that some are surprisingly closely related to extant eastern Mediterranean lineages. Genetic analysis of Tuscan cattle showed that they were surprisingly closely related to extant eastern Mediterranean lineages of cattle. Finally, extraction of ancient Etruscan DNA showed that they were closely related to extant eastern Mediterranean lineages. The overlap was often with Anatolia, and combined with fragmentary linguistic and archaeological data, the evidence clearly points to an exogenous origin for the Etruscans. The boring null hypothesis was wrong. After these genetic stories gained prominence I went and reread recent archaeological texts on the Etruscans, and there were many models which showed exactly how Etruscan cultural uniqueness derived back to prehistoric Italy. It seems in hindsight that the prior assumption served as an interpretative filter, and people saw patterns that they were primed to see based on what they ‘knew’ to be the history of prehistoric and early Iron Age Tuscany.

Of course to refute the primacy of Oppenheimer’s conservative model of genetics one has to offer more examples than that of the Etruscans, and in particular, examples which are of greater scope and weight. I believe those examples exist. In the early aughts based on the mtDNA evidence the likelihood was that South Asian genetic variation is by and large a product of changes wrought upon the basic elements extant in the region around the end of the last Ice Age. The Y chromosomal data was more confused, though it did imply a closer relationship to groups in western Eurasia. But based on the mtDNA Oppenheimer posited a model whereby India was the mother of all non-Africans, that is, all non-African lineages derived from roots within the Indian subcontinent before the Last Glacial Maximum. This is at sharp variance with colonialist narratives of an Aryan invasion of the subcontinent, and the subjugation of the natives by quasi-European overlords, who are the ancestors of the modern upper castes. The charged ideological import of this model is transparently obvious.

Unfortunately the reality is likely more complex. I suspect that some form of Oppenheimer’s model is correct, insofar as South Asia was likely an important way station for modern humans as they left Africa, and pushed into other regions of Eurasia, on to Australasia and the New World. This interpretation does gain support from mtDNA, the direct maternal lineage. But a new analysis of South Asian genetic variation using a substantial proportion of the autosomal genome implies in fact that South Asians are possibly descendants of an ancient hybridization event between a native population with deep roots in the subcontinent, and a quasi-European population which was exogenous to the subcontinent.* Genetically the quasi-European population is quite close to northern Europeans, similar to the genetic distance between modern Finns and Italians, not trivial, but far closer than that between modern South Asians and Europeans. Was this the ancient Aryan invasion? I remain skeptical of this particular detail for various reasons, as I suspect that the history of the Indian subcontinent is in fact even more complex than has been assumed before (I think it is more likely that the quasi-Europeans came before the Indo-Aryans, who arrived late, and had a stronger cultural than genetic influence).

Finally, there is another region of the world where it seems likely that the old orthodoxies of genetic conservatism will be overthrown. That region is Europe. The scientific orthodoxy of deep time continuity is strong enough that it has percolated into the public consciousness, the leader of the British National Party even referred to the deep roots of white British in demarcating who he believed ‘indigenous people’ of the Isles were. But newer data is more supportive of the hypothesis that in fact Neolithic farmers who arrived from elsewhere are the likely ancestors of most Europeans, not the hunter-gatherers who remained after the Ice Age. Extraction of ancient DNA has yielded a set of results which simply are not explicable assuming the older models of genetic continuity, which were based on inferences made from modern population variation. If I had to hazard a guess, I would have some, though not high, confidence in the following story. First, the indigenous hunter-gatherers are assimilated or marginalized by waves of Neolithic farmers pushing out from the eastern Mediterranean. The demographic expansion does not necessarily sweep outward along a southeast-northwest axis, rather, it follows the Mediterranean and Atlantic fringes, as well as along river systems in the interior. Its impact is weakest in the northeast of Europe, where Middle Eastern crops are least suitable, and the natives have the most time to absorb the cultural toolkit of the newcomers so as to resist their advance. Second, and far later, there was another wave pushing out from the region of the Ukraine to the Volga, likely the ancestors of the Indo-Europeans. Tentatively I would contend that these were the carriers of the Kurgan culture, and also brought the allele for lactase persistence. 
Again, for ecological reasons the populations of the northeast Baltic and into the forests of northern Russia were most insulated from this push (and non-Indo-European languages persisted in Iberia down to Roman times, and specifically in the Basque-country down to modern times, though I suspect this is a function of distance). So modern European populations may be assumed to be tri-hybrid: first a synthesis of Middle Eastern farmers overlain upon the Paleolithic substrate, and second a synthesis of Indo-Europeans from the east overlain upon the pre-Indo-European substrate. Unlike the case of India, I suspect teasing out these patterns in modern populations is more difficult because the genetic distance between the three ancestral populations is far smaller than that between the indigenous peoples of India and the quasi-Europeans who arrived there.

This leaves much of the world untouched by my speculations, but I believe showing that the genetically conservative null hypothesis is now in serious doubt in South Asia and Europe is sufficient to knock it from being a necessarily default assumption through which we must filter our interpretations. I do not believe that the reordering of human variation and the welter of population movement after the Ice Age was equivalent in effect to the Out of Africa migration, but I do believe that it was important enough to make the world of 2000 BCE very different from that of 15000 BCE in regards to genetic variation. In some cases, such as Central Asia from the Caspian to the Taklamakan the world of 2000 CE is fundamentally different from the world of 0 CE.

I will then end with a prediction, one in which I do not have much confidence, but which may no longer be wrong on the face of it with these new data in mind. Here is a passage from page 7 of Jared Diamond’s Guns, Germs, and Steel:

Initially, archaeologists considered the possibility that the colonization of Australia/New Guinea was achieved accidentally by just a few people swept to sea while fishing on a raft near an Indonesian island. In an extreme scenario the first settlers are pictured as having consisted of a single pregnant young woman carrying a male fetus…..

Let me stipulate that Diamond seems skeptical of the extreme model, but it illustrates the consensus that Australian Aboriginal populations are descended from the first settlers. That is, the modern populations of indigenous Australians are the direct descendants of those who swept Out of Africa along the fringe of the Indian ocean, through Southeast Asia, and arrived in Australia (more specifically, Sahul), on the order of 40 to 60 thousand years ago. From what genetic data I have seen this may be true. But I do not know of any extractions of ancient DNA, and it seems to me that the analysis of the phylogenetics of Australian Aboriginals is relatively sketchy. Therefore, I will suggest that within the last 10,000 years there has been a major new migration of people into Australia, and the modern range of genetic variation of Australian Aboriginals is significantly different from that of the populations of the Ice Age. I suggest this primarily because the dingo arrived within the last 10,000 years, more likely as recently as 4,000 years ago. With the expansion of the utility of ancient DNA extraction and analysis this question may be answered in the near future. I would still bet I’m wrong with the hypothesis I just offered, but I’m far less sure than I would have been 2 years ago.

Note: This post emerged from a conversation I had with Kevin Zelnio and Dave Munger.

* I say ‘quasi-European’ because the population may have origins outside of the boundaries of modern Europe at the Urals. Perhaps in western Siberia. Additionally, the idea of ‘Europe’ is relatively new, and exhibits little ancient cultural coherency.

Image source: Wikipedia
