Razib Khan One-stop-shopping for all of my content

September 10, 2012

The West Asian mix

Filed under: Genetics,Indo-Europeans — Razib Khan @ 7:42 am

IE-speaking West Europeans are West Asian-admixed relative to non-IE-speaking Basques. Dienekes explicitly confirms what seems obvious using ADMIXTURE. When I get a chance I’m going to see if this difference is evident when comparing some South Indian (non-Brahmin) samples I have against Gujaratis. For what it’s worth I am told that ADMIXTOOLS will be out this week.

August 16, 2012

Rise of the planet of the Indo-Europeans

Filed under: Anthropology,History,Indo-Europeans — Razib Khan @ 9:00 am

In response to my post below a friend emailed me the above sentence. As I suggest below it sounds crazy, and I don’t know if I believe it. But here’s an abstract from the Reich lab from June:

Estimating a date of mixture of ancestral South Asian populations

Linguistic and genetic studies have demonstrated that almost all groups in South Asia today descend from a mixture of two highly divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners and Europeans, and Ancestral South Indians (ASI) not related to any populations outside the Indian subcontinent. ANI and ASI have been estimated to have diverged from a common ancestor as much as 60,000 years ago, but the date of the ANI-ASI mixture is unknown. Here we analyze data from about 60 South Asian groups to estimate that major ANI-ASI mixture occurred 1,200-4,000 years ago. Some mixture may also be older—beyond the time we can query using admixture linkage disequilibrium—since it is universal throughout the subcontinent: present in every group speaking Indo-European or Dravidian languages, in all caste levels, and in primitive tribes. After the ANI-ASI mixture that occurred within ...

July 15, 2012

Continuing the search for Indo-Europeans

Filed under: History,Indo-Europeans — Razib Khan @ 1:53 pm

Dienekes P. is often rather laconic in commentary on the papers he links to, but of late he has “come out of his shell.” He has two posts which are important “weekend reading”:

- Population strata in the West Siberian plain (Baraba forest steppe)

- Hints of East/Central Asian admixture in Northern Europe

I freely admit that much of the conjecture here is above my pay-grade in terms of evaluation. But I do think it’s important to think through. My “gut” tends to lean toward a “revenge of the Mesolithic” scenario promoted by some of Dienekes’ critics, but I don’t have a strong position.

July 3, 2012

Has Dienekes Pontikos found the signature of the Indo-Europeans?

Filed under: Anthropology,History,Indo-Europeans,Linguistics — Razib Khan @ 12:35 pm

I don’t know the answer to the question posed in the title above, and I’m moderately skeptical that he has. But I wanted to give him full credit in the public record if researchers confirm his findings in the next few years. You can read the full post at his weblog, but basically he found that a West Asian modal element in a north British (Orkney) and a Lithuanian individual seems to be negatively correlated with a Northwest European modal element and positively correlated with Near Eastern and South Asian components on a genomic level across different models in ADMIXTURE (e.g., does “South Asian” at K = 5 tend to match “West Asian” at K = 8?).
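To make “correlated on a genomic level” concrete, here is a toy sketch. It is not Dienekes’ pipeline, and ADMIXTURE itself reports genome-wide proportions rather than per-window ones; the per-window values below are invented purely for illustration, and the names (pearson, west_asian_k8, and so on) are hypothetical. The idea is simply to take one individual, assume each genomic window has been assigned a proportion of each component, and ask whether two components rise and fall together across windows.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Invented per-window component proportions for a single individual.
# Each position is one genomic window; these are NOT real results.
west_asian_k8 = [0.10, 0.40, 0.05, 0.30, 0.20]
nw_european_k8 = [0.80, 0.45, 0.90, 0.60, 0.70]
south_asian_k5 = [0.05, 0.30, 0.02, 0.20, 0.12]

# Negative: windows rich in "West Asian" are poor in "Northwest European".
r_wa_nwe = pearson(west_asian_k8, nw_european_k8)
# Positive: "South Asian" at K = 5 tracks "West Asian" at K = 8.
r_wa_sa = pearson(west_asian_k8, south_asian_k5)
print(r_wa_nwe < 0, r_wa_sa > 0)
```

With real data the hard part is everything this sketch assumes away: how windows are defined, how components are matched across values of K, and whether the correlation survives a sensible null model.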

Two major concerns:


- I don’t have a good intuition for this method. Could this be an artifact of the algorithm?

- When you have a hypothesis in mind you can unconsciously seek out confirmatory points. As you can see in the comments below Dienekes and his interlocutors have given this issue much thought. Frankly, I found it difficult to follow a lot of the dialogue, and I follow this topic more than most.

It seems that at this point someone should do follow up analyses ...

July 1, 2012

The mystery of the origin of the Indo-Europeans may be solved within the next 2 years

Filed under: Anthropology,Indo-Europeans — Razib Khan @ 12:27 pm

Dienekes has a post up, The Bronze Age Indo-European invasion of Europe. The crux of his argument is as such:

But there is another component present in modern Europe, the West_Asian which is conspicuous in its absence in all the ancient samples so far. This component reaches its highest occurrence in the highlands of West Asia, from Anatolia and the Caucasus all the way to the Indian subcontinent. It is well represented in modern Europeans, reaching its minima in the Iberian peninsula….

Thanks to the public release of genetic data Dienekes has developed his theories in part out of his own analyses of said data. Though I’ve run fewer analyses, with smaller data sets, some of the same patterns jump out at me. In particular, there is a component which is modal in northern West Asia (e.g., the trans-Caucasian region) which seems to drop mysteriously between the French generally and French Basques, and the Basque vs. non-Basque Spanish samples. There are also similar, though not necessarily easy to map across the two regions, disjunctions in South Asia between geographically close Indian groups.


Ultimately model-based clustering algorithms and PCA are going to get us only so far. Remember that the clusters generated ...

December 16, 2011

How to reconstruct the Indo-Europeans

As must be obvious, I think now that the spread of Indo-European languages had some demographic impact. It wasn’t analogous to the spread of English to Jamaica, or the existence of French as an official language in Congo-Brazzaville. Because of this, I now believe it is possible in the near future that scientists will reconstruct the genome of the original Indo-Europeans. How?

1) Find the intersection of genetic segments on the chromosomal level which share identity-by-descent between widely separated Indo-European groups. For example, Greeks, Swedes, and Punjabis.

2) Check to see which of these intersecting elements are not found, at least at an appreciable frequency, in nearby non-Indo-European groups. For example, Basques, Finns, and non-Brahmin South Indian Dravidian speakers.
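The two steps above amount to a set intersection followed by a filter. Here is a minimal sketch under loud assumptions: real identity-by-descent segments are not tidy labels, the segment IDs and frequencies below are invented, and both the function name candidate_ie_segments and the 5% cutoff for “appreciable frequency” are hypothetical choices of mine rather than anything established.

```python
def candidate_ie_segments(ie_groups, non_ie_groups, max_freq=0.05):
    """Return segment IDs shared by every IE group and rare in non-IE groups.

    ie_groups: dict of group name -> set of IBD segment IDs observed in it
    non_ie_groups: dict of group name -> dict of segment ID -> frequency
    max_freq: hypothetical cutoff for "appreciable frequency"
    """
    # Step 1: keep only segments present in all of the widely separated
    # Indo-European groups.
    shared = set.intersection(*ie_groups.values())
    # Step 2: drop any segment that reaches an appreciable frequency in at
    # least one nearby non-Indo-European group.
    return {
        seg
        for seg in shared
        if all(freqs.get(seg, 0.0) <= max_freq for freqs in non_ie_groups.values())
    }

# Invented toy data: segment IDs and frequencies are not real results.
ie = {
    "Greeks": {"s1", "s2", "s3", "s5"},
    "Swedes": {"s1", "s2", "s4", "s5"},
    "Punjabis": {"s1", "s2", "s5", "s6"},
}
non_ie = {
    "Basques": {"s2": 0.30, "s5": 0.01},
    "Finns": {"s4": 0.20},
    "South Indian Dravidian speakers": {"s2": 0.10, "s6": 0.25},
}

print(sorted(candidate_ie_segments(ie, non_ie)))  # ['s1', 's5']
```

In the toy data s2 is shared by all three Indo-European groups but is common among Basques, so it is dropped; s5 survives because it only appears in Basques at 1%, below the cutoff.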

My current supposition is that proportionally this component won’t be preponderant in most places, but it will be significant. By reconstructing an Indo-European genome we may actually have the ability to ascertain the population’s urheimat, as we can compare its genetic distance to extant populations.

June 27, 2011

First Farmers Facing the Ocean

The image above is adapted from the 2010 paper A Predominantly Neolithic Origin for European Paternal Lineages, and it shows the frequencies of Y chromosomal haplogroup R1b1b2 across Europe. As you can see, as you approach the Atlantic the frequency converges upon ~100%. Interestingly the fraction of R1b1b2 is highest among populations such as the Basque and the Welsh. This was taken by some researchers in the late 1990s and early 2000s as evidence that the Welsh adopted a Celtic language, prior to which they spoke a dialect distantly related to Basque. Additionally, the assumption was that the Basques were the ur-Europeans: descendants of the Paleolithic populations of the continent both biologically and culturally, so that the peculiar aspects of the Basque language were attributed by some to its ancient Stone Age origins.

As indicated by the title the above paper overturned such assumptions, and rather implied that the origin of R1b1b2 haplogroup was in the Near East, and associated with the expansion of Middle Eastern farmers from the eastern Mediterranean toward western Europe ~10,000 years ago. Instead of the high frequency of R1b1b2 being a confident peg for the ...

March 27, 2010

The science of human history as written by Herodotus

The following passage is from the epilogue of The Real Eve: Modern Man’s Journey Out of Africa by Stephen Oppenheimer:

In this book I have offered a synthesis of genetic and other evidence. Everything points to a single southern exodus from Eritrea to the Yemen, and to all the non-African male and female gene lines having arisen from their respective single out-of-Africa founder lines in South Asia (or at least near the southern exit). I regard the genetic logic for this synthesis as a solid foundation, and I have based the rest of my reconstruction of the human diaspora upon it. Obviously, the ‘choice’ of starting point (mine or theirs) determined all the subsequent routes our ancestors and cousins took. Tracing the onward trails is only possible as a result of marked specificity in regional distribution of the genetic branches. The geographic clarity of both male and female gene trees is a big departure from the fuzzy inter-regional picture shown by older genetic studies. The degree of segregation of lines into different countries and continents is in itself good evidence that once they got to their chosen new homes, the pioneers generally stayed put, at least until the Last Glacial maximum forced some of them to move. This conservative aspect of our genetic prehistory also provides a partial explanation for the fact that when we look at a person, we can usually tell, to the continent, where their immediate ancestors came from, and underlies differences that some of us still call ‘race.’

Oppenheimer wrote the above in the early aughts, as his book was published in 2003. Much of this is generally in line with the ‘orthodoxy’ of the day. I believe that Oppenheimer’s assertion that there was one southern migration out of Africa by anatomically modern humans has gained some advantage over the alternative model of two routes, northern and southern, over the past ten years (Spencer Wells’ The Journey of Man sketches out the two wave model). Other assertions and assumptions have not stood the test of time. In particular, I would contend that generally the ‘conservative aspect of our genetic prehistory’ can no longer be taken for granted. Specifically, it seems likely now that much occurred after the Ice Age and during the Neolithic.


The false inferences of the early aughts were due to two primary problems. First, they relied heavily on the powerful new techniques of extraction and analysis of uniparental lineages; the male and female direct lines of descent. Concretely, mtDNA and the nonrecombinant portion of the Y chromosome. The lack of recombination allows for relatively easy reconstruction of phylogenies assuming a coalescent model. Second, the inferences attempt to make connections between the patterns of variation in modern populations, and what one may infer about the past from those patterns. Obviously constructing a phylogeny, or plotting haplogroup frequencies as a function of geography, is rather straightforward science. But using these results to generate inferences of the past is often more of an art than a science, and implicit assumptions lurk behind the causal chains. Consider for example the utilization of modern Anatolian (i.e., Turkish) genetic variation as a reference for the expansion into Europe of Neolithic farmers from the Near East. This of course presumes that modern Anatolians are a good proxy for ancient Anatolians. There are various suggestive reasons for why this is a plausible assumption, but assemble enough plausible assumptions, and rely on their joint likelihood, and you construct a very rickety machinery of possibility.

In early 2007 I began to have serious doubts about the orthodoxy of genetic conservatism. The primary trigger was the story of the Etruscans. Here is the crux of the issue: there are two models for the origins of the Etruscans: first, that they were the pre-Indo-European autochthons of Italy; second, that they were migrants from the eastern Mediterranean, in particular Anatolia. The second may seem an outlandish hypothesis, but there were several tendrils of evidence to support it. But perhaps the ‘support’ which weighed most against it is the fact that the Anatolian model has an ancient source, the Greek historian Herodotus. I should perhaps put historian in quotes as well, because Herodotus is often viewed more as a repeater of myths, and derided by some as the ‘father of lies’ (in this he stands in sharp contrast to contemporary perceptions of the ‘modern’ Thucydides, though revisionists have begun to challenge this narrative). In contrast, the model that Etruscans are indigenous to Italy, and that their ‘exotic’ foreign traits were simply acquired through trade and cultural diffusion, dovetailed well with the post-World War II ‘pots not peoples’ paradigm, which held that cultural change was ubiquitous while populations were immobile. It was boring, prosaic, and conservative, and so an ideal null hypothesis.

But here it turns out that Herodotus was right, and archaeologists were wrong. Genetic analysis of modern Tuscans from isolated villages shows that some are surprisingly closely related to extant eastern Mediterranean lineages. Genetic analysis of Tuscan cattle showed that they were surprisingly closely related to extant eastern Mediterranean lineages of cattle. Finally, extraction of ancient Etruscan DNA showed that they were closely related to extant eastern Mediterranean lineages. The overlap was often with Anatolia, and combined with fragmentary linguistic and archaeological data, the evidence clearly points to an exogenous origin for the Etruscans. The boring null hypothesis was wrong. After these genetic stories gained prominence I went and reread recent archaeological texts on the Etruscans, and there were many models which showed exactly how Etruscan cultural uniqueness derived back to prehistoric Italy. It seems in hindsight that the prior assumption served as an interpretative filter, and people saw patterns that they were primed to see based on what they ‘knew’ to be the history of prehistoric and early Iron Age Tuscany.

Of course to refute the primacy of Oppenheimer’s conservative model of genetics one has to offer more examples than that of the Etruscans, and in particular, examples which are of greater scope and weight. I believe those examples exist. In the early aughts based on the mtDNA evidence the likelihood was that South Asian genetic variation is by and large a product of changes wrought upon the basic elements extant in the region around the end of the last Ice Age. The Y chromosomal data was more confused, though it did imply a closer relationship to groups in western Eurasia. But based on the mtDNA Oppenheimer posited a model whereby India was the mother of all non-Africans, that is, all non-African lineages derived from roots within the Indian subcontinent before the Last Glacial Maximum. This is at sharp variance with colonialist narratives of an Aryan invasion of the subcontinent, and the subjugation of the natives by quasi-European overlords, who are the ancestors of the modern upper castes. The charged ideological import of this model is transparently obvious.

Unfortunately the reality is likely more complex. I suspect that some form of Oppenheimer’s model is correct, insofar as South Asia was likely an important way station for modern humans as they left Africa, and pushed into other regions of Eurasia, on to Australasia and the New World. This interpretation does gain support from mtDNA, the direct maternal lineage. But a new analysis of South Asian genetic variation using a substantial proportion of the autosomal genome implies in fact that South Asians are possibly descendants of an ancient hybridization event between a native population with deep roots in the subcontinent, and a quasi-European population which was exogenous to the subcontinent.* Genetically the quasi-European population is quite close to northern Europeans, similar to the genetic distance between modern Finns and Italians, not trivial, but far closer than that between modern South Asians and Europeans. Was this the ancient Aryan invasion? I remain skeptical of this particular detail for various reasons, as I suspect that the history of the Indian subcontinent is in fact even more complex than has been assumed before (I think it is more likely that the quasi-Europeans came before the Indo-Aryans, who arrived late, and had a stronger cultural than genetic influence).

Finally, there is another region of the world where it seems likely that the old orthodoxies of genetic conservatism will be overthrown. That region is Europe. The scientific orthodoxy of deep time continuity is strong enough that it has percolated into the public consciousness; the leader of the British National Party even referred to the deep roots of white British in demarcating who he believed the ‘indigenous people’ of the Isles were. But newer data is more supportive of the hypothesis that in fact Neolithic farmers who arrived from elsewhere are the likely ancestors of most Europeans, not the hunter-gatherers who remained after the Ice Age. Extraction of ancient DNA has yielded a set of results which simply are not explicable assuming the older models of genetic continuity, which were based on inferences made from modern population variation. If I had to hazard a guess, I would have some, though not high, confidence in the following story. First, the indigenous hunter-gatherers are assimilated or marginalized by waves of Neolithic farmers pushing out from the eastern Mediterranean. The demographic expansion does not necessarily sweep outward along a southeast-northwest axis, rather, it follows the Mediterranean and Atlantic fringes, as well as along river systems in the interior. Its impact is weakest in the northeast of Europe, where Middle Eastern crops are least suitable, and the natives have the most time to absorb the cultural toolkit of the newcomers so as to resist their advance. Second, and far later, there was another wave pushing out from the region of the Ukraine to the Volga, likely the ancestors of the Indo-Europeans. Tentatively I would contend that these were the carriers of the Kurgan culture, and also brought the allele for lactase persistence. Again, for ecological reasons the populations of the northeast Baltic and into the forests of northern Russia were most insulated from this push (and non-Indo-European languages persisted in Iberia down to Roman times, and specifically in the Basque-country down to modern times, though I suspect this is a function of distance). So modern European populations may be assumed to be tri-hybrid: first a synthesis of Middle Eastern farmers overlain upon the Paleolithic substrate, and second a synthesis of Indo-Europeans from the east overlain upon pre-Indo-European substrate. Unlike the case of India I suspect teasing out these patterns in modern populations is more difficult because the genetic distance between the three ancestral populations is far smaller than that between the indigenous peoples of India and the quasi-Europeans who arrived there.

This leaves much of the world untouched by my speculations, but I believe showing that the genetically conservative null hypothesis is now in serious doubt in South Asia and Europe is sufficient to knock it from being a necessarily default assumption through which we must filter our interpretations. I do not believe that the reordering of human variation and the welter of population movement after the Ice Age was equivalent in effect to the Out of Africa migration, but I do believe that it was important enough to make the world of 2000 BCE very different from that of 15000 BCE in regards to genetic variation. In some cases, such as Central Asia from the Caspian to the Taklamakan, the world of 2000 CE is fundamentally different from the world of 0 CE.

I will then end with a prediction, one in which I do not have much confidence, but which may no longer be wrong on the face of it with these new data in mind. Here is a passage from page 7 of Jared Diamond’s Guns, Germs, and Steel:

Initially, archaeologists considered the possibility that the colonization of Australia/New Guinea was achieved accidentally by just a few people swept to sea while fishing on a raft near an Indonesian island. In an extreme scenario the first settlers are pictured as having consisted of a single pregnant young woman carrying a male fetus…..

Let me stipulate that Diamond seems skeptical of the extreme model, but it illustrates the consensus that Australian Aboriginal populations are descended from the first settlers. That is, the modern populations of indigenous Australians are the direct descendants of those who swept Out of Africa along the fringe of the Indian ocean, through Southeast Asia, and arrived in Australia (more specifically, Sahul), on the order of 40 to 60 thousand years ago. From what genetic data I have seen this may be true. But I do not know of any extractions of ancient DNA, and it seems to me that the analysis of the phylogenetics of Australian Aboriginals is relatively sketchy. Therefore, I will suggest that within the last 10,000 years there has been a major new migration of people into Australia, and the modern range of genetic variation of Australian Aboriginals is significantly different from that of the populations of the Ice Age. I suggest this primarily because the dingo arrived within the last 10,000 years, more likely as recently as 4,000 years ago. With the expansion of the utility of ancient DNA extraction and analysis this question may be answered in the near future. I would still bet I’m wrong with the hypothesis I just offered, but I’m far less sure than I would have been 2 years ago.

Note: This post emerged from a conversation I had with Kevin Zelnio and Dave Munger.

* I say ‘quasi-European’ because the population may have origins outside of the boundaries of modern Europe at the Urals. Perhaps in western Siberia. Additionally, the idea of ‘Europe’ is relatively new, and exhibits little ancient cultural coherency.

Image source: Wikipedia
