Razib Khan One-stop-shopping for all of my content

May 23, 2019

The Jomon contributed little to the Japanese

Filed under: Historical Population Genetics,Japan — Razib Khan @ 5:00 pm

A few months ago there was a preprint with an ancient Japanese genome, Jomon genome sheds light on East Asian population history. I read it but didn’t say anything at the time. I read it again, partly because I’m reading a history of Korea where the Wa, the early Japanese, show up to intervene in mainland affairs. This cameo made me think more deeply about what happened in Japan several thousand years ago.

The above genome comes from Honshu, and dates to 2,500 years before the present. And yet it’s quite different from modern Japanese! Here is the abstract:

Anatomical modern humans reached East Asia by >40,000 years ago (kya). However, key questions still remain elusive with regard to the route(s) and the number of wave(s) in the dispersal into East Eurasia. Ancient genomes at the edge of East Eurasia may shed light on the detail picture of peopling to East Eurasia. Here, we analyze the whole-genome sequence of a 2.5 kya individual (IK002) characterized with a typical Jomon culture that started in the Japanese archipelago >16 kya. The phylogenetic analyses support multiple waves of migration, with IK002 forming a lineage basal to the rest of the ancient/present-day East Eurasians examined, likely to represent some of the earliest-wave migrants who went north toward East Asia from Southeast Asia. Furthermore, IK002 has the extra genetic affinity with the indigenous Taiwan aborigines, which may support a coastal route of the Jomon-ancestry migration from Southeast Asia to the Japanese archipelago. This study highlight the power of ancient genomics with the isolated population to provide new insights into complex history in East Eurasia.

Let me be frank: the term basal probably is somewhat misleading. Ancient DNA in East Asia is in its infancy. We don’t really know what’s going on, and what went down. But I can offer a guess.

I think in the next few years we will realize that there was a massive demographic expansion out of a few agricultural hearths in the eastern regions of what we now call China during the early Holocene. The Sino-Tibetan, Austro-Asiatic, and Austronesian peoples are derived from this series of expansions. But with hindsight, I think we will see that the peoples to the north and east of China proper are also downstream of this agricultural revolution.

Within China proper a secondary expansion occurred, dating to the rise of the Chinese civilization. This ‘erased’ a lot of the variation in southern China, and a quasi-panmixia was enforced through the Chinese dynastic cycles, as northerners moved south and vice versa.

In the first figure, the Treemix panel shows that only ~3% of the modern Japanese ancestry needs to be accounted for by an edge from this Jomon genome. This implies that there was massive population replacement of the Jomon by the Yayoi people, from southern proto-Korea. I say “proto-Korea,” because the origins of the ancestors of the modern Korean culture seem to be located much further north, in what is the border region of Manchuria and North Korea today. The Yayoi are quite possibly derived from one of the indigenous peoples of the southern Korean peninsula, who were assimilated by the expanding Koreans as they moved southward around 0 A.D.

Within the preprint, the authors seem to converge on two facts. First, the Ancestral North Eurasian (ANE) admixture into much of East and Southeast Asia was minimal but much more substantial in Siberia. This is entirely plausible, though I think there need to be more ancient East Asians besides Tianyuan to conclude there isn’t some basal fraction. Second, the authors suggest that shared drift between Hoabinhian hunter-gatherers in ancient Southeast Asia and the Jomon individual confirms the likely southern origin of modern-day East Asians. I suspect the inference is correct, though I don’t know that the shared drift tells us that much without more samples.

In the preprint, the authors observe that there is a much stronger affinity to the Jomon sample among coastal East Asians than among interior peoples. I don’t know what to say about this, except that it seems likely to me that coastal pre-agricultural populations would have had greater numbers than interior peoples.

May 16, 2019

What might have happened to earlier modern humans

Filed under: Historical Population Genetics — Razib Khan @ 2:06 am

A few years ago a paper came out, Ancient gene flow from early modern humans into Eastern Neanderthals, which reported evidence, in the Altai Neanderthal genome, of ancestry related to the modern humans who evolved in Africa. There has also been evidence that ancestral Neanderthals had their mtDNA lineages replaced by African lineages more than 200,000 years ago. To my knowledge there isn’t much evidence of modern admixture into Denisovans…but we don’t have that many Denisovan genomes, and the ones we have are from a single location.

After thinking about it, it seems likely that there was a complex sequence of interactions and gene flow between structured Eurasian and African humans over the last million years. The gene flow may have occurred in pulses, and unevenly. There are some suggestive modern human sites before 60,000 years ago in parts of Asia. The timing here is important since this threshold is when modern humans as we know them really began expanding.

So this is the verbal model:

  • Several modern human populations sampled from the ancient African structured modern human population contributed genetically to different archaic Eurasians at different times.
  • Some of these modern populations were the ones that left remains or tools which are identified as modern, but “too old.”
  • These populations were absorbed into larger archaic Eurasian hominin populations.
  • Later, these archaic Eurasian hominins were absorbed into the last modern wave out of Africa, but the ancestry of the earlier moderns was by then so well mixed into the archaics that it is seen as part of the archaic background.

May 6, 2019

The emergence of Han identity as autochthonous

Filed under: China,Historical Population Genetics,Sino-Tibetan — Razib Khan @ 10:00 pm

Reader Matt points me to two new papers on the linguistic phylogenetics of the Sino-Tibetan language family, Dated language phylogenies shed light on the ancestry of Sino-Tibetan and Phylogenetic evidence for Sino-Tibetan origin in northern China in the Late Neolithic. You should read Matt’s whole comment, but one thing he mentions is that by ~3,000 years ago, individuals who were genetically similar to modern Burmese were already present in the territory of modern Burma. Burmese are quite distinct from Cambodians or Vietnamese because there is a distinct “northern” element, which perhaps resembles Tibetans.*

Matt observes that this means the expansion of agriculture into Southeast Asia occurred through a few pulses in rapid succession, rather than gradually over time, as seems likely the case in Europe and South Asia (“Early European Farmers” to Corded Ware, or Iranian agriculturalists to Central Asian agro-pastoralists). Austro-Asiatic speaking groups pushed out from the highlands of southern China 4,000-4,500 years ago. Meanwhile, people from further north seem to have pushed into the western portion of upland Southeast Asia 500 to 1,000 years after this. Further east, Austronesians were sweeping along the coast and expanding into the maritime fringe.

But I am not intending to talk about Southeast Asia. Rather, I want to focus on China. Or perhaps more precisely the region and cultures that became China. Both the above papers suggest that the diversification of the Sino-Tibetan languages occurred ~7,000 years ago. And, that they began expanding from the zone of inland China, the upper Yellow River basin, from the area occupied by the Yangshao culture. This would explain some peculiar genetic facts. First, the northern affinities, mentioned above, of Tibeto-Burman groups in northeast India and in Burma itself (which might otherwise require later Tai migrations). But second, about ten years ago when the early work on EPAS1 and high altitude adaptation was done on Tibetans, their genetic relatedness to Han Chinese was surprisingly close! In fact, some estimates of divergence put it as recently as 3,000 years before the present (I think this was an underestimate, but it gets at the qualitative result).

Remember, this was still before the massive swell of publications which have transformed our understanding of the ubiquity of migration, admixture, and demographic expansion in the recent past across the Holocene. Today I suspect the best model to explain these affinities is that the people of modern greater Tibet descend in large part from the “Yangshao Diaspora,” and amalgamated with local peoples, whether indigenous hunter-gatherers of the plateau region or Indo-Europeans moving in from the west.

I’ll spare you the Bayesian phylogenetics, and the cautions about relying too much on lexicons (see a discussion on The Insight for more). But one interesting aspect of the trees generated by both papers is that the Chinese language tends to be basal in its position in relation to most of the other languages, and the region of China itself is relatively low in linguistic diversity. The basal position and the presence of core words which are associated with the millet zone of north-central China indicate that frontier Tibeto-Burman groups separated long ago from the ancestors of the Chinese dialects. The low linguistic diversity of China can be understood as a consequence of the Chinese Empire, which underwent demographic expansion from the Yellow River basin, and assimilated peripheral peoples over time (some of the southeast Chinese dialects suggest substrate influences).

Which brings me to the Yangshao culture and its centrality to the main stem of East Asian history. It flourished until 3000 BC. It was succeeded by the Longshan culture, which persisted until 1900 BC. Finally, the last great prehistoric culture of the region is the Erlitou. Due to both the oracle bones and various astronomical events associated with events recorded in Chinese history, history as such begins in the centuries before 1000 BC.

As you likely know, much of China proper was not Sinicized until after the fall of the Han dynasty. Deep into historical time the Yangzi river valley and Sichuan were culturally liminal to the orbit of the expanding Chinese civilization, but not of it (perhaps like early Macedonia, influenced by the more complex civilization, but still distinct and barbaric). Further south the territory had more in common ethnolinguistically with much of Southeast Asia. The Sinicization of the south that occurred in the period between 0 and 1000 AD was not just cultural, but also demographic. Genetic variation among the modern Han is quite a bit lower than many might expect. The Han of provinces such as Guangdong in the far southeast are not closer to the Vietnamese than they are to the Han of the Yellow River plain, despite geographic proximity to the former (some of this might simply be due to homogenizing gene flow back and forth).

An aspect of East Asian historical demography and genetics that stands out to me is that in China itself there isn’t a very strong signature of admixture between distinct lineages of the kind that you see in Europe or South Asia (or even West Asia). By this, I mean the fact that the Uygurs are about 50/50 West and East Eurasian. Or that South Asians are mixed between West Eurasians and an ancient indigenous lineage. Or, that the disparate West Eurasian ancestors of Europeans, and the Basal Eurasian component, were all quite distinct before being threaded together. True, many northern Han have detectable West Eurasian ancestry (5% or less), but I think this can be attributed to Turkic and Mongolic peoples, who have higher fractions of this ancestry (probably from Indo-Europeans that they absorbed).

Modern Chinese show much more affinity to the Devil’s Gate samples from the northeastern border with Russia dated to 7,700 years ago than modern Western Europeans do with people present 7,700 years ago (or modern Indians would with people of a similar date). This may illustrate the particular geographic advantages of the upper Yellow River basin over 4,000 years, from 3000 BC down to 1000 AD, when both Chinese economic and cultural power shifted to the Yangzi river valley**. The arrival of the light chariot after 2000 BC attests to contacts with the west, but the genetic imprint has always been relatively minor. Contrast this to the vast steppe zone between Pannonia and the Altai, where there were multiple reflux events as peoples migrated in both directions at various periods.

China is different from the other nations of the Eurasian oikoumene. The “Rimland.” Some of this goes very deep, and probably the best understanding has to involve a consideration of the physical and human geography of the region, and its relative isolation from broader forces in Eurasian history.

Many peoples claim to be autochthons. The ancient Athenians for example. But the people of the Han civilization developing in the centuries after 1000 BC in the Yellow River basin could likely make a more plausible case of being descendants of local hunter-gatherers who were “always there”, and eventually settled down to farm.

* One of the reasons I am convinced that Bengalis have Tibeto-Burman, and not simply Austro-Asiatic, East Asian ancestry is that this northern component can be seen in modern Bengalis, who are about 5-20% East Asian in ancestry.

** The economic power had shifted by 700 AD, but Xian in the north remained the capital until the fall of the Tang.

March 20, 2019

The population turnover in westernmost Europe over the last 8,000 years

Filed under: Historical Population Genetics,Spain — Razib Khan @ 7:53 pm

The figure above is from The genomic history of the Iberian Peninsula over the past 8000 years. If you had seen something like this five years ago, you’d be gobsmacked. But today this is not atypical, especially in light of the fact that Spain seems to harbor many good sites in relation to the preservation of ancient DNA. In the figure above you see an excellent representation of the different streams of ancestry and settlement within Spain over the last 8,000 years. You can conclude from it, for example, that only a small proportion of the ancestry of modern Spaniards derives from people who were residents of the peninsula during the Pleistocene. Similarly, you can also conclude that a minority, though non-trivial, proportion of the ancestry of modern-day Spaniards derives from people who arrived during Classical Antiquity and the Moorish period.

And, confirming earlier work, the Basques seem to be relatively untouched by these later gene flow events. To some extent, we all knew that, as the Basques were famously exempt from limpieza de sangre, the blood purity laws of medieval Spain. But importantly, the Basques have a substantial amount of ancestry from peoples whose heritage goes back to Central Europe, and to a great extent, the forest-steppe of far eastern Europe. This is a huge change from what was understood fifteen years ago. As the Basques speak a clearly non-Indo-European language, many scholars hypothesized that they were remnants of hunter-gatherer peoples, who had been resident in the Iberian peninsula since the Pleistocene.

But the reality is that the origin of the Basques is likely in the arrival of Near Eastern farmers. The Basques share a strong genetic affinity with the peoples of Sardinia, who are the closest proxies in modern European populations for this group. Importantly, the Basque difference from Sardinians is their much greater proportion of Central European/steppe-like ancestry. How did they get this ancestry?

One of the major results of this paper is that a particular branch of R1b came to dominate Spain around 4,000 years ago. Before this period the dominant Y chromosomal lineages in the Iberian peninsula were those associated with the farmer populations. The frequency of R1b is above 80% in Basque males. This is one reason that earlier scholarship assumed that R1b was associated with European hunter-gatherers (the Basque being the descendants of those people). Today, we know that both branches of R1 seem to have expanded ~4,000 years ago and that the most common lineages in western and southern Eurasia seem to go back to the steppe peoples.

It may be that the Basque language actually derives from the steppe as non-Indo-European peoples expanded along with the Indo-Europeans, adopting similar cultural habits and characteristics. This is not a crazy position. The Magyars, for example, are not Turkic or Indo-European, but they adopted a lifestyle associated earlier and simultaneously with Turkic and Indo-European pastoralists. But let’s set this possibility aside. Another option is that the Basques descend from one of the post-Cardial cultures of southwest Europe. That is, their language has roots in the dialects of the early Anatolian farmers. Unlike other peoples, they absorbed the influx of Indo-Europeans, and culturally assimilated them.

This too is not crazy. But how might they have absorbed the Indo-Europeans? In the paper above they tentatively argue, from some of their results, that the Indo-European influx was more male than female. There are suggestions that Basque society may have had matrilineal aspects. This does not entail that they were “matriarchal,” but rather, that inheritance passed through the maternal line. Matrilineal societies are not necessarily pacific. The Iroquois are a case in point. And, they have a natural way of assimilating warbands of alien males: these men could become integrated into the preexistent kinship networks.

How might the rise of R1b lineages have occurred so fast? One could posit that young men with Indo-European fathers may have had connections to hostile Indo-European tribes that their cousins with non-Indo-European fathers lacked. If the Indo-Europeans were patrilineal, as seems likely, and the proto-Basques were matrilineal, then these men would have been well placed to better protect the cultural integrity and political independence of their maternal heritage through connections of their paternal lineage.

I have an explicit model here: the intermarriage of European trappers in the American West with native women. In many cases, the children of these men would be raised within a native context, and so served as a bridge of sorts. And, there is another analogy: the frequency of R1a is quite high in some non-Indo-European groups in South Asia. It will turn out, I believe, that Southern Europe and India share many similarities, as the Indo-Europeans encountered people in these regions with rich and complex societies.

Several years ago, A recent bottleneck of Y chromosome diversity coincides with a global change in culture, was published. The authors note there was an explosive growth of several Y chromosomal lineages, including R1b and R1a, on the order of 4,000 years ago. Recently the evolutionary anthropologist Joe Henrich stated that “Religion is a technology for scaling up human societies.” With this in mind, I will state here that patriarchy is a technology for swallowing up human societies. The distribution of Y chromosomal lineages associated with early Indo-Europeans extends outside of the boundaries of Indo-European languages. In fact, the expansion of I1, concordant with R1b, suggests that non-Indo-European lineages were assimilated into expanding Indo-European groups.

There is, of course, a debate whether this expansion was violent or not. I suggest above a way in which Indo-European lineages, at least by origin, could become pervasive in a non-Indo-European society. But, it does seem more plausible that more direct forms of marginalization were likely. In a pre-modern environment not far from the Malthusian limit it wouldn’t take much for certain male lineages to replace themselves, and for others to die out. The descent from antiquity project in Europe is difficult because there does seem to have been an elite paternal lineage rupture with the fall of Rome. Many modern noble families are traceable to the centuries after the fall of Rome, but none of them clearly are linked to before the fall of Rome. This does not mean that there was a massacre of those lineages, but that elite lineages which lost their rents would quickly lose their status.
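A toy simulation makes the point about the Malthusian limit concrete. The sketch below is illustrative only and is not taken from any of the papers mentioned: it assumes a constant number of males, non-overlapping generations, and a hypothetical `advantage` parameter giving one paternal lineage a small edge in leaving sons. The function name `simulate_lineages` and all the default values are made up for the example.

```python
import random


def simulate_lineages(n_lineages=20, males_per_lineage=50, advantage=0.05,
                      generations=200, seed=1):
    """Resample paternal lineages each generation at constant population size.

    Lineage 0 gets a small reproductive edge (`advantage`); the others are
    neutral. Returns the number of males carrying each lineage at the end.
    """
    rng = random.Random(seed)
    counts = [males_per_lineage] * n_lineages
    total = sum(counts)  # population size is held fixed (the Malthusian cap)
    per_male_weight = [1.0 + advantage] + [1.0] * (n_lineages - 1)

    for _ in range(generations):
        # Each son picks a father; a lineage's chance of being picked is
        # proportional to its current count times its per-male weight.
        weights = [c * w for c, w in zip(counts, per_male_weight)]
        fathers = rng.choices(range(n_lineages), weights=weights, k=total)
        counts = [0] * n_lineages
        for lineage in fathers:
            counts[lineage] += 1
    return counts


if __name__ == "__main__":
    final = simulate_lineages()
    print("males per lineage:", final)
    print("lineages extinct:", sum(1 for c in final if c == 0))
```

Varying `advantage`, `generations`, and `seed` shows how quickly a fixed-size population can lose paternal lineages through chance plus a modest edge, with no massacre required. It is a cartoon of the argument, not a fitted demographic model.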

I do think what we call war was part of the expansion. But war was likely simply one of the many manifestations of the rise to power of these bands of brothers.

February 23, 2019

“Out of Africa” in 2019

Filed under: Historical Population Genetics,Out-of-Africa,PremiumPost — Razib Khan @ 9:32 pm

The figure to the left is from Paleolithic DNA from the Caucasus reveals core of West Eurasian ancestry. It is a graph which captures general features of human population historical relationships as we understand them today. Or at least the model fits the data (remember, many models may fit the data!). The graph is complex…but even within the text of the preprint, the authors admit that it is characterized by simplifying assumptions, which nevertheless are informative of some general dynamics and processes (e.g., pulse admixtures).

To some extent, the whole last generation or so has been characterized by the victory of a simplifying assumption that captures general truths about the past, with the accumulation of modifications on the margins as more nuanced results enter the picture. The simplifying assumption I am talking about here is the “out of Africa” 50,000 years ago with a total replacement of all other human lineages framework.

By the last quarter of the 20th century, a combination of archaeological and genetic evidence pointed to the likelihood of a massive bottleneck and expansion of humans outside of Africa in the relatively recent past. In the pre-genomic era, the tools were coarse, from uniparental lineages, classical markers, microsatellites, morphometric analyses, as well as archaeological surveys. But, they strongly pointed to massive expansion and population turnover ~50,000 years ago. This, combined with a line of thinking which suggested that Neanderthals were “evolutionary dead-ends” led to the thesis that there was a total replacement.

To a great extent, this model seems to hold up in the broad sketch. But not to an absolute and total degree. Some paleoanthropologists and geneticists were pointing out for decades that the tools we had could not exclude the possibility of admixture at lower fractions with earlier lineages in Eurasia on purely statistical grounds. These scholars were correct, as it turns out. There is now high confidence that in the range of 1-5% of the ancestry of non-Africans derives from highly diverged “archaic” lineages, Neanderthals and Denisovans. The fraction is low enough that more coarse methods did not definitively pick them up, and without ancient genomes, the “game of inference” was not dispositive in either direction. This, despite the fact that these Eurasian hominins’ ancestors seem to have diverged from those of modern humans ~750,000 years ago. Ultimately, scientists needed a physical ancient genome which they could compare to modern populations to come to this conclusion (before the Denisovan result, scientists had been noticing anomalies in Oceanian data for a decade or so but generally ignored it as beneath comment…a presentation was given at an anthropology conference on archaic admixture in Oceania right before the Denisova cave paper).

The second major issue is that the massive expansion and bottleneck that occurred ~50,000 years ago may not explain all of the remaining ancestry that is not “archaic.” That is, there were many modern human lineages present 50,000 years ago. The major lacuna in the current model is a huge one: populations within Sub-Saharan Africa maintained larger population sizes throughout this event. And, anatomically modern humans predate this expansion by hundreds of thousands of years. From an archaeological perspective, a lower limit is 200,000 years ago, and an upper limit probably exceeds 300,000 years ago. Additionally, there are “deep lineages” within Africa which clearly predate the expansion 50,000 years ago. There is a strong consensus that the Khoisan people have at least some substantial ancestry that diverged more than 150,000 years ago from other humans, and tentative results from several different research groups suggest that there are even more “basal” (deep divergence) lineages in parts of West Africa than the component within the Khoisan.

This does not even address the likelihood that some “archaic” ancestry persists within Sub-Saharan Africa just as it does outside of Sub-Saharan Africa.

The third issue is diverged anatomically modern humans in eastern Eurasia. By this, I mean lineages which are in a clade with anatomically modern humans rather than Neanderthals and Denisovans, but which split off from other non-Africans before the massive expansion of ~50,000 years before the present. There are two major points here. First, the circumstantial evidence that these people existed is very strong now. We know this from archaeological sites in Southeast Asia and possibly China. And, there is genomic evidence that a lineage closer to modern humans contributed ancestry to Altai Neanderthals ~100,000 years ago. Second, there is some suggestive, though highly disputed, evidence of low levels of earlier-than-50,000-years-ago modern human ancestry in Papuans. The fractions are low, just as they are with Neanderthals. A major problem with detecting diverged lineages at low fractions without ancient DNA is that the statistical power is not there to be definitive. These people are far closer to the dominant “50K expansion” group than Neanderthals and Denisovans.

Remember, without ancient genomes, many geneticists would probably be skeptical to this day about Neanderthal admixture, though there had been suggestions from the genome-wide data of archaic admixture by the middle years of the 2000s. The Neanderthal genome obviously changed our priors. But from what I can tell, when we are talking about low levels of admixture and lack ancient genomes, we don’t have much ability to differentiate between the possible models which fit the data in a manner which resolves disagreements.

Finally, backing up to the figure at the top of this post: notice that there is now a model with two nestings of “basal” populations in relation to the major non-African ancestry component that expanded ~50,000 years ago. That is, there are famous “Basal Eurasians”, but Lazaridis et al. are proposing a “Basal North African” clade, which diverged from Basal Eurasians & the major Eurasian components before the latter two split apart. Lazaridis et al. also propose ~10 percent of Yoruba ancestry is from the Basal North African clade. There has been a lot of talk for years that Yoruba was an unfortunate choice for “unmixed Sub-Saharan African”, because there are clearly some Neanderthal alleles in this population, indicative of some “back-migration.” But in fact, from modern populations, it would probably be impossible to pick an “unmixed Sub-Saharan African” because there aren’t any.

As I make clear in my previous post, there is a fair amount of evidence that modern Sub-Saharan Africans have been impacted by gene flow from outside of Sub-Saharan Africa. Or at least a population which is somehow closely connected to the ancestors of non-Africans. A substantial proportion of this is probably due to the major modern human expansion that dates to 50,000 years ago. But, some of it probably predates this period, and, some of it postdates this period. If Lazaridis et al. are correct that there is Basal North African admixture in West Africa, then it may actually predate the migration 50,000 years ago because this lineage separated in the period between 50,000 and 100,000 years ago (of course the admixture could have been later) from other North African-West Asians. And, we happen to know of at least one Holocene era migration into this region of Africa that was prehistoric.

Circa 2010 we had a simple story. The Omo find in Ethiopia was the earliest anatomically modern human, dated to 200,000 years ago. Around 50,000 years ago anatomically modern humans left East Africa and replaced everyone else, in Africa, and outside of Africa. Recent human evolution proceeded from Africa, to the Near East, and then outward in all directions.

I believe this story is not fundamentally flawed in a broad sense. If we ever discover ancient DNA that dates to before 50,000 years ago in East Africa, I think we will find that most modern East Africans share substantial, perhaps even preponderant, ancestry from the expansion that dates to 50,000 years ago. But, the high diversity of extant Sub-Saharan Africans and the suggestions of deeply basal lineages through statistical inference indicate that “deep structure” of modern humans within Africa was not erased by migration from the north or east, but that the combination of the two led to the emergence of African lineages as part of the dynamic process of admixture ~50,000 years ago.

If Basal Eurasians and Basal North Africans are discovered, they too are part of the story of “deep structure.” And, the earlier modern lineages that were present in eastern Eurasia may also be part of the “deep structure,” though population turnover may also have erased that imprint by the present.

Twenty years ago many scholars conceived of a model where a very small founding population in East Africa expanded and replaced all other humans in “blitzkrieg” fashion. There was massive radiation into Eurasia and Oceania ~50,000 years ago from a small ancestral population, while sister modern lineages in Africa were also replaced. This is still evident in indications of a massive bottleneck, and shared Neanderthal ancestry, in groups from Northwest Eurasia to the New World to Australia. But, there was also a lot of older “deep structure” that was absorbed and integrated. Some of this is easy to pick up because it was so different. Neanderthals. Denisovans. But some of them, like the Basal Eurasians and Basal North Africans, were likely part of the broad family of modern human lineages that were developing in concert for hundreds of thousands of years, a great fanlike phylogenetic tree descended from common ancestors. Further south in Africa there were probably other modern groups, whose phylogenetic relatedness would be a function of their distance to the proto-North African-West Asians. Out of the broad radiation of “modern humans” one group contributed disproportionately to the ancestry of most humans today. More in East Asia than in Europe, and more in Europe than in the Near East, and more in the Near East than Africa.

As we are in a stage of greater complexification, errors and mistakes will be made. The evaluation of the models is only as good as the data we have. More data will come.

Addendum: I have not specified where the dominant modern signal is coming from. I think the candidates are probably the Levant, Arabia, all of Northern Africa, and Eastern Africa. Without far more ancient DNA in these regions we may never know.

January 21, 2019

David Reich strikes back!

Filed under: David Reich,Historical Population Genetics — Razib Khan @ 8:03 pm
David Reich submits Five Corrections to The New York Times. As you know, in the fact-checking process I was sent more than 100 statements of which a very high proportion (more than half) were incorrect. For example, as I mentioned to you in my letter of January 7, 20 of 49 statements presented to me […]

January 20, 2019

David Reich drops the mic

Filed under: David Reich,Historical Population Genetics — Razib Khan @ 12:04 pm
Didn’t mean to post so much about that crappy piece in The New York Times Magazine. But there’s so much tendentious crap in it. That being said, I am probably not going to post much more on this, because David Reich’s response is up: Letter in response to Jan. 17 article in The New York Times […]

January 17, 2019

The ancient DNA oligopoly and the stories people tell about David Reich

Filed under: David Reich,Historical Population Genetics — Razib Khan @ 1:16 pm
There is a very long piece in The New York Times Magazine, Is Ancient DNA Research Revealing New Truths — or Falling Into Old Traps?. It’s the talk of DNA-Twitter for obvious reasons. The very fact that you have a long piece in The New York Times Magazine on this topic means that David Reich is […]

October 22, 2018

The phylogenetic trees falling on the tundra

Filed under: Historical Population Genetics,Population genetics,Siberia — Razib Khan @ 9:59 pm

A massive new ancient DNA preprint just dropped, The population history of northeastern Siberia since the Pleistocene:

…Here, we report 34 ancient genome sequences, including two from fragmented milk teeth found at the ~31.6 thousand-year-old (kya) Yana RHS site, the earliest and northernmost Pleistocene human remains found. These genomes reveal complex patterns of past population admixture and replacement events throughout northeastern Siberia, with evidence for at least three large-scale human migrations into the region. The first inhabitants, a previously unknown population of “Ancient North Siberians” (ANS), represented by Yana RHS, diverged ~38 kya from Western Eurasians, soon after the latter split from East Asians. Between 20 and 11 kya, the ANS population was largely replaced by peoples with ancestry from East Asia, giving rise to ancestral Native Americans and “Ancient Paleosiberians” (AP), represented by a 9.8 kya skeleton from Kolyma River. AP are closely related to the Siberian ancestors of Native Americans, and ancestral to contemporary communities such as Koryaks and Itelmen. Paleoclimatic modelling shows evidence for a refuge during the last glacial maximum (LGM) in southeastern Beringia, suggesting Beringia as a possible location for the admixture forming both ancestral Native Americans and AP. Between 11 and 4 kya, AP were in turn largely replaced by another group of peoples with ancestry from East Asia, the “Neosiberians” from which many contemporary Siberians derive. We detect additional gene flow events in both directions across the Bering Strait during this time, influencing the genetic composition of Inuit, as well as Na Dene-speaking Northern Native Americans, whose Siberian-related ancestry components is closely related to AP. Our analyses reveal that the population history of northeastern Siberia was highly dynamic, starting in the Late Pleistocene and continuing well into the Late Holocene. 
The pattern observed in northeastern Siberia, with earlier, once widespread populations being replaced by distinct peoples, seems to have taken place across northern Eurasia, as far west as Scandinavia.

The preprint is very interesting and thorough, and the supplements are well over 100 pages. I read the genetics and linguistics portions. They make for some deep reading, and I really regret making fun of Iosif Lazaridis’ fondness for acronyms now.

I will make some cursory and general observations. First, the authors got really high coverage (so high quality) genomes from the Yana RHS site. Notice that they’re doing more data-intense analytic methods. Second, they did not find any population with the affinities to Australo-Melanesian that several research groups have found among some Amazonians. Likely they are hiding somewhere…but the ancient DNA sampling is getting pretty good. We’re missing something. Third, I am not sure what to think about the very rapid bifurcation of lineages we’re seeing around ~40,000 years ago.

The ANS population, ancestral by and large to ANE, seems to be about ~75% West Eurasian (without much Basal Eurasian) and ~25% East Eurasian. Or at least that’s one model. Did they then absorb other peoples? Or, was there an ancient population structure in the primal ur-human horde pushing out of the Near East? That is, are the “West Eurasians” and “East Eurasians” simply the descendants of original human tribes venturing out of Africa ~50,000 years ago? Also, rather than discrete West Eurasian and East Eurasian components, perhaps there was a genetic cline where the proto-ANS occupied a position closer to the former, as opposed to some later pulse admixture?

Without more ancient DNA we probably won’t be able to resolve the various alternative models.

October 3, 2018

Nomads, cosmopolitan predators, and peasants, xenophobic producers

Ten years ago when I read Peter Heather’s Empires and Barbarians, its thesis that the migrations and conquests of the post-Roman period were at least in part folk wanderings, where men, women, and children swarmed into the collapsing Empire en masse, was somewhat edgy. Today Heather’s model has to a large extent been validated. The recent paper on the Lombard migration, the discovery that the Lombards were indeed by and large genetically coherent as a transplanted German tribe in Pannonia and later northern Italy, confirms the older views which Heather attempted to resurrect. Additionally, the Lombards also seem to have been defined by a dominant group of elite male lineages.

Why is this even surprising? Because to a great extent, the ethnic and tribal character of the post-Roman power transfer between Late Antique elites and the newcomers was diminished and dismissed for decades. I can still remember the moment in 2010 when I was browsing books on Late Antiquity at Foyles in London and opened to a page in a monograph devoted to the society of the Vandal kingdom in North Africa. The author explained that though the Vandals were defined by a particular set of cultural codes and mores, they were to a great extent an ad hoc group of mercenaries and refugees, whose ethnic identity emerged de novo on the post-Roman landscape.

In the next few years, we will probably get Vandal DNA from North Africa. I predict that they will be notably German (though with admixture, especially as time progresses). Additionally, I predict most of the males will be haplogroup R1b or I1. But the Vandal kingdom was actually one where there was a secondary group of barbarians: the Alans. It was Regnum Vandalorum et Alanorum. I predict that Alan males will be R1a. In particular, R1a1a-Z93.

But this post is not about the post-Roman world. Rather, it’s about the Inner Asian forest steppe. The sea of grass, stretching from the Altai to the Carpathians. A new paper in Science adds more samples to the story of the Srubna, Cimmerians, Scythians, and Sarmatians. Ancient genomes suggest the eastern Pontic-Caspian steppe as the source of western Iron Age nomads. The abstract is weirdly nonspecific, though accurate:

For millennia, the Pontic-Caspian steppe was a connector between the Eurasian steppe and Europe. In this scene, multidirectional and sequential movements of different populations may have occurred, including those of the Eurasian steppe nomads. We sequenced 35 genomes (low to medium coverage) of Bronze Age individuals (Srubnaya-Alakulskaya) and Iron Age nomads (Cimmerians, Scythians, and Sarmatians) that represent four distinct cultural entities corresponding to the chronological sequence of cultural complexes in the region. Our results suggest that, despite genetic links among these peoples, no group can be considered a direct ancestor of the subsequent group. The nomadic populations were heterogeneous and carried genetic affinities with populations from several other regions including the Far East and the southern Urals. We found evidence of a stable shared genetic signature, making the eastern Pontic-Caspian steppe a likely source of western nomadic groups.

The German groups which invaded the Western Roman Empire were agropastoralists. That is, they were slash and burn farmers who raised livestock. Though they were mobile, they were not nomads of the open steppe. Man for man the Germans of Late Antiquity had more skills applicable to the military life than the Roman peasant. This explains in part their representation in the Roman armed forces in large numbers starting in the 3rd century. But the people of the steppe, pure nomads, were even more fearsome. Ask the Goths about the Huns.

Whole German tribes, like the Cimbri, might coordinate for a singular migration for new territory, but for the exclusive pastoralist, their whole existence was migration. Groups such as the Goths and Vandals might settle down, and become primary producers again, but pure pastoralists probably required some natural level of predation and extortion upon settled peoples to obtain a lifestyle beyond marginal subsistence. Which is to say that some of the characterizations of Late Antique barbarians as ad hoc configurations might apply more to steppe hordes.

There has been enough work on these populations over the past few years to admit that various groups have different genetic characteristics, indicative of a somewhat delimited breeding population. But, invariably there are outliers here and there, and indications of periodic reversals of migration and interactions with populations from other parts of Eurasia.

Earlier I noted that Heather seems to have been correct that the barbarian invasions of the Roman Empire were events that involved the migration of women and children, as well as men. The steppe was probably a bit different. Here are the Y and mtDNA results for males from these data that are new to this paper:

Culture             MtDNA Haplogroup  Y Haplogroup
Late Sarmatian      U5b2b             R1b1a1a2?
Scythian            U5a2a1            R1b1a1a2?
Late Sarmatian      D4q               R1b1a1a2
Scythian            J2b1a6            R1b1a1a2
Scythian            U5a1a1            R1b1a1a2
Scythian            U5b2a3            R1b1a1a2
Scythian            U4*               R1b1a1a2
Scythian            U5a2b             R1b1a1a2
Cimmerian           H9a               R1b1a
Srubno-alakulskaya  T2a1              R1a1a1?
Srubno-alakulskaya  J1c3a             R1a1a1
Srubno-alakulskaya  H                 R1a1a1
Srubno-alakulskaya  HV0a              R1a1a1
Srubno-alakulskaya  U5a1              R1a1a1
Srubno-alakulskaya  HV0a              R1a1a1
Late Sarmatian      T1a1              R1a1a
Cimmerian           C5c (50%)         Q1a1

I’m assuming you aren’t surprised. These steppe tribes seem to be defined by extended paternal lineage networks. The Srubna people are R1a1a1, as is dominant in Eastern Europe today. But, an ancient Srubna male dating to 1800 BC was found to have the Asian variant of R1a1a1, found in South and Central Asia, not the one predominant among Slavic peoples.

Speaking of South Asians, there is some interesting discussion on this issue in the paper. I’ll quote a few sections:

The Bronze Age Srubnaya-Alakulskaya individuals from Kazburun 1/Muradym 8 presented genetic similarities to the previously published Srubnaya individuals. However, in f4 statistics, they shared more drift with representatives of the Andronovo and Afanasievo populations compared to the published Srubnaya individuals. Those apparently West Eurasian people lacked significant Siberian components (NEA and SEA) in ADMIXTURE analyses but carried traces of the SA component that could represent an earlier connection to ancient Bactria. The presence of an SA component (as well as finding of metals imported from Tien Shan Mountains in Muradym 8) could therefore reflect a connection to the complex networks of the nomadic transmigration patterns characteristic of seasonal steppe population movements….

There are two ways, not exclusive, that I can explain the “South Asian” component you find in some of the steppe individuals. First, the “South Asian” component is found in the Neolithic Iranian sample. And, you can see in another plot that the Scythians are enriched for West Asian ancestry in comparison to the Srubna. As noted above there was probably south to north migration of these Indo-European nomadic groups. So yes, just as with the East Asian ancestry which periodically appears, this is evidence of an “Inner Asian International.”

A second possibility though is that the South Asian ancestry is artifactual, and that it’s just emerging in ADMIXTURE because the Srubna and South Asians share ancestry due to gene flow from the steppe into South Asia (and since South Asians have “Iranian farmer” ancestry it also pops up in the Iranian Neolithic sample).

The Srubna flourished between the 18th and 12th centuries BC. According to Wikipedia:

Philological and linguistic evidence indicates that the bulk of the Rigveda Samhita was composed in the northwestern region of the Indian subcontinent, most likely between c. 1500 and 1200 BC.

Mitanni Indo-Aryan is attested in Syria in 1380 BC.

In the centuries around 1500 BC it seems quite possible that there was an “Indo-Aryan Inner Asian International”, just as in the first millennium AD there emerged a Turkic International, and for more than a century after 1200 AD there was a Mongol International. In the north, the Indo-Aryans were absorbed by Iranian and Uralic peoples. In West Asia they didn’t have a major cultural impact, aside from introducing chariots. It is in India by happenstance that Indo-Aryan linguistic culture and aspects of their folk memory are preserved to this day.

This isn’t that amazing. Half of the speakers of Turkic languages are ethnic Turks, who live in Turkey. Anatolia genetically isn’t really very East Asian, though there is some of that. But the cultural heritage of the ancient Turks remains stronger there than in areas anciently inhabited by Turks, such as western Mongolia (where the people are genetically more like the original Turks were in the first millennium AD).

What’s the upshot here? I think that there is a spectrum of passivity and xenophobia in the modes of production outlined above. Sedentary peasant peoples are the most conservative and xenophobic. They are also the least warlike because their skill set is the least transferable to warfare. They specialize in production, not extortion.

Pure nomads are the least xenophobic and most open to various forms of cultural innovation. The Mongol horde rapidly expanded in the decades of Genghis Khan’s rule through assimilation of various Turkic and Tungusic peoples. Though Genghis Khan put his sons by his first wife Borte in all the major positions, competent individuals outside of his own family line were elevated to power and authority. We have enough evidence now that these social dynamics are also strongly driven by the reality of migrating males, who marry a variety of conquered peoples.

Though Mongols were religiously tolerant and relatively accepting of ethnic diversity so long as subordinate peoples did not rebel, they were fundamentally an extortive order where organized mass violence was always the weapon of first resort. They were almost certainly not atypical, but continuing an Inner Asian tradition which probably dates to the Bronze Age, and matured 1,000 years later with groups like the Scythians.

Agropastoralists, such as the people of Northern Europe during antiquity, were probably somewhere in between peasants and nomads. Not as xenophobic as peasants, but definitely more inward looking than the steppe nomads.

September 27, 2018

Do the northern Chinese have Scythian ancestors?

Filed under: China,Historical Population Genetics,Scythian — Razib Khan @ 2:32 pm

There was some question regarding possible Scythian admixture into the early Zhou below. This is possible because the Zhou dynasty, arguably the foundational one of Chinese imperial culture (the Shang would have been alien to Han dynasty Chinese, but the Zhou far less so), may have had interactions with Indo-European peoples to their north and west. This has a historical parallel in the Tang dynasty, which emerged from the same milieu 1,500 years later, albeit the Tang were descended from a Turkic tribe, not Indo-Europeans.

I looked at some of my samples and divided the Han into a northern and southern cluster based on their position on a cline (removing the majority in between). I also added Lithuanians, Sardinians, Uyghurs, Mongols, and Yakut. As you can see on the PCA the Mongols are two clusters, so I divided them between Mongol and Mongol2.

Running ADMIXTURE after some outlier removal you see that the northern Han are distinct because they share ancestry from the Yakut modal cluster. In contrast, the Mongols and Uyghurs have ancestry from the Lithuanian modal cluster. Uyghurs also have quite a bit of ancestry from the Druze modal cluster, which is West Asian. Also notice that the Mongol2 cluster, which shares more ancestry with the Yakuts, also has more Lithuanian modal cluster ancestry. Two of the Mongol2 individuals are labeled as Khalkha.

Using some of the Sarmatian/Scythian samples from David Reich’s lab, I ran ADMIXTURE again. These ancient samples need to be interpreted with caution, as usual. But notice again that the northern Han obtain their minor ancestry from the Yakut. The Iron Age nomadic modal ancestry is found at low levels in the Mongols and Uyghurs. I think this is a real effect. The presence of Alans with the hordes of the Mongol Empire is well attested, though the admixture is almost certainly earlier.

I ran some three population tests. This is what was notable.

  1. Han_N looks like it is mixed somewhat with Yakut
  2. Mongol has gene flow from Mongol2
  3. Mongol2 has gene flow from Lithuanians and Iron Age nomads

I literally spent an hour on this assembling the data. But I think the easiest conclusion to draw is that the “West Eurasian” shift in modern Chinese (north) is probably mediated through Turkic people.

September 26, 2018

Vietnamese are not that much like the Cambodians

Filed under: Cambodia,Historical Population Genetics,Vietnam — Razib Khan @ 11:45 pm

A comment below suggested another book on Vietnamese history, which I am endeavoring to read in the near future. The comment also brought up issues relating to the ethnogenesis of the Vietnamese people, their relationship to the Yue (or lack thereof) and the Khmer, and also the Han Chinese.

Obviously, I can’t speak to the details of linguistics and area studies history. But I can say a bit about genetics because over the years I’ve assembled a reasonable data set of Asians, both public and private. The 1000 Genomes collected Vietnamese from Ho Chi Minh City in the south. I compared them to a variety of populations using ADMIXTURE with 5 populations.

You can click to enlarge, but I can tell you that the Vietnamese samples vary less than the Cambodian ones, and resemble Dai more than the other populations. The Dai were sampled from southern Yunnan, in China, and historically were much more common in southern China, before their assimilation into the Han (as well as the migration of others to Southeast Asia).

Curiously, I have four non-Chinese samples from Thailand, and they look to be more like the Cambodians. This aligns well with historical and other genetic evidence that the Thai identity emerged from the assimilation of Tai migrants into the Austro-Asiatic (Mon and Khmer) substrate.

Aside from a few Vietnamese who seem Chinese, or a few who are likely Khmer or of related peoples, the Vietnamese do seem to have some Khmer ancestry. Or something like that.

Narrowing the populations, and using Indians as an outgroup, I wanted to test the Vietnamese against a few select populations. In the graph to the right you see that they are on the same branch as the Dai, and there is gene flow from the Dai into the Cambodians, and from the Cambodians into the Vietnamese. These results actually suggest that the Cambodians have had more gene flow in than the Vietnamese.

If you check the ADMIXTURE plot though you notice that there is a huge range of variation in the Cambodians in terms of their ancestry. The Mon kingdoms to the west of Cambodia fell to the Tai, but Cambodia itself did not. It probably absorbed a fair amount of Tai ancestry though, even if it retained its cultural distinctiveness and character.

A PCA shows that the Vietnamese are a distinct cluster. Different from both the Dai and South Chinese. Some of the samples in the 1000 Genomes are shifted toward the Cambodians and others toward the Chinese.

Finally, I ran a three population test. Here are some results of interest:

test pop1 pop2 f3 z
Cambodia Dai Indian -0.00175342 -25.8023
Cambodia French_Basque Dai -0.00192501 -22.1918
Cambodia Vietnamese Indian -0.00122671 -20.5523
Cambodia French_Basque Vietnamese -0.00136869 -17.6703
Cambodia Dai Papuan -0.0013018 -12.7299
Cambodia Han_S Indian -0.000790546 -10.365
Cambodia Vietnamese Papuan -0.000929681 -9.57058
Cambodia French_Basque Han_S -0.00087403 -9.24743
Cambodia Han_S Papuan -0.000476145 -4.05509
Dai Han_S Cambodia -0.000106184 -4.15877
Dai Cambodia She -0.000123515 -3.04445
Han_N French_Basque Han_S -0.000690947 -6.04291
Han_N Han_S Indian -0.000379328 -3.60634
Han_S Dai Han_N -0.000562373 -20.0654
Han_S Vietnamese Han_N -0.000425554 -15.6301
Han_S Filipino Han_N -0.000560061 -14.4192
Han_S Filipino Naxi -0.000529454 -10.9605
Han_S Malay Han_N -0.00038395 -10.3834
Han_S Dai Naxi -0.000316766 -9.36127
Han_S Filipino Yizu -0.000377863 -7.59642
Han_S Dai Yizu -0.000271844 -7.57112
Han_S Cambodia Han_N -0.000272892 -6.90769
Han_S Vietnamese Naxi -0.000211726 -6.09433
Han_S Vietnamese Yizu -0.000178654 -5.79285
Han_S Filipino Tujia -0.000175578 -4.66665
Han_S Thailand Han_N -0.000270477 -4.17533
Han_S Vietnamese Tujia -9.7422E-05 -3.79926
Han_S Tujia Dai -8.98028E-05 -3.0287
Han_S Tujia Malay -6.18931E-05 -1.67189
Han_S She Han_N -7.74747E-05 -1.41452
Han_S Filipino She -3.55034E-05 -0.888484
Vietnamese Han_S Cambodia -0.000646757 -34.4357
Vietnamese Han_S Malay -0.000420205 -22.545
Vietnamese Cambodia She -0.000615643 -17.2252
Vietnamese Tujia Cambodia -0.000553747 -15.6249
Vietnamese Malay She -0.000460983 -13.9445
Vietnamese Tujia Malay -0.000384676 -12.4208
Vietnamese Dai Indian -0.000494414 -12.4142
Vietnamese Cambodia Han_N -0.000494095 -12.2197
Vietnamese Miaozu Cambodia -0.000421982 -11.4913
Vietnamese Malay Han_N -0.000378602 -10.154
Vietnamese French_Basque Dai -0.000524036 -9.99871
Vietnamese Miaozu Malay -0.000280205 -8.27434
Vietnamese Dai Papuan -0.000339828 -5.83617
Vietnamese Han_S Indian -0.000210588 -4.70338
Vietnamese Dai Han_N -0.000122813 -4.42234
Vietnamese Malay Naxi -0.000152052 -3.8678
Vietnamese Han_S Thailand -0.000147552 -3.73211
Vietnamese Cambodia Yizu -0.000145687 -3.71074
Vietnamese Cambodia Naxi -0.000133426 -3.20226
Vietnamese Burm Dai -5.79109E-05 -3.12906
Vietnamese Dai Yizu -7.91838E-05 -3.00809

September 13, 2018

Avars across a sea of grass

Filed under: Avars,Historical Population Genetics — Razib Khan @ 10:29 pm

That sound you hear is the rumbling of the earth caused by the rippling tsunami that’s coming. The swell of ancient DNA papers focused on historical, rather than prehistorical, time periods. Some historians are cheering. Some are fearful. Others know not what to think. It will be. The illiterate barbarians of yore shall come out of the shadows.

If they had arrived on the edge of Europe two centuries earlier, the Avars would have a reputation as fearsome as that of the Huns, with whom they are often confused, and rightly so. But the Avars emerged as a force on the European landscape after the end of the West Roman Empire. The post-Roman polities did not have their own Ammianus Marcellinus (sorry Bede, you lived in the middle of nowhere).

And yet for centuries the Avars dominated east-central Europe and held the numerous Slavic tribes in thrall. They smashed past the borders of Byzantium during the reign of the heir of Justinian, and by 600 AD, on the eve of the great battle with Persia, Constantinople had lost control of most of its Balkan hinterlands to these barbarians. A Byzantium which still controlled North Africa, much of Italy, southern Spain, Egypt, Anatolia, and the Levant, had been reduced to strongpoints all around the Balkan littoral. During the wars with the Sassanids, the Avars took advantage of the opportunity offered, and even raided the suburbs of Constantinople itself!

So who were these people? The most plausible conjecture is that they were part of the great mass mobilization of Turkic peoples which began in the early centuries of the first millennium after Christ. As Rome and Han China fell, nomadic barbarians rose. A new preprint seems to all but confirm this, Inner Asian maternal genetic origin of the Avar period nomadic elite in the 7th century AD Carpathian Basin:

After 568 AD the nomadic Avars settled in the Carpathian Basin and founded their empire, which was an important force in Central Europe until the beginning of the 9th century AD. The Avar elite was probably of Inner Asian origin; its identification with the Rourans (who ruled the region of today’s Mongolia and North China in the 4th-6th centuries AD) is widely accepted in the historical research. Here, we study the whole mitochondrial genomes of twenty-three 7th century and two 8th century AD individuals from a well-characterised Avar elite group of burials excavated in Hungary. Most of them were buried with high value prestige artefacts and their skulls showed Mongoloid morphological traits. The majority (64%) of the studied samples’ mitochondrial DNA variability belongs to Asian haplogroups (C, D, F, M, R, Y and Z). This Avar elite group shows affinities to several ancient and modern Inner Asian populations. The genetic results verify the historical thesis on the Inner Asian origin of the Avar elite, as not only a military retinue consisting of armed men, but an endogamous group of families migrated. This correlates well with records on historical nomadic societies where maternal lineages were as important as paternal descent.

The samples were from a period about a century after the arrival of the Avars. It is not unreasonable to think that the Avar conquest meant that a continuous stream of Inner Asian pastoralists kept entering into the territory which they occupied for the opportunity, but this sort of genetic distinctiveness indicates that the Avars remained very separate from the people from whom they extracted tribute. Most, though not all, of these people were or became Slavs.

Around 800 AD the Avars were finally defeated decisively by the Franks, and their elite converted to Christianity. I suspect this was the final step which would result in their assimilation over the next few centuries into the local population until they diminished and disappeared.

The results above support the proposition that the Pannonian Avars of the second half of the 6th century were the descendants of the Rouran Khaganate of the first half of the 6th century. The kicker is that the Rouran flourished in Mongolia! So like the Mongols six hundred years later, the Avars seem to have swept across the entire length of Eurasia that was accessible to their horses in a generation. To some extent, this is a recapitulation of the pattern we see nearly 3,000 years before the Avars, when the Afanasievo culture established itself in the Altai region, far from its clear point of origin in the forest-steppe of Eastern Europe.

Perhaps the period between 500 BC and 300 AD can be seen as an ephemeral transient between the vast periods before and after when pastoralists had free rein across most of temperate Eurasia?

September 12, 2018

The genetics of Afrikaners (again)

Filed under: Afrikaner genetics,Historical Population Genetics — Razib Khan @ 10:18 pm

I personally get asked about the genetics of Afrikaners, because I’ve written about/analyzed the issue before. The main outlines seem to be established, but I thought I might go and revisit it again. The main reason is that we have ancient South African DNA, and I’ve been adding it to my personal analyses for a while. It might be worthwhile to reanalyze the South Africa samples I do have with some of these added in.

The plot at the top shows the core populations I started with. I did some outlier pruning. I only kept the South African samples that were overwhelmingly white. I picked Malays and a South Indian population because of Cape Coloureds, a mixed-race Afrikaans speaking group which has Asian ancestry that can be attributed to both South and Southeast Asian populations (the Dutch imported many slaves from India and had outposts in Java). I also used Bantu samples from South Africa, Kenya, as well as a Nigeria population. Finally, I also had some Hadza as a different hunter-gatherer population than the San Bushmen. For Europeans, I used white Dutch.

The final marker density was 200,000 SNPs, so not too bad.

As you can see if you click on the image all of the South African whites were shifted away from the Dutch. There were two outlier individuals, one of which was closer to the Dutch cluster, and one further. All the other individuals form a neat cluster. None of these individuals were close relatives.

I ran Treemix on the data with multiple migrations until the migrations stopped making sense to me. The African populations exhibit migration flows to each other. Much of it is entirely comprehensible. The Esan receive no migration, highlighting that this population did not receive gene flow from any groups in these data. The Kenya Bantus receive gene flow from the direction of Eurasians. This is almost certainly Nilotic mediated. The gene flow they receive from the base of the ancient San is more enigmatic, but probably reflects uptake of local ancestry as the Bantus expanded. The southern Bantus receive gene flow from modern San.

The South African whites receive gene flow from a position on the graph between the modern San and other non-San African groups.

Next, I ran Admixture in the unsupervised mode with K = 6. The two populations mostly light-blue are South African whites and the Dutch, from the top to the bottom. You can see though that the South African whites clearly have other ancestral components. Most of these individuals have the components modal in the San, Esan Nigerians, Indians, and Malays. The two outlier individuals are also clear. The individual very close to the Dutch, but shifted toward the Asians, in the PCA does not have any African admixture. The individual shifted more toward the non-Europeans in the PCA also has more non-European fractions of ancestral components (that is, those components modal in non-European populations).

Next, I decided to confirm things by running a three population test. If you read this blog you’ve seen this before. Basically this measures shared ancestry by looking at deviations from a particular phylogenetic model: (test population(pop 1, pop2)). A negative f3 statistic indicates that the test population is related to both pop1 and pop2 (that is, it’s a mix of the two), and I focused on z-scores greater than two in absolute value.
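
For anyone who wants to see what the statistic is doing, here is a minimal sketch in Python. It is not the software I actually ran; it is a naive version on made-up allele frequencies, with no correction for finite sample size. The function names (`f3`, `f3_z`) and the toy populations are hypothetical, there only to illustrate the idea that a true mix of two sources gives a negative value.

```python
import math
import random


def f3(test, pop1, pop2):
    """Naive f3(test; pop1, pop2): the mean over SNPs of (c - a) * (c - b)."""
    terms = [(c - a) * (c - b) for c, a, b in zip(test, pop1, pop2)]
    return sum(terms) / len(terms)


def f3_z(test, pop1, pop2, n_blocks=20):
    """z-score from a delete-one-block jackknife over contiguous SNP blocks."""
    terms = [(c - a) * (c - b) for c, a, b in zip(test, pop1, pop2)]
    n, total = len(terms), sum(terms)
    size = n // n_blocks  # assumes the SNP count is a multiple of n_blocks
    # Re-estimate f3 with each block left out in turn.
    left_out = []
    for k in range(n_blocks):
        block = sum(terms[k * size:(k + 1) * size])
        left_out.append((total - block) / (n - size))
    mean = sum(left_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((x - mean) ** 2 for x in left_out)
    return (total / n) / math.sqrt(variance)


# Toy allele frequencies (invented, not real data).
random.seed(0)
n_snps = 5000
pop_a = [random.uniform(0.1, 0.9) for _ in range(n_snps)]
pop_b = [random.uniform(0.1, 0.9) for _ in range(n_snps)]
# A 50/50 mix of the two sources plus a little drift.
admixed = [0.5 * a + 0.5 * b + random.uniform(-0.02, 0.02)
           for a, b in zip(pop_a, pop_b)]
# A population that simply drifted away from pop_a, with no mixing.
drifted = [a + random.uniform(-0.1, 0.1) for a in pop_a]

print("admixed:", f3(admixed, pop_a, pop_b), f3_z(admixed, pop_a, pop_b))
print("drifted:", f3(drifted, pop_a, pop_b), f3_z(drifted, pop_a, pop_b))
```

The mixed population comes out negative with a large negative z-score, while the drifted one comes out positive. That asymmetry is why only the negative rows in the results are treated as evidence of admixture.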

Here they are:

Test Pop1 Pop2 f3 z
Bantu_NE EsanNigeria Dutch -0.0009 -6.54
Bantu_NE EsanNigeria South_Africa_White -0.0010 -6.54
Bantu_NE EsanNigeria Malay -0.0009 -6.33
Bantu_NE EsanNigeria Telegu -0.0008 -6.00
Bantu_NE Bantu_S South_Africa_White -0.0008 -4.84
Bantu_NE Bantu_S Dutch -0.0008 -4.77
Bantu_NE Bantu_S Malay -0.0007 -4.21
Bantu_NE Bantu_S Telegu -0.0007 -4.05
Bantu_NE Dutch San_Ancient -0.0009 -3.02
Bantu_NE Hadza EsanNigeria -0.0004 -2.97
Bantu_NE Telegu San_Ancient -0.0007 -2.32
Bantu_NE Malay San_Ancient -0.0007 -2.04
Bantu_S EsanNigeria San_Modern -0.0028 -21.62
Bantu_S EsanNigeria San_Ancient -0.0039 -20.78
Bantu_S San_Ancient Bantu_NE -0.0030 -12.91
Bantu_S San_Modern Bantu_NE -0.0019 -12.45
Bantu_S Dutch San_Ancient -0.0031 -10.63
Bantu_S Telegu San_Ancient -0.0030 -10.33
Bantu_S San_Ancient South_Africa_White -0.0027 -9.17
Bantu_S Malay San_Ancient -0.0029 -8.97
San_Modern Dutch San_Ancient -0.0091 -34.96
San_Modern Telegu San_Ancient -0.0087 -33.86
San_Modern San_Ancient South_Africa_White -0.0089 -33.54
San_Modern San_Ancient Bantu_NE -0.0063 -31.93
San_Modern Malay San_Ancient -0.0085 -30.98
San_Modern Bantu_S San_Ancient -0.0052 -28.91
San_Modern Hadza San_Ancient -0.0051 -27.58
South_Africa_White Dutch Bantu_NE -0.0017 -12.96
South_Africa_White EsanNigeria Dutch -0.0017 -12.68
South_Africa_White San_Modern Dutch -0.0018 -12.41
South_Africa_White Bantu_S Dutch -0.0017 -12.36
South_Africa_White Dutch San_Ancient -0.0021 -12.14
South_Africa_White Hadza Dutch -0.0014 -10.41
South_Africa_White Malay Dutch -0.0007 -5.97
South_Africa_White Telegu Dutch -0.0003 -3.64
Telegu Malay Dutch -0.0004 -2.79


No surprises so far. One thing that did surprise me though was the extent of the admixture even after PCA outlier removal. So I took the output you saw above and removed individuals that were very mixed, except for the case of the white South Africans. Then, I ran Admixture in supervised mode, where the “pure” populations were fixed as references (I merged the modern San without much admixture with the ancient San). You can see the results below:


Re-running the three population test with these “pure” populations I only got significant results for the below cases:

Test Pop1 Pop2 f3 z
South_Africa_White Dutch EsanNigeria -0.0017 -13.1937
South_Africa_White San Dutch -0.0020 -12.6910
South_Africa_White Hadza Dutch -0.0014 -9.7246
South_Africa_White Malay Dutch -0.0009 -6.6481
South_Africa_White Telegu Dutch -0.0004 -4.6167

No big surprise.

The average European ancestry I got in my South African white samples, N = 12, is 93.5%. Treating them as a composite individual, note that if someone had one great-great-grandparent who was not European, they would be expected to have 6.25% non-European ancestry. That’s 4 generations back. So about 100 years. These individuals are presumably adults. Let’s say they are 25 years old. That goes back 125 years. Under a model where the admixture came from a single ancestor, it’s probably reasonable to suggest it occurred sometime in the mid to late 19th century.

This seems unlikely. The evenness of admixture and balance between different groups indicates that it is older than that, and they are obtaining it from different lineages. Traditional genealogical estimates suggested in the range of 5-7.5% non-European ancestry in Afrikaners, and one study of 185 individuals showed 18% non-European mtDNA.
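The back-of-the-envelope dating for the single-ancestor scenario can be written out as a short sketch. The 25-year generation time and the 25-year sample age are the rough assumptions used in the arithmetic above, and the function names are made up for illustration.

```python
from math import log2

GENERATION_YEARS = 25  # assumed generation time
SAMPLE_AGE_YEARS = 25  # assumed age of the sampled adults


def fraction_from_one_ancestor(generations_back):
    """Expected ancestry share contributed by one ancestor n generations back."""
    return 0.5 ** generations_back


def generations_for_fraction(fraction):
    """Generations back at which a single ancestor would contribute this share."""
    return log2(1 / fraction)


def years_before_sampling(generations_back):
    """Rough time depth: generations times generation time, plus the sample's age."""
    return generations_back * GENERATION_YEARS + SAMPLE_AGE_YEARS


observed_non_european = 1 - 0.935  # from the supervised Admixture average

print(fraction_from_one_ancestor(4))                    # share from one great-great-grandparent
print(generations_for_fraction(observed_non_european))  # close to 4 generations
print(years_before_sampling(4))                         # years before the samples were taken
```

This only describes one non-European ancestor per person. Ancestry that is spread evenly across individuals and drawn from several source populations does not fit that picture, which is the point made above.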

I will probably do some ancestry deconvolution and see if I can get a figure for the time of admixture (though the fractions here are very small, as is the sample size of the admixed population). But the non-European ancestry of Afrikaners is uncannily similar to the non-European ancestry of the Cape Coloureds. That to me leads us to the conclusion that in the early European settler community a fair number of mixed-race women married in. Those mixed-race women who married mixed-race men helped found the Cape Coloureds.

August 12, 2018

Live not by the haplogroup alone

Filed under: Historical Population Genetics,Y chromosomes — Razib Khan @ 11:21 pm

In The population genomics of archaeological transition in west Iberia the authors note that “the population of Euskera speakers shows one of the maximal frequencies (87.1%) for the Y-chromosome variant, R1b-M269…” In the early 2000s the high frequency of R1b-M269 among the Basques, a non-Indo-European linguistic isolate, was taken to be suggestive of the possibility that R1b-M269 reflected ancestry from European hunter-gatherers present when farmers and pastoralists pushed into the continent.

The paper above shows that the reality is that the Basque people have higher fractions of Neolithic farmer ancestry than any other Iberian people. Additionally, they have lower fractions of the steppe pastoralist ancestry than other Iberian groups. This, despite the fact that we also know from ancient DNA that R1b-M269 does seem to have spread with steppe pastoralists, likely Indo-Europeans.

Obviously the relationship between Y chromosomes and genome-wide ancestry is complex. The pattern here indicates that Indo-European male lineages were assimilated into the Basques. Perhaps the Basque were matrilineal? One can’t know. But, these men did not impose their culture. Instead, they were assimilated into the Basque. This is not at all shocking. The history of contact between different peoples in the recent past shows plenty of cases where individuals have “gone native.” In some cases, many individuals.

I was thinking this when looking at South Asian Y chromosome frequencies. Though R1a1a is correlated with higher castes and Indo-European speakers, its frequency is quite high in some ASI-enriched groups. I suspect that the period after 2000 BC down to the Common Era witnessed a dynamic where particular patrilineal societies were quite successful in maintaining their status over generations. Additionally, the ethnogenesis of “Indo-Aryan” and “Dravidian” India was occurring over this period, in some cases through a process of expansion, integration, and conflict. It seems some pre-Aryan paternal lineages were assimilated into Brahmin communities. For example, Y haplogroup R2, whose origin is almost certainly in the Indus Valley Civilization society.

Some population genetic models are stylized and elegant. They have to be to be tractable. But we always need to remember that real history and prehistory were complex, and exhibited a richer and more chaotic texture.

June 19, 2018

Why the Y chromosome is coming back

Filed under: Historical Population Genetics,The Insight,Y chromosome — Razib Khan @ 6:43 am

Last week Spencer and I talked about Y chromosomes and their sociological import on The Insight. It was a pretty popular episode, but then again, my post on the genetics of Genghis Khan is literally my most popular piece of writing of all time which wasn’t distributed in a non-blog channel (hundreds of thousands of people have read it). Thanks to everyone who left a review on iTunes and Stitcher (well, a good review). We’re getting close to my goal of 100 reviews on iTunes and 10 on Stitcher so that I won’t pester you about it.

Of course the reality is that the heyday of Y chromosomal population genetic studies was arguably about 15 years ago, when Spencer wrote The Journey of Man. I have personally constructed Y phylogenies before…but as you know from reading this weblog, I tend to look at genome-wide autosomal studies. There is a reason why Who We Are and How We Got Here focuses on autosomal data.

All that being said, Y (and mtDNA) still have an important role to play in understanding the past: sociological dynamics. The podcast was mostly focused on star phylogenies, whether it be the Genghis Khan haplotype, or the dominant lineages of R1a and R1b. Strong reproductive skew does have genome-wide effects, but unless it’s polygyny as extreme as an elephant seal’s those effects are going to be more subtle than what you see in the Y and mtDNA.

Submitted for your approval, two recent preprints on bioRxiv: The role of matrilineality in shaping patterns of Y chromosome and mtDNA sequence variation in southwestern Angola and Cultural Innovations influence patterns of genetic diversity in Northwestern Amazonia. The future is going to be in understanding sexual dynamics and culture.

May 16, 2018

Migration at the roof of West Asia

Filed under: Historical Population Genetics,History,Indo-Europeans — Razib Khan @ 10:16 pm

The figure to the left is from The genetic prehistory of the Greater Caucasus. If you are a regular reader of this weblog, or Eurogenes, you can figure out what’s going on, and keep track of the terminology. But in 2018 I think we’re getting to the end of the line in making sense of “admixture graphs” in relation to West Eurasian population structure. The models are just getting too complicated to keep everything straight, and the distinct-populations-subject-to-pulse-admixture seems to be an assumption that may not necessarily hold.

To get a sense of what I’m talking about, the above preprint focuses on populations in and around the Caucasus region. One of the major reasons that this is important is that the Caucasus was and is to some extent a continental hinge, connecting Eastern Europe and the Pontic steppe, to the Near East. The Arab Muslims pushed north of the Caucasus, and came into conflict with the Khazars, while Cimmerians and Scythians moved south from the Pontic steppe.

The elephant in the room is the relevance to the “Indo-European controversy.” Colin Renfrew long ago posited that the Indo-European languages derive from West Asian farmers who expanded into Europe as early as ~9,000 years ago. A rival theory is that Indo-Europeans spread out of the Pontic steppe ~4,000 years ago. In 2015 two major papers suggested that the steppe was a major source of Indo-European expansion. Case closed? This preprint suggests perhaps not.

But we’ll get to that later. What do the results here show? The prose is a little hard to tease apart, but the major issue seems to be that in antiquity, or at least the period they’re focusing on, much of the gene flow seems to have been from the south (Near East) to the north (through the Caucasus, and out to the north slope). To some extent, we already knew this: the Yamna people of the Pontic steppe have “southern” ancestry from the Near East that earlier East European/Pontic people do not. In this preprint, the authors show that groups such as the Maykop of the north slope of the Caucasus carry Y haplogroups such as G2, and not the R1 lineages commonly found in the steppe. David W. suggests that this confirms that Near Eastern gene flow into the steppe was female-mediated. This is plausible, but I would caution that Y chromosomes alone can be deceptive, due to the power of particular patrilineages. We’ll probably rely on the X chromosome to make a final judgment.

The plot below shows many of the relationships as a function of location and time. The green component is modal among “Iranian farmers,” the orange among “Anatolian farmers,” and the blue among “Western hunter-gatherers.”

A major aspect of this preprint is that it has to work hard to differentiate two Anatolian farmer-like signals: the first, from Anatolian farmers proper, and the second from the descendants of European farmers, who themselves are a mix of Anatolian farmers with a minority ancestry among the hunter-gatherers. The answers would probably be totally unintelligible if not for archaeology. It’s clear that the steppe people had contact with both European and Near Eastern farmers and that later East European groups that succeeded the Yamna were subject to reflux from Central Europe, and received European farmer ancestry.

Another curious nugget in their results is that there was early detection of both Ancestral North Eurasian (ANE) ancestry and some East Eurasian gene flow (related to Han Chinese). One of their individuals carries the East Eurasian variant of EDAR, which today in Europe is only found in Finns, though it was found in reasonable frequencies among the Motala hunter-gatherers of Scandinavia. Additionally, Fu et al. 2016 found that the ancestors of Mesolithic hunter-gatherers received some gene flow from Eastern Eurasians as well (also in the supplements of Lazaridis et al. 2016).

The authors admit that there is probably population structure among ANE and undiscovered groups of East Eurasians who were traversing the Inner Asian landscape. I think this is all suggestive of some long-distance contacts, though the intensity and magnitude increased a lot with high-density societies and the mobility of pastoralism.

Much of the genetic mixing in the Near East, and to some extent in the trans-Caucasian region, seems to date to the 4th millennium BC. This is technically prehistory, but it is also the Uruk period. This was a phase of Mesopotamian cultural expansion between 4000 and 3100 BC which resulted in replicas of Uruk-style settlements as far away as Syria and southeastern Anatolia. There is even evidence of Uruk-related migration to the North Caucasus.

The Uruk culture experienced an abrupt collapse. Uruk settlements outside of the core zone of Mesopotamia disappear.

It’s the final paragraph that warrants discussion:

The insight that the Caucasus mountains served not only as a corridor for the spread of CHG/Neolithic Iranian ancestry but also for later gene-flow from the south also has a bearing on the postulated homelands of Proto-Indo-European (PIE) languages and documented gene-flows that could have carried a consecutive spread of both across West Eurasia…Perceiving the Caucasus as an occasional bridge rather than a strict border during the Eneolithic and Bronze Age opens up the possibility of a homeland of PIE south of the Caucasus, which itself provides a parsimonious explanation for an early branching off of Anatolian languages. Geographically this would also work for Armenian and Greek, for which genetic data also supports an eastern influence from Anatolia or the southern Caucasus. A potential offshoot of the Indo-Iranian branch to the east is possible, but the latest ancient DNA results from South Asia also lend weight to an LMBA spread via the steppe belt…The spread of some or all of the proto-Indo-European branches would have been possible via the North Caucasus and Pontic region and from there, along with pastoralist expansions, to the heart of Europe. This scenario finds support from the well attested and now widely documented ‘steppe ancestry’ in European populations, the postulate of increasingly patrilinear societies in the wake of these expansions (exemplified by R1a/R1b), as attested in the latest study on the Bell Beaker phenomenon….

But instead of tackling this let’s focus on the paper that came out of the Willerslev group, The first horse herders and the impact of early Bronze Age steppe expansions into Asia. This is a final manuscript in Science. That means it was probably written before The Genomic Formation of South and Central Asia. When it comes to South Asia, the results from the two publications are consonant. There is no conflict.*

More interesting are the results in West Asia, and the linguistic supplement. In the latter the authors note, first, that tablets now indicate an Indo-Aryan presence in Syria ~1750 BC. Second, Assyrian merchants record Indo-European Hittite, or Nesili (the people of Nesa), as early as ~2500 BC.

As suggested in earlier work Hittite remains don’t suggest steppe influence. David W. says:

The apparent lack of steppe ancestry in five Hittite-era, perhaps Indo-European-speaking, Anatolians was interpreted in Damgaard et al. 2018 as a major discovery with profound implications for the origin of the Anatolian branch of Indo-European languages.

But I disagree with this assessment, simply because none of these Hittite-era individuals are from royal Hittite, or Nes, burials. Hence, there’s a very good chance that they were Hattians, who were not of Indo-European origin, even if they spoke the Indo-European Hittite language because it was imposed on them.

The main aspect I’d bring up with this is that in other areas steppe ancestry has spread deeply and widely into the population, including non-Indo-European ones. It is certainly possible that the sample is not representative enough to pick up the genuinely Hittite elite, but I probably lean to the likelihood that the steppe signal won’t be found. It seems that the Anatolian languages were already diversified by ~2000 BC, and perhaps earlier. Linguists have long suggested that they are the outgroup to other Indo-European languages, though this could just be a function of their isolation among highly settled and socially complex populations.

Two alternative models present themselves for these results. First, the Anatolian Indo-European languages expanded through elite diffusion, part of the same general migrations that emerged out of the Yamna culture ~3000 BC. The lack of a steppe signal may be due to sampling bias, as David W. suggested, or, more likely in my opinion, simple dilution of the signal. Second, the steppe migrations were one part of a broader palette of population movements and cultural diffusions, and the Anatolian Indo-Europeans are basal to the efflorescence of the steppe derived branches.

The evidence of the explosion of Indo-Aryans in the years after 2000 BC in West and South Asia, as well as the expansion of Iranians across vast swaths of Inner Asia during the same period, suggests to me that Indo-Iranians are most definitely part of the steppe pulse. The connection to the Sintashta charioteers presents itself, and connections to the Uralic languages indicate incubation in the trans-Volga region.

In West Asia, the Indo-Aryans crashed themselves against the most advanced civilizations of their time. Like the Bulgars, and unlike the Hittites, the Indo-Aryan Mitanni were totally absorbed by their non-Indo-European Hurrian substrate. Indo-Aryan linguistic influence was preserved in their names, their gods, and in particular words relating to chariots. And yet in 2017’s Continuity and Admixture in the Last Five Millennia of Levantine History from Ancient Canaanite and Present-Day Lebanese Genome Sequences, the authors observe:

We next tested a model of the present-day Lebanese as a mixture of Sidon_BA and any other ancient Eurasian population using qpAdm. We found that the Lebanese can be best modeled as Sidon_BA 93% ± 1.6% and a Steppe Bronze Age population 7% ± 1.6% (Figure 3C; Table S6). To estimate the time when the Steppe ancestry penetrated the Levant, we used, as above, LD-based inference and set the Lebanese as admixed test population with Natufians, Levant_N, Sidon_BA, Steppe_EMBA, and Steppe_MLBA as reference populations. We found support (p = 0.00017) for a mixture between Sidon_BA and Steppe_EMBA which has occurred around 2,950 ± 790 ya (Figure S13B).

This needs to be explored further. The admixture could have come from many sources. I am curious about the frequency of R1a1a-Z93 among modern-day Syrians and Lebanese.

For me these arguments can only be resolved with a deeper understanding of linguistic evolution. The close relationship of Indo-Aryan and Iranian languages is obvious to any speaker of either of these languages (I can speak some Bengali). A divergence in the range of 4 to 5 thousand years before the present seems most likely to me. But the relationship of the other Indo-European languages is much less clear.

One of the arguments in Peter Bellwood’s First Farmers is that the Indo-European languages exhibit a “rake-like” topology with the exception of Indo-Iranian, which forms a clear clade. To him and others in his camp, this argues for deep divergences very early in time.

It is hard to deny that the steppe migrations between 4 and 5 thousand years ago had something to do with the distribution of modern Indo-European languages. But, it is harder to falsify the model that there were earlier Indo-European migrations, perhaps out of the Near East, that preceded these. Only a deeper understanding of linguistic evolution, and multidisciplinary analysis of regional substrates will generate the clarity we need.

* I’m going to skip the Botai angle in this post.

April 30, 2018

Is American genetic diversity enough?

Filed under: Historical Population Genetics,Human Genetic Variation — Razib Khan @ 8:51 pm

In the nearly 20 years since the draft of the human genome was complete,* we’ve moved on to bigger and better things. In particular, researchers are looking to diversify their panels of human genetic diversity, because differences between groups matter. You can’t just substitute them for each other genetically.

There have been efforts to diversify the population panels recently, but that prompts the question whether American population coverage is sufficient. My first thought is that the genetic diversity in the USA is probably getting us 90% of the way there. Consider Spencer’s comment about Queens: it’s the most ethnically diverse large conurbation in the country.

There are some gaps though. In Who We Are David Reich points out the distinctiveness of Indian population genetics. The subcontinent has lots of populations with large census sizes in which deleterious alleles have drifted upward in frequency due to long-term endogamy. And, many of these populations don’t have a strong representation in the Diaspora.

In contrast, much of the rest of the world is panmictic enough that an American panel can pick up most of the variation. American Chinese are skewed toward Guangdong and Fujian, but a substantial number of people from other parts of China have arrived in the last generation. Regional structure is not so strong that you’ll miss out on too much, aside from very rare variants which are more extended pedigree scale rather than population scale.

There are small populations such as Hadza, Khoikhoi, and Pygmies in Africa which are probably going to be missed by American population panels, but the total census size of these groups is pretty low (for comparison, there are 1 million Pulayar Dalits in the state of Kerala alone). Much of the rest of Africa is West African variation well represented in African Americans, and Bantu and Nilotic variation probably captured by immigrant communities.

I’d propose supplementing American genetic diversity with sampling Cape Coloureds in South Africa.

* No discussions about how the genome isn’t totally complete. I know that.

February 24, 2018

Are Turks Armenians under the hood?

Filed under: Historical Population Genetics,Population genetics — Razib Khan @ 8:31 pm

Benedict Anderson’s Imagined Communities: Reflections on the Origin and Spread of Nationalism is one of those books I haven’t read, but should. In contrast, I have read Azar Gat’s Nations, which is a book-length counterpoint to Imagined Communities. To take a stylized and extreme caricature, Imagined Communities posits nations to be recent social and historical constructions, while Nations sees them as primordial, and at least originally founded on ties of kinship and blood.

The above doesn’t capture the subtlety of Gat’s book, and I’m pretty sure it doesn’t capture that of Anderson’s either. But, those are the caricatures that people take away and project in public, especially Anderson’s (since Gat’s is not as famous).

When it comes to “imagined communities” I recently have been thinking about how well that of modern Turks fits into the framework. Though forms of pan-Turkic nationalism can be found as early as 9th-century Baghdad, the ideology truly emerges in force in the late 19th century, concomitantly with the development of a Turkish identity in Anatolia which is distinct from the Ottoman one.

The curious thing is that though Turkic and Turkish identity is fundamentally one of language and secondarily of religion (the vast majority of Turkic peoples are Muslim, and there are periods, such as the 17th century when the vast majority of Muslims lived in polities ruled by people of Turkic origin*), there are some attempts to engage in biologism. This despite the fact that the physical dissimilarity of Turks from Turkey and groups like the Kirghiz and Yakut is manifestly clear.

Several years ago this was made manifestly clear in the paper The Genetic Legacy of the Expansion of Turkic-Speaking Nomads across Eurasia. This paper clearly shows that Turkic peoples across Eurasia have been impacted by the local genetic substrate. In plainer language, the people of modern-day Turkey mostly resemble the people who lived in Turkey before the battle of Manzikert and the migration of Turkic nomads into the interior of the peninsula in the 11th century A.D. Of course, there is some genetic element which shows that there was a migration of an East Asian people into modern day Anatolia, but this component is the minority one.**

Sometimes the Turkish fascination with the biological comes out in strange ways; see Turkish genealogy database fascinates, frightens Turks. Much of the discussion has to do with prejudice against Armenians and Jews. But the reality is that most Turks at some level do understand that they are descended from Greeks, Armenians, Georgians, etc.

To interrogate this further I decided to look at a data set of Greeks, Turks, Armenians, Georgians, and a few other groups, including Yakuts, who are the most northeastern of Turkic peoples. The SNP panel was >200,000, and I did some outlier pruning. Additionally, I didn’t have provenance on a lot of the Greeks, except some labeled as from Thessaly. I therefore just split those up with “1” being closest to the Thessaly sample and “3” the farthest.

First, let’s look at the PCA.

The Turks are shifted toward the Yakuts, but not too much. In contrast, there is much more of a Yakut shift in Tajiks, and especially Turkmens. These are two groups from further east, closer to the heart of the zone of Turkic expansion. Curiously, the Tajiks, who are the dominant non-Turkic Iranian speaking people of Central Asia, actually have more East Asian ancestry than the Turks of Turkey. This goes to show that ethnicity is somewhat fluid, and Turkic people have assimilated into the Tajik identity. That being said, please note that the Turkmen are notably more east-shifted than the Tajik.

Let’s see how this looks on pairwise Fst.

Fst is kind of difficult for fine distinctions when you have outgroups like Yakuts and Dai. So let’s look at Treemix with five migrations:

On this, you can see that the relationship of the Greece clusters on Treemix to Lithuanians matches PCA. Greece1 is the closest, Greece3 the farthest.

The Turks are close to the Georgians and Armenians, but not the Kurds, or Tajiks. And, they receive gene flow from the Turkmen-Yakut region of the graph. So do the Tajiks…but the Tajiks also receive gene flow from the Lithuanians. The admixture plot makes it more clear what’s happening I think.

Yellow ~ modal in Southern Europe, green ~ modal Northern Europe, red ~ Central Asian, while blue and purple are northern and southern East Asian. In comparison to Turks of Anatolia Tajiks have a lot more Northern European affinity, probably because of the common steppe heritage. Not surprisingly, Turks have more Southern European like ancestry.

Curiously the East Asian ancestry in the Turkic people seems to be both Yakut and Dai like, so perhaps it was more cosmopolitan than we might think? The Yakuts after all are from the northern edge of the range, and may have absorbed a lot of indigenous Siberian ancestry.

Georgians have none of the Northern European sort of ancestry, but Armenians do, and Turks even more. One could posit that this is due to Slavic ancestry arriving with the Rumelian Turks who arrived in the 20th century, but just as likely is the possibility that Turks have a lot of ancestry from western Anatolia which was Greek, and Greeks have more of this than Armenians.

It’s hard to tell from these results whether Turks have more of an affinity with Greeks or Armenians as their non-Turkic ancestors. So I ran a three population test.

Test X1 X2 f3 error z
Turkey Armenians Yakut -0.00253688 6.70852e-05 -37.8158
Turkey Greece3 Yakut -0.00246931 6.72384e-05 -36.7247
Turkey Georgian Yakut -0.00256555 7.60158e-05 -33.7502
Turkey Armenians Dai -0.00246779 7.40038e-05 -33.3468
Turkey Greece3 Dai -0.0024101 7.34629e-05 -32.8071
Turkey Georgian Dai -0.00249174 8.11957e-05 -30.688
Turkey Greece2 Yakut -0.00222382 7.62368e-05 -29.1699
Turkey Greece2 Dai -0.00231001 8.39207e-05 -27.5261
Turkmen Turkey Dai -0.00288213 0.000108049 -26.6742
Turkmen Turkey Yakut -0.00254805 0.000102816 -24.7826
Turkey Greece1 Yakut -0.00225638 9.94722e-05 -22.6836
Turkey GreekCentral Dai -0.00235681 0.000104014 -22.6587
Turkey Greece3 Tajik -0.000622671 2.76666e-05 -22.5063
Turkey GreekCentral Yakut -0.00221985 0.000101654 -21.8373
Turkey Greece1 Dai -0.00243254 0.000112011 -21.717
Turkey Greece3 Turkmen -0.000640439 3.33529e-05 -19.2019
Turkey GreekThessaly Yakut -0.00208436 0.00011042 -18.8767
Turkey Dai GreekThessaly -0.00225435 0.00012241 -18.4163
Turkey Greece2 Turkmen -0.000584983 3.29819e-05 -17.7365
Turkey Armenians Turkmen -0.000520887 3.07253e-05 -16.953
Turkey Armenians Tajik -0.000421139 2.55274e-05 -16.4975
Tajik Turkey Dai -0.00140423 8.51697e-05 -16.4875
Tajik Turkey Yakut -0.00124601 7.60725e-05 -16.3793
Turkey Georgian Turkmen -0.000532496 3.80694e-05 -13.9875
Turkey Greece2 Tajik -0.000412419 3.04172e-05 -13.5587
Turkey Armenians Lithuanians -0.000459831 3.75838e-05 -12.2348
Turkey Greece1 Turkmen -0.000570715 4.7753e-05 -11.9514
Turkey Kurds Yakut -0.00146087 0.000124799 -11.7058
Turkey GreekThessaly Turkmen -0.000516877 4.46683e-05 -11.5714
Turkey Georgian Tajik -0.000328859 3.02443e-05 -10.8734
Turkey GreekCentral Turkmen -0.000504962 4.92555e-05 -10.2519

Armenians beat out Greece3 by a bit, but really it’s hard to say that this is definitive. It’s likely that my Turkish sample has both, and/or the original Turkic nomads had Iranian-like ancestry which was more like Armenian than Greek? Hard to say. Additionally, the fact that Greece3 is better than the other options suggests to me that the source is Anatolian Greeks who were less impacted by migrations from the north than Greeks in Greece proper.
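To make the comparison concrete, the z column is just f3 divided by its standard error, so a source pair can have the most negative f3 without having the strongest z. A small sketch using the top three rows of the table (the parsing helper is hypothetical, not part of any tool I used):

```python
# Top three rows of the f3 table above: test, source 1, source 2, f3, error, reported z.
ROWS = """\
Turkey Armenians Yakut -0.00253688 6.70852e-05 -37.8158
Turkey Greece3 Yakut -0.00246931 6.72384e-05 -36.7247
Turkey Georgian Yakut -0.00256555 7.60158e-05 -33.7502
"""


def parse_rows(text):
    """Turn whitespace-separated table rows into dictionaries with numeric fields."""
    parsed = []
    for line in text.strip().splitlines():
        test, x1, x2, f3, err, z = line.split()
        parsed.append({
            "test": test, "x1": x1, "x2": x2,
            "f3": float(f3), "error": float(err), "z": float(z),
        })
    return parsed


rows = parse_rows(ROWS)

for row in rows:
    recomputed = row["f3"] / row["error"]  # z-score = estimate / standard error
    print(row["x1"], round(recomputed, 2), row["z"])

most_negative_f3 = min(rows, key=lambda r: r["f3"])["x1"]
strongest_z = min(rows, key=lambda r: r["z"])["x1"]
print(most_negative_f3, strongest_z)
```

Among these three rows the Georgian pairing has the most negative f3 but a larger standard error, which is why Armenians end up on top by z. The gaps between the leading rows are small next to the errors, which is part of why I don’t read this as definitive.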


* The Mughals were Central Asian Turks, while the Safavids were mostly Azeri Turks.

** Since the Turks who arrived in Anatolia had long sojourned in Turan and Iran it is important not to assume that their contribution is limited only to the East Asian component of ancestry.

February 22, 2018

Mesolithic and Paleolithic, Of Cheddar and Bread

Filed under: Cheddar Man,Historical Population Genetics,Neolithic — Razib Khan @ 7:27 pm

It’s been a big week for “Cheddar Man” and the science around him. I already talked about the issue blog-wise for my day job. Additionally, Spencer and I did a podcast on the topic (if you haven’t, please subscribe and leave positive reviews and ratings on iTunes and Stitcher; next we’ll post our conversation with Chris Stringer, don’t miss it!).

So at this point I’ll put some other thoughts here that are “big picture.”

Cheddar Man may have been black but probably wasn’t

Much of the media is focused on the predicted pigmentation of Cheddar Man. That is, dark. Back when the La Brana Western Hunter-Gatherer results came in with the same finding, several population genomics people pointed out that it might not be valid to predict their phenotype based on modern training sets.

Here are some thoughts:

  • Cheddar Man and the WHG in general were probably darker than modern Northern Europeans. There is detectable selection in modern Europeans for pigmentation alleles down to the present, and Northern Europeans are the palest people in the world. And, pigmentation is polygenic, but it’s not hyperpolygenic. That’s why GWAS and early selection tests picked up pigmentation loci as hits so often.
  • Cheddar Man and the WHG in general were probably not as dark as tropical people. The only people who live(d) at very high latitudes who were very darkly complected were Tasmanian Aboriginals and Australian Aboriginals (Melbourne is at the same latitude south as Lisbon is north). In contrast, we see that Khoisan are brown, sometimes rather lightly so, while the peoples of non-European heritage who live in high latitudes are not dark-skinned, though they are not as light-skinned as Europeans.

We don’t have a time machine, so we won’t know with finality. But, it seems that pigmentation pathways are finite, and eventually we can probably be more confident about whether Cheddar Man had a genetic architecture that would lead to fewer and smaller melanocytes.

The First Farmers replaced WHG to a great extent in Britain

The preprint that came out with the Cheddar Man documentary really focused mostly on the Neolithic farmers. The data set was large, and it emphasized the discontinuity between the farmers, who were EEF from Anatolian stock (modern Sardinians are their best proxies), and the hunter-gatherers. WHG is genetically homogeneous, so they couldn’t reject the proposition that there was no admixture of British hunter-gatherers into the farmer population. Basically, the thesis that Peter Bellwood outlined in First Farmers is well supported by these results. The farmers brought agriculture, and pushed aside or absorbed the hunter-gatherers.

It is notable to me that they found more hunter-gatherer ancestry (possibly) in eastern and northern populations, but not much in farmers from Wales. Additionally, though they couldn’t be definitive about it, the EEF settlers of Britain seem to have more affinities with the Western Mediterranean populations than the Central European ones. This suggests that perhaps the farmers arrived by sea or coast-hugging from the south and west, rather than from the south and east.

The arrival of farming to Britain was different

Farmers came to Britain later than to the continent. The shift from hunter-gatherer to farming was rapid. One model for why there was a lack of admixture is that the farming cultural package was fully adapted to Northern Europe by the time they began settling the island. In contrast, on the mainland farmers were changing a Middle Eastern lifestyle into something that could take root in cold northern climes where there were already local residents.

Sometimes cultural and ecological changes drive rapid expansions of human populations

Today Europe, and much of Western Eurasia, is characterized by isolation by distance dynamics between populations. What you see in the transition from the Mesolithic to the Neolithic, and later with the arrival of metal age populations (Bell Beakers), is that populations can turn over fast, and that rapid expansion and growth can result in homogeneity across huge distances and then sharp discontinuities across cultural divides. The classical example of this is that hunter-gatherers and farmers in Central Europe did not exchange much in the way of genes for centuries, and their between-population variance accounted for ~10% of their pooled variance (this is what you see comparing Han and Europeans). Additionally, WHG and EEF are both relatively homogeneous, at least before the latter began to absorb WHG at different fractions across its range. WHG descends from a late Pleistocene expansion, after the Last Glacial Maximum. Similarly, the EEF expanded rapidly from its Anatolian point of origin.
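
For readers who want to see what "between-population variance as a fraction of pooled variance" means in practice, here is a minimal sketch of Wright's F_ST computed from allele frequencies. It is only an illustration: the function names and the toy frequencies are made up for this example, the two populations are weighted equally, and it is not the estimator used in the preprint, which works from genotype data with sample-size corrections.

```python
def fst_single_locus(p1, p2):
    """Wright's F_ST at one biallelic locus for two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity if the two populations were pooled into one.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Average expected heterozygosity within each population.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # allele is fixed or absent everywhere: no variance to split
    return (h_total - h_within) / h_total


def fst_multi_locus(freq_pairs):
    """Ratio-of-sums F_ST across loci; freq_pairs is a list of (p1, p2) tuples."""
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in freq_pairs:
        p_bar = (p1 + p2) / 2.0
        h_total = 2.0 * p_bar * (1.0 - p_bar)
        h_within = p1 * (1.0 - p1) + p2 * (1.0 - p2)
        numerator += h_total - h_within
        denominator += h_total
    return numerator / denominator if denominator > 0.0 else 0.0


if __name__ == "__main__":
    # Hypothetical allele frequencies, chosen only to illustrate the calculation.
    toy_loci = [(0.30, 0.70), (0.50, 0.50), (0.10, 0.40)]
    print(fst_multi_locus(toy_loci))
```

A value near 0.10 from real genome-wide data would mean that about a tenth of the pooled variation sits between the two groups rather than within them, which is the sense in which the figure above is used.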

Britons didn’t become Britons genetically until the Bronze Age

Ten years ago many people thought that Cheddar Man and his people were the ancestors of most of the people who live in Britain today. At the same time as this preprint came out, the Bell Beaker paper was officially published. We now know that Britain went through two massive demographic transitions in less than 2,000 years, with on the order of a 90% replacement in a few centuries both times.

Why? Was this typical? Those are for a later post….
