Razib Khan One-stop-shopping for all of my content

June 24, 2017

Indian genetic history: before the storm

Filed under: Genetics,History,India — Razib Khan @ 2:52 pm

Over at Brown Pundits I’ve mentioned the continuing simmer of controversy over a recent piece, How genetics is settling the Aryan migration debate. This has prompted responses in the Indian media from a Hindu nationalist perspective. One of these notes that the author of the piece above cites me, and then goes on to observe I was fired from The New York Times a few years ago due to accusations of racism (also, there is the implication that I’m just a blogger and we should trust researchers with credibility like Gyaneshwer Chaubey; well, perhaps he should know that Gyaneshwer Chaubey considers me “unbiased” according to an email exchange which I had with him last week [we all have biases, so I think he’s wrong in a literal sense]).

I was a little surprised that a right-wing magazine would lend legitimacy to the slanders of social justice warriors, but this is the world we live in. To those who believe everything written about me in the media: I invite you to submit your name and background to me. I have contacts in the media and can get things written if I so choose. Watch me write something which is mostly fact, but can easily be misinterpreted by those who Google you, and watch how much you value the objective “truth-telling” power of the press.

There’s a reason so many of us detest vast swaths of the media, though to be fair we the public give people who don’t make much money a great deal of power to engage in propaganda. Should we be surprised they sensationalize and misrepresent with no guilt or shame? I have seen most of those who snipe at me in the comments disappear once I tell them that I know what their real identity is. Most humans are cowards. I have put some evidence into the public record to suggest that I’m not.

Perhaps more strange for me is that the above piece was passed around favorably by Sanjeev Sanyal, who I was on friendly terms with (we had dinner & drinks in Brooklyn a few years back). I asked him about the slander in the piece and he unfollowed me on Twitter (a friend of Hindu nationalist bent asked Sanjeev on Facebook about the article’s attack on me, but the comment was deleted). It shows how strongly people feel about these issues.

I’m in a weird position because I’m brown and have a deep interest in Indian history. But that interest in Indian history isn’t because I’m brown; I’m pretty interested in all the major zones of the Old World Oikoumene. Aside from some jocular R1a1a chauvinism I don’t have much investment personally (I just told said Hindu nationalist friend who turns out to be R2 to clean my latrine; joking of course, though I’m sure he resents that I’m descended on the direct paternal line from the All-Father & Lord of the Steppes and he is not!).

In the aughts I accepted the model outlined in 2006’s The Genetic Heritage of the Earliest Settlers Persists Both in Indian Tribal and Caste Populations. But to be frank it always struck me as a little confusing because the tentative autosomal data we had suggested that many South Asians were closer to West Eurasians than deep divergences dating to the Last Glacial Maximum would suggest. Since I’ve written something like 5 million words in 15 years, I actually can check if I’m remembering correctly. So here’s a post from 2008 where I express reservations about the idea of a long-term deep heritage of Indians separate from other West Eurasians. The reason I was so impressed by 2009’s Reconstructing Indian Population History is that it resolved the paradox of South Asian genetic relatedness.

To recap, Reich et al. proposed that modern Indians (South Asians) could be modeled as a two way mixture between two distinct populations with separate evolutionary genetic histories, Ancestral North Indians and Ancestral South Indians (ANI and ASI). How distinct? ANI were basically another West Eurasian population, while ASI was likely nested in the clade with Eastern Non-Africans. Additionally, there was a NW-to-SE and caste admixture cline. In other words, the higher you were on the caste ladder the more ANI you had, and the further to the north and west your ancestors were, the more ANI you had. The difference between Y and mtDNA, male and female, could be explained by sex-biased migration.
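To make the two way mixture idea concrete, here is a minimal sketch in Python. It is not the method used in the Reich et al. paper; it only assumes that the allele frequency in a mixed population is a weighted average of the two source frequencies, and recovers the weight by least squares. The function name and all the allele frequencies are hypothetical, chosen purely for illustration.

```python
def estimate_admixture_fraction(p_mix, p_a, p_b):
    """Least-squares estimate of alpha in p_mix ~= alpha * p_a + (1 - alpha) * p_b.

    Each argument is a list of allele frequencies, one entry per SNP.
    """
    numerator = sum((m - b) * (a - b) for m, a, b in zip(p_mix, p_a, p_b))
    denominator = sum((a - b) ** 2 for a, b in zip(p_a, p_b))
    if denominator == 0:
        raise ValueError("source populations have identical frequencies")
    return numerator / denominator


# Hypothetical allele frequencies at five SNPs (illustrative values only).
ani = [0.10, 0.80, 0.35, 0.60, 0.05]
asi = [0.70, 0.20, 0.55, 0.10, 0.45]

# Simulate a population with 40% ancestry from the first source.
true_alpha = 0.4
mixed = [true_alpha * a + (1 - true_alpha) * b for a, b in zip(ani, asi)]

alpha_hat = estimate_admixture_fraction(mixed, ani, asi)
print(f"estimated fraction from first source: {alpha_hat:.3f}")
```

A cline in this toy setting is just a set of populations whose estimated fraction changes steadily with geography or caste.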

But there were still aspects of the paper which I had reservations about. After all, it was a model.

  • Models are imperfect fits onto reality. The idea of mass migration seemed ridiculous to me at the time, because even by the time of the Classical Greeks it was noted that India was reputedly the most populous land in the world (to their knowledge). But ancient DNA has convinced me of the reality of mass migrations.
  • I wasn’t sure about the nature of the closest modern populations to the ANI. The researchers themselves (in particular, Nick Patterson) told me that the relatedness of ANI to Europeans was very close (on the order of intra-European differences). But modern Indians do not look to be descended from a population that is half Northern European physically. Again, ancient DNA has shown that there was lots of population turnover, and it turns out that Europeans and ANI were likely both compounds and mixed daughter populations of common ancestors (also, typical European physical appearance seems to have emerged in situ over the past 5,000 years).
  • The two way admixture model seemed too simple. I had run some data and it struck me that North Indian populations like Jats had something different than South Indian groups like Pulayars. In 2013 Priya Moorjani’s paper pretty much confirmed that it was more than a two way admixture along the ANI-ASI cline.

This March BMC Evolutionary Biology published Silva et al.’s A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals. It has made a huge splash in India, arguably triggering the write up in The Hindu. But for me it was a bit ho-hum. If you read my 2008 post it is pretty clear that I suspected the most general of the findings in this paper at least 10 years back. It is nice to get confirmation of what you suspect, but I’m more interested to be surprised by something novel.

Nevertheless A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals has come in for lots of repeated attack in the right-wing Indian press. This is unfair, because it is a rather good paper. I suspect that it wasn’t published in a higher ranked journal because most scientists don’t consider the history of India to be that important, and they didn’t really apply new methods, as opposed to bringing a bunch of data and methods together (in contrast, the 2009 Reich et al. paper was one of the first publications which showed how to utilize “ghost populations” in explicit phylogenetic models with relevance to human demographic history).

As it happens I will be writing up my thoughts in detail in an article for a major Indian publication (similar circulation numbers as The Hindu). This has been in talks for over six months, but I’ve been busy. But a month or so ago I thought it was time that I put something into print for the Indian audience, because I felt there was some misrepresentation going on (i.e., the claim that the Aryan invasion theory has been refuted by genetics, which is what many Indians assert).

For many years people have told me there are certain topics that shouldn’t be talked about. I have offended people greatly. There are many things people do not want to know. I have come to the conclusion this is not an entirely indefensible viewpoint (though if you accept this viewpoint, I think acceptance of authoritarianism is inevitable, so I hope people will toe the line when the new order arrives; knowing their personalities I think they will conform fine). But my nature is such that I continue to have nothing but contempt for the duplicitous and craven manner in which people go about these sorts of private conversations. I assume that as someone with the name “Razib Khan” I will be attacked vociferously by Hindu nationalists, who will no doubt make recourse to the Left-wing hit pieces against me to undermine my credibility. The fact that these groups are fellow travelers should tell us something, though I will leave that as an exercise for the reader.

I will write my piece that reflects the science as I believe it is, without much consideration of the attacks. That is rather easy for me to do in part because I live in the United States, where denigrating the deeply held views and self-esteem of Hindu nationalists is not sensitive or politically protected (unlike say, Muslims). And Hindu nationalists are less likely to kill me by orders of magnitude than Muslim radicals, and they have far less purchase in this nation than the latter (though you may be interested to know that very conservative Muslims follow me on Twitter; they’re actually more open-minded than many SJWs to be entirely honest).

Let me go over some general points that I see coming up over and over on the relationship between Indian (pre)history and genetics in the critiques.

One of the major critiques has to do with the nature of R1a-Z93 and its subclades. Basically this Y chromosomal haplogroup, the greatest that has ever been known, exhibits a strong signature of very rapid expansion over the past 4,000 years or so. It is divided from Z282. While Z93 is found in South Asia, Central Asia, and Siberia, Z282 is European, with its dominant subclade the one associated with Eastern Europeans. Both of these clades of R1a have gone through massive expansion. In the Altai region R1a is 40% of the heritage of peoples who are now predominantly East Eurasian today. But they are Z93. Additionally, ancient DNA from the Pontic Steppe dated ~4,000 years ago from Srubna remains is Z93, as are Scythian remains from the Iron Age.

Much of the argument comes down to dating, and citing papers that give deep coalescence numbers between different branches of R1a1a. Hindu nationalists and their fellow travelers point to recent papers which give dates >10,000 years ago, and so place the origin of Z93 plausibly in the Pleistocene. The problem is that Y chromosomal coalescence dating is something of a mug’s game. Often they use microsatellite data whose mutational rates are highly uncertain. In contrast, using SNP data, which has a slower mutation rate but requires a lot more data, you get a TMRCA (time to most recent common ancestor) between Z93 and Z282 of ~5,800 years ago. But coalescence estimates often have wide confidence intervals of thousands of years. And even with these intervals, the assumptions you make (e.g., mutation rate) strongly influence your midpoint estimate.
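A back-of-the-envelope sketch shows why the assumed mutation rate matters so much. It uses the simplest possible molecular clock: two lineages that split T years ago are expected to differ at about 2 × rate × sites × T positions, so T is the difference count divided by 2 × rate × sites. The counts, the number of callable sites, and the two rates below are all hypothetical placeholders rather than values from any paper, and real analyses use far more careful models.

```python
def tmrca_years(num_differences, callable_sites, rate_per_site_per_year):
    """Point estimate of time to most recent common ancestor for two lineages.

    Assumes a strict molecular clock: differences accumulate on both
    branches, hence the factor of 2 in the denominator.
    """
    if callable_sites <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("sites and mutation rate must be positive")
    return num_differences / (2 * rate_per_site_per_year * callable_sites)


# Hypothetical inputs: 90 SNP differences over 10 million callable sites.
differences = 90
sites = 10_000_000

# Two illustrative mutation-rate assumptions (per site per year).
for rate in (0.75e-9, 1.0e-9):
    estimate = tmrca_years(differences, sites, rate)
    print(f"rate={rate:.2e} -> TMRCA of roughly {estimate:,.0f} years")
```

The same data give dates that differ by a third depending on which rate you plug in, before you even consider the statistical noise in the difference count.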

The Y chromosomal data is powerful, but its interpretation is still buttressed upon other assumptions. The really big picture framework is the nature of ancient genome-wide variation across Eurasia. Lazaridis et al. 2016 condition us to a prior where much of Eurasia was subject to massive population-wide genetic changes since the Holocene. Therefore, I am much less surprised if there was massive genetic change in India relatively recently. The methods in Priya Moorjani’s paper and in other publications make it obvious that mixture was extensive in South Asia between very distinct groups until ~2,000 years ago. In fact, Moorjani et al. used patterns of variation across the genome to arrive at a window of two to four thousand years ago as the period of massive admixture.
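The general logic behind dating admixture from patterns across the genome can be sketched as follows. After two populations mix, the correlation between markers inherited from the same source decays with recombination, roughly as exp(-n × d), where n is generations since mixture and d is genetic distance in Morgans. The sketch below fits that decay on synthetic, noise-free data; it is a simplification under my own assumptions (single pulse of admixture, no background offset, 29 years per generation), not a reimplementation of the published method.

```python
import math


def fit_generations(distances_morgans, ld_values):
    """Estimate n in ld = A * exp(-n * d) by least squares on log(ld)."""
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope  # the decay rate is the negative slope


YEARS_PER_GENERATION = 29  # assumed generation time

# Synthetic admixture LD curve: amplitude 0.02, 100 generations since mixture.
distances = [0.005 * i for i in range(1, 21)]  # 0.5 to 10 centimorgans
true_generations = 100
ld = [0.02 * math.exp(-true_generations * d) for d in distances]

generations = fit_generations(distances, ld)
years = generations * YEARS_PER_GENERATION
print(f"{generations:.1f} generations, about {years:,.0f} years")
```

With real data the curve is noisy and multiple waves of mixture blur together, which is one reason the published dates come as ranges rather than single numbers.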

Though we don’t have relevant ancient DNA from India proper to answer any questions yet, we do have ancient DNA from across much of Europe, Central Asia, and the Near East. What they show is that Indian populations share ancestry from both Neolithic Iranians and peoples of the Pontic steppe, who flourished ~5 to ~10,000 years ago. To some extent the latter population is a daughter population of the former…which makes things complicated. Conversely, no West Eurasian population seems to harbor ancient signals of ASI ancestry.

One scientist who holds to the position that most South Asian ancestry dates to the Pleistocene argued to me that we don’t know if ancient Indian samples from the northwest won’t share even more ancestry than the Iranian Neolithic and Pontic steppe samples. In other words, ANI was part of some genetic continuum that extended to the west and north. This is possible, but I do not find it plausible.

The reasons are threefold. First, it doesn’t seem that continuous isolation-by-distance works across huge and rugged regions of Central Eurasia. Rather, there are demographic revolutions, and then relative stasis as the new social-cultural environment crystallizes. This inference I’m making from ancient DNA and extrapolating. This may be wrong, but I would bet I’m not off base here.

Second, it strikes me as implausible that there was literally apartheid between ASI and ANI populations for the whole Holocene right up until ~4,000 years before the present. That is, if Northwest India was involved in reciprocal gene flow with the rest of Eurasia over thousands of years I expect there should have been some distinctive South Asian ASI-like ancestry in the ancient DNA we have. We do not see it.

Third, one of the populations with strong affinities to some Indian populations are those of the Pontic steppe. But we know that this group itself is a compound of admixture that arose 5,000-6,000 years ago. Because of the complexity of the likely population model of ANI this is not definitive, but it seems strange to imagine that ANI could have predated one of the populations with which it was in genetic continuum as part of a quasi-panmictic deme.

Finally, many of the critiques involve evaluation of the scientific literature in this field. Unfortunately this is hard to do from the outside. Citing papers from the aughts, for example, is not wrong, but evolutionary human population genomics is such a fast moving field that even papers published a few years ago are often out of date.

Many are citing a 2012 paper by a respected group which argues for the dominant model of the aughts (marginal population movement into South Asia). One of their arguments, that Central Asian migrants should have East Asian ancestry, is a red herring since it is well known that this dates to the last ~2,000 years or so (we know more now with ancient DNA). But the second point that is more persuasive in the paper is that when they look at local ancestry of ANI vs. ASI in modern Indians, the ANI haplotypes are more diverse than West Eurasians, indicating that they are not descendants but rather antecedents (usually the direction of ancestry is from more diverse to less due to subsampling).

There are two points that I have to make here. First, local ancestry analysis is difficult, so I would not be surprised if they integrated ASI regions into ANI and so elevated the diversity in that way (though they think they’ve taken care of it in the paper). Second, if the ANI are a compound of several West Eurasian groups then we expect them to be more diverse than their parents. In other words, the paper is refuting a model which is almost certainly incorrect, but the alternative hypothesis is not necessarily the one they are supporting within the paper.
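The second point is easy to check with a toy calculation. Expected heterozygosity at a biallelic site is 2p(1 - p), and a population formed by mixing two sources has allele frequency equal to the weighted average of the source frequencies. When the sources differ, the mixture ends up more diverse than either parent. The frequencies and mixing fraction below are hypothetical, and the sketch uses single-site heterozygosity rather than the haplotype diversity measured in the paper.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2 * p * (1 - p)


def mean_heterozygosity(freqs):
    """Average expected heterozygosity across sites."""
    return sum(heterozygosity(p) for p in freqs) / len(freqs)


def admix(freqs_a, freqs_b, alpha):
    """Allele frequencies of a population with fraction alpha from source A."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(freqs_a, freqs_b)]


# Hypothetical, strongly differentiated parental populations.
parent_a = [0.1, 0.9, 0.2, 0.8]
parent_b = [0.9, 0.1, 0.8, 0.2]
mixture = admix(parent_a, parent_b, 0.5)

for name, freqs in (("parent A", parent_a), ("parent B", parent_b), ("mixture", mixture)):
    print(f"{name}: mean heterozygosity {mean_heterozygosity(freqs):.3f}")
```

So higher diversity in a population does not by itself tell you that it is the source rather than a compound of several sources.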

But there are many things we do not know still. Many free variables which we haven’t nailed down. Here are some major points:

  • Y chromosomal lineages have a correlation with ethno-linguistic groups, but the correlation is imperfect. R1b and R1a seem correlated with Indo-European groups, but both these are found in high proportions in groups which are putatively most “pre-Indo-European” in origin (e.g., Basques, Sardinians, and South Indian tribals and non-Brahmin Dravidian speaking groups). Also, haplogroups like I1 in Europe expanded with Indo-Europeans locally, suggesting there was lots of heterogeneity in Indo-Europeans as they expanded. In other words, Indo-European expansion in relation to powerful paternal lineages did not always correlate with ethno-linguistic change.
  • There are probably at minimum two Holocene intrusions from the northwest into South Asia, but this is a floor. The models that are constructed always lack power to detect more complexity. E.g., it is not impossible that there were several migrations of Indo-Europeans into South Asia which we can not distinguish genetically over a period of a few thousand years.
  • If one looks over all of South Asia it may be that ASI ancestry in totality is >50% of the total genome ancestry. I don’t have a good guess of the numbers. If this is correct, perhaps most South Asian ancestors 10,000 years ago were living in South Asia (though the fertility rates are such in Pakistan that ANI ancestry is increasing right now in relative terms).
  • But, this presupposes that ASI were present in South Asia in totality 10,000 years ago, rather than being migrants themselves. If ancient DNA confirms that ANI were long present in Northwest India, I hold then it is entirely likely that ASI was intrusive to South Asia! The BMC Evolutionary Biology Paper does a lot of interpretation of deep structure in haplogroup M in South Asia. I’m moderately skeptical of this. Europe may not be a good model for South Asia, but there we see lots of Pleistocene turnover.

So where does this leave us? Ancient DNA will answer a lot of questions. Pretty much all scientists I’ve talked to agree on this. My predictions, some of which I’ve made before:

  1. The first period of admixture is old, and dates to the founding of Mehrgarh as an agricultural settlement. The dominant ANI component dates to this period and mixture event, all across South Asia. The presence in South India is due to expansion of these farming populations.
  2. A second admixture event occurred with the arrival of steppe people. Those who argue for the Aryan invasion model posit 1500 BCE as the date. But these people probably were expanding in some form before this date.
  3. We still don’t know who the antecedents for the Indo-Aryans were. Probably they were a compound of different steppe groups, and also other populations which were mixed in (by analogy, in Europe it is obvious now that there was some mixture with the local European farmers and hunter-gatherers as Indo-Europeans expanded their frontier westward; the same probably applies for Indo-Aryans and the BMAC).

June 19, 2017

Indian genetics, the never-ending argument

Filed under: Genetics,India,Indian Genetics,Indo-Europeans,science — Razib Khan @ 10:44 pm

I am at this point somewhat fatigued by Indian population genetics. The real results are going to be ancient DNA, and I’m waiting on that. But people keep asking me about an article in Swarajya, Genetics Might Be Settling The Aryan Migration Debate, But Not How Left-Liberals Believe.

First, the article attacks me as being racist. This is not true. The reality is that the people who attack me on the Left would probably attack magazines like Swarajya as highly “problematic” and “Islamophobic.” They would label Hindu nationalism as a Nazi derivative ideology. People should be careful about the sort of allies they make; if you dance with snakes they will bite you in the end. Much of the media lies about me, and the Left constantly attacks me. I’m OK with that because I do believe that the day will come when all the ledgers will be balanced. The Far Left is an enemy of civilization of all stripes. I welcome being labeled an enemy of barbarians. My small readership, which is of diverse ideologies and professions, is aware of who I am and what I am, and that is sufficient. Either truth or power will be the ultimate arbiter of justice.

With that out of the way, there is one thing about the piece that I think is important to highlight:

To my surprise, it turned out that that Joseph had contacted Chaubey and sought his opinion for his article. Chaubey further told me he was shocked by the drift of the article that appeared eventually, and was extremely disappointed at the spin Joseph had placed on his work, and that his opinions seemed to have been selectively omitted by Joseph – a fact he let Joseph know immediately after the article was published, but to no avail.

Indeed, this itself would suggest there are very eminent geneticists who do not regard it as settled that the R1a may have entered the subcontinent from outside. Chaubey himself is one such, and is not very pleased that Joseph has not accurately presented the divergent views of scholars on the question, choosing, instead to present it as done and dusted.

I do wish Tony Joseph had quoted Gyaneshwer Chaubey’s response, and I’d like to know his opinions. Science benefits from skepticism. Unfortunately though the equivocation of science is not optimal for journalism, so oftentimes things are presented in a more stark and clear manner than perhaps is warranted. I’ve been in this position myself, when journalists are just looking for a quote that aligns with their own views. It’s frustrating.

There are many aspects of the Swarajya piece I could point out as somewhat weak. For example:

The genetic data at present resolution shows that the R1a branch present in India is a cousin clade of branches present in Europe, Central Asia, Middle East and the Caucasus; it had a common ancestry with these regions which is more than 6000 years old, but to argue that the Indian R1a branch has resulted from a migration from Central Asia, it should be derived from the Central Asian branch, which is not the case, as Chaubey pointed out.

The Srubna culture, the Scythians, and the people of the Altai today, all bear the “Indian” branch of R1a. First, these substantially post-date 6000 years ago. I think that that is likely due to the fact that South Asian R1a1a-Z93 and that of the Srubna descend from a common ancestor. But in any case, the nature of the phylogeny of Z93 indicates rapid expansion and very little phylogenetic distance between the branches. Something happened 4-5,000 years ago. One could imagine simultaneous expansions in India and Central Asia/Eastern Europe. Or, one could imagine an expansion from a common ancestor around that time. The latter seems more parsimonious.

Additionally, while South Asians share ancestry with people in West Asia and Eastern Europe, these groups do not have distinctive South Asian (Ancestral South Indian) ancestry. This should weight our probabilities as to the direction of migration.

Second, I read some of the papers linked to in the article, such as Shared and Unique Components of Human Population Structure and Genome-Wide Signals of Positive Selection in South Asia and Y-chromosomal sequences of diverse Indian populations and the ancestry of the Andamanese. The first paper has good data, but I’ve always been confused by the interpretations. For example:

A few studies on mtDNA and Y-chromosome variation have interpreted their results in favor of the hypothesis [70–72], whereas others have found no genetic evidence to support it [3,6,73,74]. However, any nonmarginal migration from Central Asia to South Asia should have also introduced readily apparent signals of East Asian ancestry into India (see Figure 2B). Because this ancestry component is absent from the region, we have to conclude that if such a dispersal event nevertheless took place, it occurred before the East Asian ancestry component reached Central Asia. The demographic history of Central Asia is, however, complex, and although it has been shown that demic diffusion coupled with influx of Turkic speakers during historical times has shaped the genetic makeup of Uzbeks [75] (see also the double share of k7 yellow component in Uzbeks as compared to Turkmens and Tajiks in Figure 2B), it is not clear what was the extent of East Asian ancestry in Central Asian populations prior to these events.

Actually the historical and ancient DNA evidence both point to the fact that East Asian ancestry arrived in the last two thousand years. The spread of the first Gokturk Empire, and then the documented shift in the centuries around 1000 A.D. from Iranian to Turkic in what was Turan, signals the shift toward an East Asian genetic influx. Alexander the Great and other Greeks ventured into Central Asia. The people were described as Iranian looking (when Europeans encountered Turkic people like Khazars they did note their distinctive physical appearance).

We have ancient DNA from the Altai, and those individuals initially seemed overwhelmingly West Eurasian. Now that we have Scythian ancient DNA we see that they mixed with East Asians only on the far east of their range.

The second paper is very confused (or confusing):

The time divergence between Indian and European Y-chromosomes, based on the closest neighbour analysis, shows two different distinctive divergence times for J2 and R1a, suggesting that the European ancestry in India is much older (>10 kya) than what would be expected from a recent migration of Indo-European populations into India (~4 to 5 kya). Also the proportions suggest the effect might be less strong than generally assumed for the Indo-European migration. Interestingly, the ANI ancestry was recently suggested to be a mix of ancestries from early farmers of western Iran and people of the Bronze Age Eurasian steppe (Lazaridis et al. 2016). Our results agree with this suggestion. In addition, we also show that the divergence time of this ancestry is different, suggesting a different time to enter India.

Lazaridis et al. accept a mass migration from the steppe. In fact, the migration is of such a magnitude that I’m even skeptical. Also, there couldn’t have been a European migration to South Asia during the Pleistocene because Europeans as we understand them genetically did not exist then!!!

I assume that many of the dates of coalescence are sensitive to parameter conditions. Additionally, they admit limitations to their sampling.

Ultimately the final story will be more complex than we can imagine. R1a is too widespread to be explained by a simple Indo-Aryan migration in my opinion. But we can’t get to these genuine conundrums if we keep having to rebut ideologically motivated salvos.

Related: Ancient herders from the Pontic-Caspian steppe crashed into India: no ifs or buts. I wish David would be a touch more equivocal. But I have to admit, if the model fits, at some point you have to quit.

June 17, 2017

Indian media is finally reporting on the Aryan migration into South Asia

Filed under: Genetics,science — Razib Khan @ 2:49 pm

For various ideological reasons in India there has been a strong resistance to the idea that Aryans came from outside of South Asia. When David Reich’s Reconstructing Indian Population History was published in 2009 the Indian media had a weird response. For example, Aryan-Dravidian divide a myth: Study.

Though Reich’s paper was equivocal, it was clear to me that it was likely going to be the launching point for a resurrection of the Aryan migration theory. Now Tony Joseph in The Hindu has published a pretty good survey of the literature, How genetics is settling the Aryan migration debate. Nothing new for readers of this weblog, but he has some good quotes:

The avalanche of new data has been so overwhelming that many scientists who were either sceptical or neutral about significant Bronze Age migrations into India have changed their opinions. Dr. Underhill himself is one of them. In a 2010 paper, for example, he had written that there was evidence “against substantial patrilineal gene flow from East Europe to Asia, including to India” in the last five or six millennia. Today, Dr. Underhill says there is no comparison between the kind of data available in 2010 and now. “Then, it was like looking into a darkened room from the outside through a keyhole with a little torch in hand; you could see some corners but not all, and not the whole picture. With whole genome sequencing, we can now see nearly the entire room, in clearer light.”

In relation to online debates I have had Indian interlocutors tell me flat out that they believe in the papers published between 2005 and 2010. It is nice to have the scientists who actually published this work now admit that new results overturn the older theories.

Note: I am going to refer to this as a migration, because “invasion” seems to connote too much specificity as to how it happened. But I have a difficult time imagining that it was a peaceful process.

June 14, 2017

The fad for dietary adaptations is not going away

Filed under: Diet,FADS,Genetics,Human Genetics — Razib Khan @ 7:21 pm


Food is a big deal for humans. Without it we die. Unlike some animals (here’s looking at you pandas) we’re omnivorous. We eat fruit, nuts, greens, meat, fish, and even fungus. Some of us even eat things which give off signals of being dangerous or unpalatable, whether it be hot sauce or lutefisk.

This ability to eat a wide variety of items is a human talent. Those who have put their cats on vegetarian diets know this. After a million or so years of being hunters and gatherers with a presumably varied diet, for thousands and thousands of years most humans at any given time ate some form of grain-based gruel. Though I am sympathetic to the argument that in terms of quality of life this was a detriment to median human well being, agriculture allowed our species to extract orders of magnitude more calories from a unit of land, though there were exceptions, such as in marine environments (more on this later).

Ergo, some scholars, most prominently Peter Bellwood, have argued that farming did not spread through cultural diffusion. Rather, farmers simply reproduced at much higher rates because of the efficiency of their lifestyle in comparison to that of hunter-gatherers. The latest research, using ancient DNA, broadly confirms this hypothesis. More precisely, it seems that cultural revolutions in the Holocene have shaped most of the genetic variation we see around us.

But genetic variation is not just a matter of genealogy. That is, the pattern of relationships, ancestor to descendant, and the extent of admixtures across lineages. Selection is also another parameter in evolutionary genetics. This can even have genome-wide impacts. It seems quite possible that current levels of Neanderthal ancestry are lower than might otherwise have been the case due to selection against functional variants derived from Neanderthals, which are less fit against a modern human genetic background.

The importance of selection has long been known and explored. Sickle-cell anemia only exists because of balancing selection. Ancient DNA has revealed that many of the salient traits we associate with a given population, e.g., lactose tolerance or blue eyes, have undergone massive changes in population wide frequency over the last 10,000 years. Some of this is due to population replacement or admixture. But some of it is due to selection after the demographic events. To give a concrete example, the frequency of variants associated with blue eyes in modern Europeans dropped rapidly with the expansion of farmers from the Near East ~10,000 years ago, but has gradually increased over time until it is the modal allele in much of Northern Europe. Lactase persistence in contrast is not an ancient characteristic which has had its ups and downs, but something new that evolved due to the cultural shock of the adoption of dairy consumption by humans as adults. The region around lactase is one of the strongest signals of natural selection in the European genome, and ancient DNA confirms that the ubiquity of the lactase persistent allele is a very recent phenomenon.

But obviously lactase is not going to be the only target of selection in the human genome. Not only can humans eat many different things, but we change our portfolio of proportions rather quickly. In A Farewell to Alms the economic historian Gregory Clark observed that English peasants ate very differently before and after the Black Death. As any ecologist knows populations are resource constrained when they are near the carrying capacity, and in England during the High Medieval period there was massive population growth due to gains in productivity (e.g., the moldboard plough) as well as intensification of farming and utilization of all the marginal land.

After the Black Death (which came in waves repeatedly) there was a massive population decline across much of Europe. Because institutions and practices were optimized toward maintaining a much higher population, European peasants lived a much better lifestyle after the population crash because the pie was being cut into far fewer pieces. In other words, centuries of life on the margins just scraping by did not mean that English peasants couldn’t live large when the times allowed for it. We were somewhat pre-adapted.

Our ability to eat a variety of items, and the constant varying of the proportions and kind of elements which go into our diet, mean that sciences like nutrition are very difficult. And, it also means that attempts to construct simple stories of adaptation and functional patterns from regions of the genome implicated in diet often fail. But with better analytic technologies (whole genome sequencing, large sample sizes) and some elbow grease some scientists are starting to get a better understanding.

A group of researchers at Cornell has been taking a closer look at the FADS genes over the past few years (as well as others at CTEG). These are three nearby genes, FADS1, FADS2, and FADS3 (they probably underwent duplication). These genes are involved in the metabolization of fatty acids, and dietary regime turns out to have a major impact on variation around these loci.

The most recent paper out of the Cornell group, Dietary adaptation of FADS genes in Europe varied across time and geography:

Fatty acid desaturase (FADS) genes encode rate-limiting enzymes for the biosynthesis of omega-6 and omega-3 long-chain polyunsaturated fatty acids (LCPUFAs). This biosynthesis is essential for individuals subsisting on LCPUFA-poor diets (for example, plant-based). Positive selection on FADS genes has been reported in multiple populations, but its cause and pattern in Europeans remain unknown. Here we demonstrate, using ancient and modern DNA, that positive selection acted on the same FADS variants both before and after the advent of farming in Europe, but on opposite (that is, alternative) alleles. Recent selection in farmers also varied geographically, with the strongest signal in southern Europe. These varying selection patterns concur with anthropological evidence of varying diets, and with the association of farming-adaptive alleles with higher FADS1 expression and thus enhanced LCPUFA biosynthesis. Genome-wide association studies reveal that farming-adaptive alleles not only increase LCPUFAs, but also affect other lipid levels and protect against several inflammatory diseases.

The paper itself can be difficult to follow because they’re juggling many things in the air. First, they’re not just looking at variants (e.g., SNPs, indels, etc.), but also the haplotypes that the variants are embedded in. That is, the sequence of markers which define an association of variants which indicate descent from common genealogical ancestors. Because recombination can break apart associations one has to engage with care in historical reconstruction of the arc of selection due to a causal variant embedded in different haplotypes.

But the great thing about this paper is that in the case of Europe they can access ancient DNA. So they perform inferences utilizing whole genomes from many extant human populations, but also inspect change in allele frequency trajectories over time because of the density of the temporal transect. The figure to the left shows variants in both an empirical and modeling framework, and how they change in frequency over time.

In short, variants associated with higher LCPUFA synthesis actually decreased over time in Pleistocene Europe. This is similar to the dynamic you see in the Greenland Inuit. With the arrival of farmers the dynamic changes. Some of this is due to admixture/replacement, but some of it cannot be accounted for by admixture and replacement. In other words, there was selection for the variants which synthesize more LCPUFA.

This is not just limited to Europe. The authors refer to other publications which show that the frequency of alleles associated with LCPUFA production is high in places like South Asia, notable for a cultural preference for plant-based diets, reinforced by the reality that animal protein was in very short supply. In Europe they can look at ancient DNA because we have it, but the lesson here is probably general: alternative allelic variants are being whipsawed in frequency by protean shifts in human cultural modes of production.

In War Before Civilization Lawrence Keeley observed that after the arrival of agriculture in Northern Europe, the advance of farming halted rather abruptly for centuries along a broad zone in the northwest of the continent, facing the Atlantic and North Sea. Keeley then recounts evidence of organized conflict between the two populations across a “no man’s land.”

But why didn’t the farmers just roll over the old populations as they had elsewhere? Probably because they couldn’t. It is well known that marine regions can often support very high densities of humans engaged in a gathering lifestyle. Though not farmers, these peoples are often also not nomadic, and occupy areas at high density. The tribes of the Pacific Northwest, dependent upon salmon fisheries, are classic examples. Even today much of the Northern European maritime fringe relies on the sea. High density means they had enough numbers to resist the human wave of advance of farmers. At least for a time.

Just as cultural forms wane and wax, so do some of the underlying genetic variants. If you dig into the guts of this paper you see much of the variation dates to the out of Africa period. There were no great sweeps which expunged all variation (at least in general). Rather, just as our omnivorous tastes are protean and changeable, so the genetic variation changes over time and space in a difficult to reduce manner. The flux of lifestyle change is probably usually faster than biological evolution can respond, so variation reducing optimization can never complete its work.

The modern age of the study of natural selection in the human genome began around when A Map of Recent Positive Selection In the Human Genome was published. And it continues with methods like SDS, which indicate that selection operates to this day. Not a great surprise, but solidifying our intuitions. In the supplements to the above paper the authors indicate that the focal alleles that they are interrogating exhibit coefficients of selection around ~0.5% or so. This is rather appreciable. The fact that fixation has not occurred indicates in part that selection has reversed or halted, as they noted. But another aspect is that there are correlated responses; the FADS genes are implicated in many things, as the authors note in relation to inflammatory diseases. But I’m not sure that the selection effects of these are really large in any case. I bet there are more important things going on that we haven’t discovered or understood.
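To get a feel for what a selection coefficient of ~0.5% means, here is a minimal sketch. It assumes a simple deterministic model of genic selection in an effectively infinite population, which is my simplification and not the model used in the paper. The function name and the 10%-to-90% frequency range are made up for illustration.

```python
import math

def generations_to_reach(p_start, p_target, s):
    """Count generations for a favored allele to climb from p_start to p_target.

    Assumes deterministic genic selection: each generation the allele is
    reweighted by relative fitness (1 + s) versus 1 for the alternative.
    Drift, dominance, and changing selection pressure are all ignored.
    """
    p = p_start
    generations = 0
    while p < p_target:
        # Mean fitness is p * (1 + s) + (1 - p) = 1 + s * p.
        p = p * (1 + s) / (1 + s * p)
        generations += 1
    return generations

s = 0.005  # ~0.5% per generation, the rough figure quoted above
n = generations_to_reach(0.10, 0.90, s)

# Closed form for the same model: the odds p / (1 - p) grow by (1 + s) each generation.
closed_form = math.log((0.9 / 0.1) / (0.1 / 0.9)) / math.log(1 + s)

print(n, round(closed_form, 1))
```

Under these assumptions the climb takes somewhat under nine hundred generations, which at 25 to 30 years per generation is a few tens of thousands of years at most. That is fast on the timescales in play here, which is why the lack of fixation is informative.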

Obviously genome-wide analyses are going to continue for the foreseeable future. Ten years ago my late friend Mike McKweon predicted that at some point genomics was going to have to be complemented by detailed follow up through bench-work. I’m not sure if we’re there yet, but there are only so many populations you can sequence, and only to a particular coverage to obtain any more information. Some selection sweeps will be simple stories with simple insights. But I suspect many more like FADS will be more complex, with the threads of the broader explanatory tapestry assembled publication by publication over time.

Citation: Ye, K., Gao, F., Wang, D., Bar-Yosef, O. & Keinan, A. Dietary adaptation of FADS genes in Europe varied across time and geography. Nat. Ecol. Evol. 1, 0167 (2017).

June 6, 2017

Origin of modern humanity pushed back 260,000 years BP (?)

Filed under: Ancient DNA,Genetics,Khosian,South Africa — Razib Khan @ 12:45 am


The above figure is from a preprint, Ancient genomes from southern Africa pushes modern human divergence beyond 260,000 years ago. The title and abstract are pretty clear:

Southern Africa is consistently placed as one of the potential regions for the evolution of Homo sapiens. To examine the region’s human prehistory prior to the arrival of migrants from East and West Africa or Eurasia in the last 1,700 years, we generated and analyzed genome sequence data from seven ancient individuals from KwaZulu-Natal, South Africa. Three Stone Age hunter-gatherers date to ~2,000 years ago, and we show that they were related to current-day southern San groups such as the Karretjie People. Four Iron Age farmers (300-500 years old) have genetic signatures similar to present day Bantu-speakers. The genome sequence (13x coverage) of a juvenile boy from Ballito Bay, who lived ~2,000 years ago, demonstrates that southern African Stone Age hunter-gatherers were not impacted by recent admixture; however, we estimate that all modern-day Khoekhoe and San groups have been influenced by 9-22% genetic admixture from East African/Eurasian pastoralist groups arriving >1,000 years ago, including the Ju|’hoansi San, previously thought to have very low levels of admixture. Using traditional and new approaches, we estimate the population divergence time between the Ballito Bay boy and other groups to beyond 260,000 years ago. These estimates dramatically increases the deepest divergence amongst modern humans, coincide with the onset of the Middle Stone Age in sub-Saharan Africa, and coincide with anatomical developments of archaic humans into modern humans as represented in the local fossil record. Cumulatively, cross-disciplinary records increasingly point to southern Africa as a potential (not necessarily exclusive) ‘hot spot’ for the evolution of our species.

These results in the outlines were actually presented at a conference. I saw it on Twitter and don’t remember which conference anymore. But this is not entirely surprising.

First, much respect to Mattias Jakobsson’s group for breaking through the Reich-Willerslev duopoly. Hopefully this presages some democratization of the ancient DNA field as expenses are going down.

Second, notice how in most cases ancient DNA shows that modern reference populations turn out to be admixed. This was the problem with much of Eurasia, and why using modern genetic variation to make inferences about the past totally failed.

I am entirely convinced that the genome from Ballito Bay dating to ~2,000 years does not carry the Eurasian inflected East African admixture. The Mota genome implies that Eurasian admixture did not come to eastern Africa much before 4,500 years ago. There needs to be a much deeper big picture analysis of the archaeology of Africa and the genetic information we have to get a sense of what happened back then…but, it seems likely that the Bantu migration has over-written much of the earlier genetic variation.

The fact that ancient genomes always show that our current populations are admixed makes me wonder if the Ballito Bay sample itself is admixed from more ancient populations. That is, if we found a genome from 20,000 years ago, would it be very different from the Ballito Bay samples? The relatively thick time transect from Europe indicates that turnover happens every 10,000 years or so. Australian Aborigines seem to have been resident in their current locations for ~50,000 years, but this seems the exception, not the rule. Do we really think that the ancestors of the Bushmen were living in southern Africa for five times as long as Australian Aborigines?

Another curious aspect of this paper is that it suggests the effective population size of Bushmen is smaller than we might have thought, and they’re somewhat less diverse than we’d thought. That’s because East African (with Eurasian ancestry) gene flow increased heterozygosity, as well as inferred effective population sizes. I’ve mentioned this effect on statistics before. Unless you have a true model of population history (or close to it) your assumptions might distort the numbers you get.

There is another aspect to this preprint mentioned glancingly in the text, and a bit more in the supplements: they seem to only be able to model Yoruba well if you assume that they themselves are a mix of “Basal Humans” (BH) and another African population which gave rise to East Africans and “Out of Africa” populations. Note that the BH seem to diverge from other human populations before the ancestors of Southern Africans like the Ballito Bay sample. That is, BH could push the diversification of the ancestors of modern humans considerably before 260,000 years before the present.

The possibility of deep structure in the Yoruba is pretty notable because they’ve been the gold standard in many human population genetic data sets as a reference population. But this result of deep structure is not entirely surprising. For years researchers have been hinting at confusing results in relation to the possibility of Eurasian back-migration. Perhaps the deep structure was confounding inferences?

The authors themselves are quite cautious about their dating of the divergence. It’s sensitive to many assumptions, and in particular the mutation rate being known and constant over time. But I think it’s hard to deny that this is pushing back the emergence of modern humans beyond what we know today. The earliest anatomically modern humans are found in Ethiopia 195,000 years ago from what I know. As I said, I’m convinced that the ancient genome has shown that modern “pristine” populations have some serious admixture. But I’m not as convinced about any specific point estimate, because that’s sensitive to a lot of assumptions which might not hold.
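To see why the mutation rate matters so much, here is a toy molecular-clock calculation. This is not the method used in the preprint. It assumes a strict clock and ignores the gap between sequence divergence and population divergence. The per-site divergence value, the generation time, and the two mutation rates are illustrative assumptions of mine, picked only so that the rates differ by a factor of two.

```python
def divergence_time_years(per_site_divergence, mu_per_generation, generation_years):
    """Toy strict molecular clock.

    Two lineages both accumulate mutations after they split, so the expected
    per-site divergence is 2 * mu * generations. Solve that for generations.
    """
    generations = per_site_divergence / (2 * mu_per_generation)
    return generations * generation_years

d = 0.0005             # hypothetical per-site divergence, for illustration only
generation_years = 30  # assumed generation time

slow = divergence_time_years(d, 1.25e-8, generation_years)  # lower assumed rate
fast = divergence_time_years(d, 2.5e-8, generation_years)   # higher assumed rate

print(round(slow), round(fast))
```

The same data give a date twice as old if you halve the assumed rate. That is the sense in which I trust the qualitative result more than any specific point estimate.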

Finally, a quick shout out to the blogger Dienekes. As early as ten years ago he anticipated the basic outlines of these sorts of results in the generality, if not the details. We really have come a long way from popular science declaring that all humans descend from a small group of East Africans who lived 50,000 to 100,000 years ago. The real picture was much more complex.

Also, I have to admit I considered titling this blog post “Wolpoff’s revenge.” As in Milford Wolpoff. The reason being that we’re getting quite close to territory familiar to the much maligned multi-regionalist model of modern human origins.

Note: These findings should make us less surprised perhaps by a “modern” human migration before the primary one out of Africa.

June 2, 2017

The nadir of genetics in the Soviet Union

Filed under: Genetics,History — Razib Khan @ 8:05 pm

A fascinating excerpt in Slate from How to Tame a Fox (and Build a Dog):

This skepticism of genetics all started when, in the mid-1920s, the Communist Party leadership elevated a number of uneducated men from the proletariat into positions of authority in the scientific community, as part of a program to glorify the average citizen after centuries of monarchy had perpetuated wide class divisions between the wealthy and the workers and peasants. Lysenko fit the bill perfectly, having been raised by peasant farmer parents in the Ukraine. He hadn’t learned to read until he was 13, and he had no university degree, having studied at what amounted to a gardening school, which awarded him a correspondence degree. The only training he had in crop-breeding was a brief course in cultivating sugar beets. In 1925, he landed a middle-level job at the Gandzha Plant Breeding Laboratory in Azerbaijan, where he worked on sowing peas. Lysenko convinced a Pravda reporter who was writing a puff piece about the wonders of peasant scientists that the yield from his pea crop was far above average and that his technique could help feed his starving country. In the glowing article the reporter claimed, “the barefoot professor Lysenko has followers … and the luminaries of agronomy visit … and gratefully shake his hand.” The article was pure fiction. But it propelled Lysenko to national attention, including that of Josef Stalin.

Sometimes it is easy to believe that the periods in the Soviet Union under Stalin or in China under Mao or in Germany under Hitler, to name a few, were aberrations. But I think that’s the wrong way to look at it. The story of how Lysenko became influential hooks into so many historical tropes and psychological instincts of our species that we should be wary of it.

There have been great scholars without requisite qualifications. Ramanujan and Faraday come to mind. But great scholars are exceptional people. They are not average.

May 30, 2017

Ancient Egyptians: black or white?

Filed under: Egypt,Genetics,Historical Genetics,History — Razib Khan @ 9:20 pm

One of the most fascinating things about ancient Egypt is its continuity, and our granular and detailed knowledge of that continuity. We can thank in part the dry climate, as well as the Egyptian penchant for putting their hieroglyphs on walls and monuments (as well as graffiti!). And we can also thank the fact that both the ancient Greeks and Hebrews, Athens and Jerusalem so to speak, were deeply connected to and perceived themselves to be indebted to Egyptian civilization. Even before the translation of the Rosetta Stone and the deciphering of ancient Egyptian writing the Hebrews’ interactions with Egyptians, in particular in Exodus, meant that their memory would echo down through the millennia (the newly Christianized Irish interpolated Egyptian ancestry into their own genealogy).

The Greek relationship with Egypt was less fraught and at greater remove than that of the Hebrews. But the Classical period philosophers correctly perceived that Egyptian civilization was ancient, and preceded their own. Aegean-Egyptian connections were actually more longstanding than the Classical scholars knew. In Brotherhood of Kings: How International Relations Shaped the Ancient Near East, the correspondence retrieved from state archives makes it rather clear that Minoan civilization was part of the orbit of Egypt early on. Though Egyptians never conquered the Aegean polities, mercantile and diplomatic connections were extremely old and persistent. The late Bronze Age eruption of barbarian Sea Peoples who attacked the whole civilized Near East may have been facilitated in part by the broad familiarity engendered by widespread trade networks.

The most recent book devoted to ancient Egypt I have read was Toby Wilkinson’s The Rise and Fall of Ancient Egypt. Synthesizing extensive written material with archaeology, perhaps the most impactful argument in Wilkinson’s narrative was the persistence of the temple based institutions from the Old Kingdom down to the Ptolemaic era. Religious institutions carried on even with the shocks of Nubian and Libyan conquest in the post-New Kingdom period, down to Late Antiquity. The temple at Philae in southern Egypt was an active center of the traditional religion, and therefore the culture which dates to the Old Kingdom in continuous form, down to the 6th century A.D. (when it was closed by Justinian in his kulturkampf against ancient heterodoxies).

For various ideological reasons though many people are very curious about the racial characteristics of the ancient Egyptians. There are two basic extreme positions, Afrocentrists and Eurocentrists. Though I have not done a deep dive of the literature of either group, I’ve read a few books from either camp over my lifetime. In fact I believe the last time I read the “primary literature” of Afrocentrism and Eurocentrism was when I was an early teen, and it was rather strange because both groups seem to be recapitulating racial disagreements and viewpoints relevant to the American context, and projecting them back to the ancient world.

In college I stumbled upon Mary Lefkowitz’s Not Out Of Africa, a book length argument against the more sophisticated Afrocentrist views articulated in the wake of Martin Bernal’s Black Athena: The Afroasiatic Roots of Classical Civilization. Lefkowitz was a classicist, so many of her objections were exceedingly scholarly. The reality is that the best refutation of an Afrocentrist view of ancient Egypt, which reduces to the idea that ancient Egyptians would be recognizably black African today, is the Fayum portraits. It is notable to me how similar these portraits are to modern Copts. In fact the actor Rami Malek, of Coptic background, looks strikingly like someone who stepped out of the Fayum portraits.

I have read no book length refutation of the Eurocentrist, usually Nordicist, perspective. Mostly because this is a view associated with white supremacism, and that ideology is generally attacked on normative, not positive, grounds. But the visible evidence of the Fayum portraits is a strong refutation of the Nordic model. Of course, there is the reality that we now know that the Nordic phenotype, and the genetic components which congealed into that typical of Northern Europe today, was only coming into existence when the Old Kingdom of Egypt was already a mature civilization.

Of course both Afrocentrists and Eurocentrists will reject the evidence of the Fayum portraits because they came from the Roman era, and they would argue that the demographic nature of Egyptians changed quite a bit between that period and the end of the New Kingdom. And they are not incorrect that the period between the arrival of the Romans and the fall of the New Kingdom was characterized by a great deal of change. There were Libyan dynasties, Nubian dynasties, and periods of rule by Assyrians, Persians, and Macedonians. Large colonies of Greeks, Macedonians, and Hebrews-becoming-Jews were also resident in Egypt. Especially, but not limited to, the urban areas.

But now we have ancient DNA! Ancient Egyptian mummy genomes suggest an increase of Sub-Saharan African ancestry in post-Roman periods:

Egypt, located on the isthmus of Africa, is an ideal region to study historical population dynamics due to its geographic location and documented interactions with ancient civilizations in Africa, Asia and Europe. Particularly, in the first millennium BCE Egypt endured foreign domination leading to growing numbers of foreigners living within its borders possibly contributing genetically to the local population. Here we present 90 mitochondrial genomes as well as genome-wide data sets from three individuals obtained from Egyptian mummies. The samples recovered from Middle Egypt span around 1,300 years of ancient Egyptian history from the New Kingdom to the Roman Period. Our analyses reveal that ancient Egyptians shared more ancestry with Near Easterners than present-day Egyptians, who received additional sub-Saharan admixture in more recent times. This analysis establishes ancient Egyptian mummies as a genetic source to study ancient human history and offers the perspective of deciphering Egypt’s past at a genome-wide level.

Because modern people care about the Afrocentrist question, the extent of Sub-Saharan African ancestry is highlighted in this paper. I do not think this is actually the most interesting aspect. But I’ll get to that. Since this post will be read by a fair number of people I’ll talk about the relationship of ancient and modern Egyptians to (Northern) Europeans and Sub-Saharan Africans.

The figure to the left is looking at 90 ancient Egyptian mitochondrial genomes (and some modern ones in the two rightmost columns). Since mtDNA is copious it was relatively easy to extract and analyze. Haplogroup L, the red to orange shades in the bar plots, is associated without dispute with Sub-Saharan Africa. Haplogroup U6, M1 and a few others may be “back to Africa” variants of different periods (they are generally found in Afro-Asiatic groups).

What you can see is that somewhat more than half of Ethiopia’s mtDNA lineages are L, in keeping with the whole genome estimate of Sub-Saharan African ancestry in most Cushitic populations. In Egypt there is a difference over time; haplogroup L goes from low frequencies to much higher frequencies in modern periods. The ~20% fraction in the modern samples is in line with the population-wide Sub-Saharan admixture one sees in modern Egyptians.

I actually recomputed the haplogroups to a finer granularity from the supplements for readers who know this stuff well. Here they are:

 

Haplogroup Count
H 2
H13c1 2
H5 2
H6b 2
HV 3
HV1a’b’c 4
HV1a2a 3
HV1b2 2
HV21 2
I 5
J1d 2
J2a1a1 2
J2a2b 2
J2a2c 4
J2a2e 3
K 16T 2
K1a 2
K1a4 2
L3 2
M1a1 4
M1a1e 2
M1a1i 2
M1a2a 2
N 2
N1’5 2
N1a1a2 2
R 3
R0 2
R0a 2
R0a1 2
R0a1a 3
R0a2 3
R0a2f 2
R2’JT 2
T 3
T1a 3
T1a2 2
T1a5 4
T1a7 7
T1a8a 2
T2 3
T2c1 2
T2c1c 2
T2e 2
U 2
U1a1 2
U1a1a3 2
U3b 3
U5a 2
U6a 2
U6a2 2
U6a3 2
U7 4
U8b1a1 3
U8b1b1 2
W3a1 2
W6 2
W8 2
X 2
X1 2
X1c 2

A quick inspection of mtDNA haplogroup frequencies shows that ancient Egyptians are not typical of modern Europeans. Not that much H, and lots of T, J and K. What that does remind me of are Early European Farmers. These people, who brought agriculture to Europe from Anatolia contributed a large fraction of the ancestry of modern Southern Europeans, and a lesser component to Northern Europeans.

But ultimately what’s great about this paper is that they have ancient autosomal DNA. That is, genome-wide results.

They got three samples of reasonably high quality. More precisely: “Two samples from the Pre-Ptolemaic Periods (New Kingdom to Late Period) had 5.3 and 0.5% nuclear contamination and yielded 132,084 and 508,360 SNPs, respectively, and one sample from the Ptolemaic Period had 7.3% contamination and yielded 201,967 SNPs.”

You can see the three samples on this bar plot. What is interesting is that they’re all pretty similar.

What you can see here is that to a great extent ancient Egyptians were descended from a population closely related to Natufians, or Natufians themselves. This easily explains the mtDNA affinity to Neolithic farmers: Natufians and Anatolian Neolithic populations were sister populations. The f3 statistic which looks at shared drift shows an affinity of ancient Egyptians with ancient farmer populations with Near Eastern provenance, but also with modern Sardinians. This is a common pattern: ancient groups lack ancestry from later migration waves, and Sardinians are the modern population closest to that state.
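For readers who have not met f3 before, the “shared drift” idea can be sketched with toy numbers. I am assuming the outgroup form of the statistic here, f3(O; A, B), taken as the average over SNPs of (o - a)(o - b). The sketch uses population allele frequencies directly and skips the finite-sample corrections that real software applies. All of the frequencies below are invented.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Average of (o - a) * (o - b) across SNPs.

    Larger values mean A and B moved away from the outgroup in the same
    direction at the same sites, i.e. they share more drift.
    """
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)

# Invented allele frequencies at five SNPs.
outgroup = [0.10, 0.80, 0.50, 0.30, 0.90]
target   = [0.40, 0.50, 0.70, 0.60, 0.60]
close    = [0.45, 0.45, 0.75, 0.55, 0.65]  # drifted alongside the target
distant  = [0.15, 0.75, 0.45, 0.40, 0.85]  # stayed near the outgroup

f3_close = outgroup_f3(outgroup, target, close)
f3_distant = outgroup_f3(outgroup, target, distant)

print(f3_close > f3_distant)
```

Ranking many candidate populations by this value against one target is what produces the kind of affinity list described above.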

You see in the bar plot that northern Levantine populations are placed between Anatolian Neolithics and Natufians, as one might expect based on their geographical position and gene flow between these two regions. Additionally, the cyan color is associated with eastern farmers from the Zagros. I’ve already talked about gene flow from this area to the Levant recently. If you compare the Bronze Age Sidon samples I think you’ll see broad affinities with these Late Period Egyptians.

The PCA gives us results consonant with the model-based clustering. If you plot the genetic variation of ancient Egyptians they’re closest to Neolithic eastern Mediterranean populations. No great surprise.

Not the modern Egyptians. Why? It’s pretty clearly because modern Egyptians are shifted toward Sub-Saharan Africans. But there is also another component: modern Egyptians have more of the cyan eastern farmer component. What could this be?

An immediate thought comes to mind. We focus a great deal on Sub-Saharan African slavery. One reason is that it is visible. Black Africans are physically distinct from most Middle Eastern populations. But Egypt was long the center of another slave trade: “white slaves” from the Caucasus. Circassians. For hundreds of years Mamluks were recruited from the Caucasus as military slaves. They eventually became the ruling class of Egypt, until their decimation in the 19th century under Muhammad Ali (who himself was an Albanian Ottoman who never learned to speak Arabic well).

As noted in the paper earlier work looking at patterns in ancestry tracts and LD decay had made it obvious that much of the admixture of Sub-Saharan ancestry in Egypt, as in much of the Middle East, is relatively recent. In particular, it dates to the Islamic period, when trade and conquest took on new dimensions in Africa and north into Central Asia. One way ethnic minorities like Assyrians and Lebanese Christians differ from their Muslim neighbors is that they have much lower fractions of Sub-Saharan African ancestry, and no East Asian component. The latter might surprise, but remember that Central Asian Turkic slaves have been prominent in Muslim armies since at least the 9th century.

But some of the Sub-Saharan ancestry in Egyptians is old. The ancient Egyptian samples have it. To have none of it would seem strange, considering the history of contact between Nubia and Egypt, dating back to the Old Kingdom. Second, there is evidence of low levels of Sub-Saharan African gene flow into Southern Europeans. How did that happen? The highest fractions are in Spain, and can there be attributed to the Moorish period. But that explanation does not hold in much of Italy, where there are a few percent of haplogroup L. This probably is due to south-to-north gene flow across the Mediterranean during the Classical period. Some of the peoples on the south shore of the Mediterranean almost certainly already had some Sub-Saharan African admixture.

Not getting into the details of it, there are ways to explicitly model gene flow into a target population from donors defined by a phylogeny. In this case the authors tested various models of gene flow from Sub-Saharan Africans and Eurasians (non-Africans) to generate allele frequency patterns we see in modern Egyptians and ancient Egyptians.

What they consistently found is that modern Egyptians are about twice as much Sub-Saharan African as ancient Egyptians. The proportions for modern Egyptians ranged from ~10 to ~20 percent Sub-Saharan African against a Eurasian background, with a bias toward the higher values (depending on which populations you put into the phylogeny for non-Africans), and ~0 to ~10 percent for the ancient Egyptians, again with a bias toward the higher values. The pattern is consistent in these tests.
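As a back-of-the-envelope check on what “about twice as much” implies, here is a small sketch. It assumes a single pulse of gene flow from a source that is entirely Sub-Saharan African, which is my simplification and not the authors’ model. The 8% and 16% figures are hypothetical values picked from within the quoted ranges.

```python
def replacement_fraction(ancient_frac, modern_frac, source_frac=1.0):
    """Solve modern = (1 - x) * ancient + x * source for x.

    x is the share of the modern gene pool contributed by the later pulse.
    """
    return (modern_frac - ancient_frac) / (source_frac - ancient_frac)

ancient = 0.08  # hypothetical Sub-Saharan fraction in the ancient samples
modern = 0.16   # hypothetical Sub-Saharan fraction in modern Egyptians

x = replacement_fraction(ancient, modern)
print(f"{x:.1%}")
```

With these numbers, a pulse contributing a bit under a tenth of the modern gene pool is enough to double the Sub-Saharan fraction.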

An issue here is that we’re going off three samples. That being said, the authors observe that despite differences in contamination/quality and time period they’re very concordant with each other. If I had to bet I think Old Kingdom samples would have somewhat less Sub-Saharan and eastern farmer ancestry. But the basic pattern persisted down to the Roman period, and was only shifted by admixture due to slavery.

And not to belabor the point, but a paper from a few years ago which had some Copt samples looks familiar in its broad outlines. You see that the Copts have very little Sub-Saharan African ancestry, though it does seem to be evident (the marker set is in the hundreds of thousands of SNPs). Additionally, they are quite distinct from the Qatari Arab sample.

Unfortunately the data for this paper just published is not on the European Nucleotide Archive. I really want to dig a little deeper into it.

What are the takeaways here? Egypt has been the sink for a lot of migration and gene flow over the past several thousand years, and probably earlier. Not surprising considering that it was relatively wealthy in the aggregate. The Natufian population that the Late Period Egyptians resemble the most did not have Sub-Saharan African ancestry according to earlier research. These Late Period Egyptians do have some. This is reasonable in light of the long interaction with Nubia which is historically attested. Similarly, there was clearly gene flow from Southwest Asia. This is again historically attested, especially in the Nile Delta (though foreign garrisons of mercenaries are recorded in Upper Egypt as well).

The Roman period probably did introduce some gene flow from Southeast Europe and Southwest Asia. But these populations are not that distinct from Egyptians.

Similarly, the Islamic period also brought in different peoples from Arabia and the Caucasus. But the most salient dynamic during the Islamic period was a massive trans-Saharan slave trade (though the Caucasus impact may have been comparable, and I think these results support the proposition that it was).

It seems entirely likely that the Copts are descended from a mix of Roman era Egyptians. Not only do they resemble the people in the Fayum portraits, but the circumstantial genetic data is that they have fewer “exotic” components which increased in frequency during the Islamic era. This would be exactly parallel to ethno-religious minorities in the Levant and Iraq.

One curious element to me is the suggestion that gene flow before ~5,000 BCE between Sub-Saharan Africa and the lower Nile valley was low. If it hadn’t been low, it seems unlikely that the fraction of Sub-Saharan ancestry (or shift in that direction in relation to other Eurasians) in Copts would be so small.

So what explains the lack of earlier gene flow? I think the answer is going to be the fact that the human demographic landscape is characterized by lots of local population extinctions. As ancient DNA sampling coverage gets better and better meta-population dynamics are coming into focus, and we see gene flow, and die offs, in several areas. It is fashionable to say that human population variation is characterized by clines. But much of this clinal aspect is an outcome of the period after massive admixture over the last ~10,000 years.

And yet it may not be that the period before the Holocene was not clinal. Rather, it may be that large depopulations of areas of human occupation fragmented clinal ranges, and resulted in new range expansions from “core” zones.

About 8,000 years ago there was a major desertification period in the Sahara desert. Many trans-Saharan populations may have gone extinct during this time due to rapid climate change. Eventually repopulation may have occurred from outside of the Sahara, so that post-Natufian Levantines and Sub-Saharan Africans from what we today call the Sahel pushed up and down the Nile drainage basin respectively, meeting in the zone of Nubia on the boundary of history and prehistory.

Unlike many other areas of the world we have a long attested record of Egyptian history. As we get more mummy samples it seems likely that we’ll get a crisper, clearer picture. And the time transects will not be narrative blind; we already know the general arc of Egyptian history. If, for example, we see a new ancestral component in Egypt around ~1500 B.C., it’s not mysterious what this might be: the Hyksos.

This is just the prologue to a fascinating book that will be written over the next decade.

Related: Blog post analyzing one Copt’s results suggests that Sub-Saharan admixture is more like Dinka than Yoruba (in contrast, Muslim Egyptians have a mix of both, the latter probably coming during the Islamic slave trade, while the former is probably ancient admixture).

Citation: Schuenemann, V. J. et al. Ancient Egyptian mummy genomes suggest an increase of Sub-Saharan African ancestry in post-Roman periods. Nat. Commun. 8, 15694 doi: 10.1038/ncomms15694 (2017).

May 25, 2017

At an inflection point of archaeology and genetics

Filed under: Genetics,History — Razib Khan @ 1:54 pm

People always ask me what to read in relation to the field of historical population genetics. In the 2000s there were a series of books which focused on the mtDNA and Y results from modern phylogeographic analysis. Journey of Man, Seven Daughters of Eve, The Real Eve, and Mapping Human History. But there hasn’t been much equivalent in the 2010s.

Why? I think part of the issue is that the rate of change has been so fast that scholars and journalists haven’t been able to keep up. And, the change is happening right now, so it would likely mean that any book written over a year would be moderately out of date by publication.

I noticed today that Jean Manco has an updated and revised version of her book, Ancestral Journeys: The Peopling of Europe from the First Venturers to the Vikings. This was needed, because the original book was written before some major recent findings, though after some preliminary ones. As Manco has observed herself it was feasible to replace speculations with facts.

Since it seems likely that George R. R. Martin’s next book will be published before David Reich’s, I think that’s all you got. Any suggestions would be welcome.

As for the flip side, history that might be useful to understanding the genetics results, J. M. Roberts’ The History of the World is the best cliff notes I can think of. It’s obviously a high level survey, but frankly that would improve the interpretation I see in some papers. The fact that much of the history has no contemporary relevance is pretty unimportant, since you want to focus on the older stuff, which is where ancient DNA really shows its mettle.

At some point ancient DNA will start to exhibit diminishing returns. Then the long hard slog of interpretation and synthesis will have to begin in earnest.

May 24, 2017

Applying intelligence to genes for intelligence

Filed under: Behavior Genetics,Genetics,Intelligence — Razib Khan @ 12:10 am

Carl Zimmer has an excellent write up on the new Nature study of the variants associated with IQ, In ‘Enormous Success,’ Scientists Tie 52 Genes to Human Intelligence.

The issue with intelligence is that it’s a highly polygenic trait for which measurement is not always trivial. You need really large sample sizes. It’s about ten times less tractable than height as a quantitative trait. There are still many arguments about its genetic nature (though a majority position that it’s not rare variants of large effect seems to be emerging).

But all in good time.

Science is divided into many different fiefdoms, and people don’t always talk to each other. For example I know a fair number of population genomicists, and I know behavior geneticists who utilize quantitative genomic methods. The two are distinct and disparate groups. But the logic of cheap sequencing and big data is impacting both fields.

Unfortunately when you talk to population genomicists many are not very familiar with psychology, let alone psychometrics. When it comes to the behavior geneticists many come out of psychology backgrounds, so they are not conversant in aspects of genetic theory which harbor no utility for their tasks at hand. This leads to all sorts of problems, especially when journalists go to get comments from researchers who are really opining out of domain.

Some writers, such as Carl Zimmer, are very punctilious about the details. Getting things right. But we have to be cautious, because many journalists prefer a truth-themed story to the truth retold in a story format. And, some journalists are basically propagandists.

Over the next five years you will see many “gene and IQ” studies come out, with progressively greater and greater power. Read the write-ups in The New York Times, Science, and Nature. But to my many readers with technical skills this is what you should really do:

  1. pull down the data.
  2. re-analyze it.

My plain words are this: do not trust, and always verify.

I’m a big fan of people educating themselves on topics which they have opinions on (see: population genetics). If intelligence is of some interest to you, you should read some things. Arthur Jensen’s classic The g Factor: The Science of Mental Ability can be quite spendy (though used copies less so). But Stuart Ritchie’s Intelligence: All That Matters and Richard Haier’s The Neuroscience of Intelligence are both good, and cheaper and shorter. They hit all the basics which educated people should know if they want to talk about the topic of intelligence in an analytical way.

May 18, 2017

To be a scientific intellectual today

Filed under: Career,Genetics — Razib Khan @ 2:41 pm

George Busby has put up a note about changes in his career path, Meditation on the Caltrain. I took offense to this section:

On top of this, there was the burgeoning realisation that no one actually reads the academic papers that I write. This is no moot point: writing papers is the main purview of a research scientist, and the central way we both communicate our results and measure success. However, compared to the proportion of the world’s population who can read, the number of people that had sat down to ingest my latest, dense, and fascinating (to me at least) treaty on the population genetics of Africa, three years in the making, was minuscule. The words of a colleague rang in my head: “99.9% of scientific papers just don’t get read”.

His most recent paper, Admixture into and within sub-Saharan Africa, was great. I meant to blog it, but got busy with other things. To be frank the fact that someone like George Busby is having trouble in the academic market is sobering. He has produced good and prominent work, and has been attached to groups which have some prominence. Of course grant approvals and job prospects have a stochastic element. But his experience shows that talent and good work are just a necessary, not sufficient, condition.

It looks like Busby will land in Silicon Valley with one of the two companies that do a lot of work on ancestry. Good for him. I think it does behoove those of us with intellectual pretensions to wonder what we’re doing out in the world. And, it also behooves academics to wonder what they’re doing with their job security. Sometimes it is important to tell the truth and explore topics even if people don’t care, or don’t want to listen. Otherwise, why fund anything that’s not practical with the public fisc?

The misrepresentation of genetic science in the Vox piece on race and IQ

Filed under: Genetics — Razib Khan @ 11:30 am

I don’t have time or inclination to do a detailed analysis of this piece in Vox, Charles Murray is once again peddling junk science about race and IQ. Most people really don’t care about the details, so what’s the point?

But in a long piece one section jumped out to me in particular because it is false:

Murray talks about advances in population genetics as if they have validated modern racial groups. In reality, the racial groups used in the US — white, black, Hispanic, Asian — are such a poor proxy for underlying genetic ancestry that no self-respecting statistical geneticist would undertake a study based only on self-identified racial category as a proxy for genetic ancestry measured from DNA.

Obviously the Census categories are pretty bad and not optimal (e.g., the “Asian American” category pools South with East & Southeast Asians, and that has caused issues in biomedical research in the past). But the claim is false. In the first half of the 2000s the eminent statistical geneticist Neil Risch specifically addressed this issue. From 2002 in Genome Biology Categorization of humans in biomedical research: genes, race and disease:

A debate has arisen regarding the validity of racial/ethnic categories for biomedical and genetic research. Some claim ‘no biological basis for race’ while others advocate a ‘race-neutral’ approach, using genetic clustering rather than self-identified ethnicity for human genetic categorization. We provide an epidemiologic perspective on the issue of human categorization in biomedical and genetic research that strongly supports the continued use of self-identified race and ethnicity.

A major discussion has arisen recently regarding optimal strategies for categorizing humans, especially in the United States, for the purpose of biomedical research, both etiologic and pharmaceutical. Clearly it is important to know whether particular individuals within the population are more susceptible to particular diseases or most likely to benefit from certain therapeutic interventions. The focus of the dialogue has been the relative merit of the concept of ‘race’ or ‘ethnicity’, especially from the genetic perspective. For example, a recent editorial in the New England Journal of Medicine [1] claimed that “race is biologically meaningless” and warned that “instruction in medical genetics should emphasize the fallacy of race as a scientific concept and the dangers inherent in practicing race-based medicine.” In support of this perspective, a recent article in Nature Genetics [2] purported to find that “commonly used ethnic labels are both insufficient and inaccurate representations of inferred genetic clusters.” Furthermore, a supporting editorial in the same issue [3] concluded that “population clusters identified by genotype analysis seem to be more informative than those identified by skin color or self-declaration of ‘race’.” These conclusions seem consistent with the claim that “there is no biological basis for ‘race'” [3] and that “the myth of major genetic differences across ‘races’ is nonetheless worth dismissing with genetic evidence” [4]. Of course, the use of the term “major” leaves the door open for possible differences but a priori limits any potential significance of such differences.

In our view, much of this discussion does not derive from an objective scientific perspective. This is understandable, given both historic and current inequities based on perceived racial or ethnic identities, both in the US and around the world, and the resulting sensitivities in such debates. Nonetheless, we demonstrate here that from both an objective and scientific (genetic and epidemiologic) perspective there is great validity in racial/ethnic self-categorizations, both from the research and public policy points of view.

From a 2005 interview:

Gitschier: Let’s talk about the former, the genetic basis of race. As you know, I went to a session for the press at the ASHG [American Society for Human Genetics] meeting in Toronto, and the first words out of the mouth of the first speaker were “Genome variation research does not support the existence of human races.”

Risch: What is your definition of races? If you define it a certain way, maybe that’s a valid statement. There is obviously still disagreement.

Gitschier: But how can there still be disagreement?

Risch: Scientists always disagree! A lot of the problem is terminology. I’m not even sure what race means, people use it in many different ways.

In our own studies, to avoid coming up with our own definition of race, we tend to use the definition others have employed, for example, the US census definition of race. There is also the concept of the major geographical structuring that exists in human populations—continental divisions—which has led to genetic differentiation. But if you expect absolute precision in any of these definitions, you can undermine any definitional system. Any category you come up with is going to be imperfect, but that doesn’t preclude you from using it or the fact that it has utility.

We talk about the prejudicial aspect of this. If you demand that kind of accuracy, then one could make the same arguments about sex and age!

You’ll like this. In a recent study, when we looked at the correlation between genetic structure [based on microsatellite markers] versus self-description, we found 99.9% concordance between the two. We actually had a higher discordance rate between self-reported sex and markers on the X chromosome! So you could argue that sex is also a problematic category. And there are differences between sex and gender; self-identification may not be correlated with biology perfectly. And there is sexism. And you can talk about age the same way. A person’s chronological age does not correspond perfectly with his biological age for a variety of reasons, both inherited and non-inherited. Perhaps just using someone’s actual birth year is not a very good way of measuring age. Does that mean we should throw it out? No. Also, there is ageism—prejudice related to age in our society. A lot of these arguments, which have a political or social aspect to them, can be made about all categories, not just the race/ethnicity one.

Risch is not obscure. In the piece the author observes that Risch was “described by one of the field’s founding fathers as ‘the statistical geneticist of our time.’”

2005 is a long way from 2017. Risch may have changed his mind. In fact, it is probably best for him and his reputation if he has changed his mind. I wouldn’t be surprised if Risch comes out and engages in a struggle session where he disavows his copious output from 2005 and earlier defending the utilization of race as a concept in statistical genetics.

Also, genotyping is cheap enough and precise enough that one might actually make an argument for leaving off any self-reported ancestry questions. It’s really not necessary. This isn’t 2005.

But that section in the Vox piece is simply false. Vox is a high profile website which serves to “explain” things to people. The academics who co-wrote that piece are very smart, prominent, and known to me. I don’t plan on asking them why they put that section in there. I think I know why.

There will be no update to that piece I’m sure. It will be cited widely. It will become part of what “we” all know. Who am I to disagree with Vox? This is journalism, from what I have been able to gather and understand. The founders of Vox are rich and famous now. Incentives matter.

As for science and the academy? I am frankly too depressed to say more.

May 17, 2017

The population genetic structure of Sicily and Greece

Filed under: Genetics,Italy,Mediterranean — Razib Khan @ 8:44 am


By total coincidence a paper came out yesterday, Ancient and recent admixture layers in Sicily and Southern Italy trace multiple migration routes along the Mediterranean (I blogged about the topic). It’s open access, and it has a lot of statistics and analyses. I’d recommend you read it yourself.

You see the Sicilian and Greek populations and their skew toward the eastern Mediterranean. But in the supplements they displayed some fineSTRUCTURE clustering, and at K = 3 you see that Europe and the Middle East diverge into three populations. What this is showing seems to be: 1) in red, those groups least impacted by post-Neolithic migration 2) in blue, Middle Eastern groups characterized by the fusion between western & eastern Middle Eastern farmer which occurred after the movement west of the ancestors of the “Early European Farmers” (who gave rise to the red cluster), who were related to the western Middle Eastern farmers 3) the groups most impacted by Pontic steppe migration.

The authors confirm what I reported over two years ago on this blog: mainland and island Greeks are genetically distinct, probably because the former have recent admixture from Slavs and Slav-influenced people. And, many Southern Italians resemble island Greeks.

One has to be careful about dates inferred from genetic patterns. For example:

Significant admixture events successfully dated by ALDER reveal that all Southern Italian and Balkan groups received contributions from populations bearing a Continental European ancestry between 3.0 and 1.5 kya

The beginning of folk wanderings in the Balkans which reshaped its ethnographic landscape really dates to the later 6th century, when the proto-Byzantines began to divert all their resources to the eastern front with Persia, and abandoned the hinterlands beyond the Mediterranean coast in Europe to shift their focus toward the Anatolian core of the empire. The Slavic migrations were such that there were tribes resident in the area of Sparta in the early medieval period. Presumably because they were not a seafaring folk they don’t seem to have had much impact on the islands.

Such an early period in the interval though cannot be the Slavs. What can it be? I suspect that there are signals of Indo-European migrations in there that are being conflated due to low power to detect them since they are rather modest in demographic impact. The islands such as Sardinia, Crete and Cyprus had non-Indo-European speakers down to the Classical period.
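To make the date mismatch concrete, here is a minimal sketch of the arithmetic. The function names, the 2017 reference year, the 29-year generation time, and the use of 550 CE as a stand-in for "the later 6th century" are all my assumptions for illustration, not values from the paper.

```python
def kya_to_calendar_year(kya, present_year=2017):
    """Convert 'thousand years ago' to a calendar year.

    Negative values are BCE. This ignores the missing year 0, which is
    irrelevant at the precision of admixture dating.
    """
    return present_year - kya * 1000


def years_to_generations(years, generation_time=29.0):
    """Convert years to generations under an ASSUMED generation time."""
    return years / generation_time


# Interval quoted from the paper: 3.0 to 1.5 kya.
alder_interval_kya = (3.0, 1.5)

# Hypothetical stand-in for the start of the Slavic migrations.
slavic_start_year = 550

oldest, youngest = (kya_to_calendar_year(k) for k in alder_interval_kya)
gap_years = slavic_start_year - oldest

print(f"Interval spans roughly year {oldest:.0f} to year {youngest:.0f}")
print(f"Oldest end predates the Slavic migrations by ~{gap_years:.0f} years")
print(f"That gap is ~{years_to_generations(gap_years):.0f} generations")
```

The young end of the interval sits close to the 6th century, so Slavic admixture is a plausible contributor there. The old end is about a millennium and a half too early, which is why something else has to be in the signal.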

Overall it’s an interesting paper. But it needs a deeper dig than I have time right now.

The Orontes has not mixed much with the Tiber

Filed under: Genetics,Italian Genetics — Razib Khan @ 12:21 am


In a moment of weakness I decided to read some of Mary Beard’s SPQR: A History of Ancient Rome. I say weakness because I want to wean myself off of excessive reading of Roman history, as in terms of inferential utility I’ve long reached diminishing returns. But I quite enjoy the topic, and so here I am.

The author is an excellent writer as well as a scholar, and I quite enjoyed Roman Triumph, so I am not at all surprised that SPQR has me hooked. Some of my correspondents have exhibited some disdain toward it because of Beard’s attempts to draw some connections to present day mores and values from that of Rome, presumably with a progressive bent.

Myself, this does not bother me. I don’t come into reading about Rome as an ignorant, so I can sort that from the nuggets of fact and positivistic interpretation. In any case, I think of it rather like how Islamic philosophers viewed Aristotle through their own religio-cultural lens. Obviously this was an issue that caused resistance to the transmission of Aristotle to the Christian West, but ultimately it did not stop what was inevitable. At the end of the day it was more about Aristotle than the glosses.

Though I highly recommend SPQR (I’m halfway through), that’s not the point of this post. Going along I kept thinking about the section on the Etruscans. The Rasena. Their origins have a genetic connection that is clouded and uncertain right now. I would like to dig deeper into this issue in the future; no doubt some day it will be cleared up. But that day is not this day.

Modern Italians have more “Indo-European” admixture than they do “Middle Eastern”

Rather, I want to address the idea that modern Italians are genetically a distinct people from ancient Roman Italians. Because on that score we have the answers. Ultimately the idea that this is even a debate goes back to Juvenal:

It is that the city is become Greek, Quirites, that I cannot tolerate; and yet how small the proportion even of the dregs of Greece! Syrian Orontes has long since flowed into the Tiber, and brought with it its language, morals, and the crooked harps with the flute-player, and its national tambourines, and girls made to stand for hire at the Circus. Go thither, you who fancy a barbarian harlot with embroidered turban….

These comments are rooted in the reality that Rome during Juvenal’s period was quite a cosmopolitan city, with large numbers of Greeks and people from the Eastern Mediterranean who were Hellenized to various degrees (in the early 3rd century Rome was ruled by a family of Hellenized Syrians). We know this because we have plenty of observations and complaints, and there are a plethora of inscriptions and graffiti in the new languages.

In the 19th and early 20th century the ascendency of Nordic racial theories about the origins of white supremacy across the world presented a problem. The Mediterranean peoples had been in decline for centuries, and were perceived to be Orientalized and inferior. Yet in the past they had achieved greatness which Northern Europeans were attempting to emulate. How could a racially inferior people have created such excellence?

A simple explanation for this condition for Victorians and their Continental fellow travelers was one of racial degradation. The ancient Romans were in this telling fundamentally a different people than modern Romans, with the latter being derived from migrants from the eastern Mediterranean who had arrived during the period of the Empire.

Though most of the racially derogatory elements are gone from this narrative, it is still strongly persistent in public consciousness. Being a Cavalli-Sforza nerd (there is such a thing), I have a copy of Consanguinity, Inbreeding, and Genetic Drift in Italy, and there was data in it which made me skeptical of wholesale replacement in the middle 2000s. Then there was Peter Ralph and Graham Coop’s 2013 paper, The Geography of Recent Genetic Ancestry across Europe, which reported lots of deep regional structure across Italy.

This is important because it suggests a local stability to the demographic character of the regions for a long time. Probably earlier than the period of the Roman Empire. Though one can imagine scenarios of demographic replacement which would produce this result, they’re generally less parsimonious than the model whereby modern Italian population structure maintains the general outline it had at the beginning of the Iron age.

Finally, over the past seven years I have done a lot of analysis and manipulation of tens of thousands of Europeans and Middle Easterners in relation to their genetic data for personal and professional reasons. Some patterns jump out at you, and some subtle tendencies come into the foreground. It is pretty clear that Italians are not a transplanted Middle Eastern population (though there is some recent non-Italian ancestry; Sicilians often have minor components of clear North African ancestry as well as small percentages of Sub-Saharan heritage, which I think is almost certainly due not to Greek and Roman cosmopolitanism, but the legacy of the Arab emirate which existed on the island for a few centuries).

But now I have realized probably the best illustration of this. The Reich lab has been generating a massive genotype dataset over the past five years on the Human Origins Array. And not only do you have modern populations, but you have ancient ones (from ancient DNA). The PCA plots in their papers make what I’m saying above pretty clear.

I’ve modified the PCA plot from Genomic insights into the origin of farming in the ancient Near East. Notice where various Italian groups and Greeks are. I’ve also labeled the Druze; they are almost certainly an excellent representation of Near Eastern Syrians from 2,000 years ago. They have been endogamous for nearly 1,000 years in the Lebanese highlands, and don’t have admixture that is more common in Syrian Muslims from the lowlands.
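For readers who want to see the mechanics behind a plot like this, here is a minimal sketch on synthetic data. It assumes the common practice of computing the principal axes from present-day samples only and then projecting the ancient samples onto them. The population sizes, allele frequencies, and function names are all made up for illustration and have nothing to do with the actual Human Origins data.

```python
import numpy as np


def fit_pca(genotypes, n_components=2):
    """Return the per-SNP mean and top principal axes of a samples x SNPs matrix."""
    mean = genotypes.mean(axis=0)
    centered = genotypes - mean
    # Rows of vt are the principal axes, ordered by variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(genotypes, mean, axes):
    """Project samples onto previously fitted principal axes."""
    return (genotypes - mean) @ axes.T


rng = np.random.default_rng(0)
n_snps = 2000

# Two synthetic "modern" populations with correlated allele frequencies.
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_snps), 0.05, 0.95)
pop_a = rng.binomial(2, freq_a, size=(40, n_snps))
pop_b = rng.binomial(2, freq_b, size=(40, n_snps))
modern = np.vstack([pop_a, pop_b]).astype(float)

# Fit the axes on modern samples only.
mean, axes = fit_pca(modern)
modern_pcs = project(modern, mean, axes)

# Synthetic "ancient" samples with intermediate frequencies, projected
# onto the modern axes rather than used to define them.
ancient = rng.binomial(2, (freq_a + freq_b) / 2, size=(5, n_snps)).astype(float)
ancient_pcs = project(ancient, mean, axes)

print("modern A mean on PC1:", modern_pcs[:40, 0].mean())
print("modern B mean on PC1:", modern_pcs[40:, 0].mean())
print("ancient mean on PC1: ", ancient_pcs[:, 0].mean())
```

In this toy setup the two modern groups separate along the first axis and the intermediate "ancient" samples fall between them, which is the kind of positional reading I am doing with the Italians, Greeks, and Druze in the real plot.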

Notice that most of the Greeks are shifted further toward Northwestern Europeans than Southern Italians. I say most, because I’ve had access to a larger data set of Greeks, and it becomes clear that a minority of Greeks cluster more with Southern Italians, and the majority have a minority admixture element from a Northern European population. This is Slavic ancestry that arrived after the middle of the 6th century, when the East Roman state basically abandoned most of the Balkans to focus on maintaining control over Constantinople, Salonika, and the Peloponnese.

Northern Italians are shifted toward Sardinians and Spaniards. The Sardinians are important, because we now know that they are the closest modern Europeans to the agriculturalists who arrived from the eastern Mediterranean during the early Neolithic. This population, “Early European Farmers” (EEF), once dominated most of the continent. But ~5,000 years ago migrations from the steppe brought a new element which replaced and assimilated them in Northern Europe.

But in Southern Europe their genetic legacy remains strong and to a great extent dominant. Iberia and the Italian peninsula have been impacted by the migrations out of the steppe, with Sardinia the least so. In the smaller plot above you can see that the early Neolithic individuals are close to the Sardinians, with mainland Italians being shifted toward other populations.

The Northern Italians in particular show some influence from Northern European populations. Some of this may be gene flow through diffusion due to proximity, but the Alps are a rather formidable barrier. Rather, I suspect it reflects episodic migration. I generally do not weight the Lombards too highly as a major influence. Rather, I suspect that it is a combination of Gaulish settlement in the Po river valley, and early impacts from the Indo-Europeans who arrived in the Italian peninsula.

The Southern Italian shift toward the Middle East probably does indicate some gene flow, but it is important to remember that this was also Magna Graecia, so there is probably a Greek element here similar to what occurs among those Greeks without Slavic admixture (please note that Byzantine Greek rule also persisted in Southern Italy up until the Norman conquest). And if you look at how they relate to the Neolithic samples, they exhibit a lot of shift on the plane toward the steppe populations, parallel to the Levantines. In other words, a lot of the change since the Neolithic in Southern Italy is attributable to the influence of the steppe migration, not Roman era gene flow from Syrians.

I will probably do some formal analysis at some point so that the numbers can get out there now that there are so many ancient genotypes available too. But really this shouldn’t be a discussion anymore.

Addendum: You may be asking, if there are so many literary comments about non-Italians during the Roman Empire in Italy, where did they go? I think the big thing to remember is that there is an ascertainment bias toward what we know in urban areas. There is a high likelihood that urban areas were population sinks, which could not maintain themselves without constant migration.

May 15, 2017

The end of insurance (some of it)

Filed under: Genetics,Insurance — Razib Khan @ 5:58 am

Unless you’ve been sleeping under a rock you are aware that the cost of sequencing has been going down. Less clear to many is that the cost of genotyping has also been declining. At last year’s ASHG some physicians were talking about SNP-chips in the range of the low tens of dollars.

Right now most diseases for most people who buy health care are accounted for by the standard issue SNP-chips. If you have a rare mutation not on the chip, or are of a minority ethnic group not well ascertained by the chip, well, tough luck. My point is that chips probably have a near term future.

And that’s what’s at the heart of this piece in The New York Times, New Gene Tests Pose a Threat to Insurers:

So Ms. Reilly, 77, a retired social worker in Ann Arbor, Mich., applied for a long-term care insurance policy. Wary of enrolling people at risk for dementia, the insurance company tested her memory three times before issuing the policy.

But Ms. Reilly knew something the insurer did not: She has inherited the ApoE4 gene, which increases the lifetime risk of developing Alzheimer’s. “I decided I’d best get long-term care insurance,” she said.

I think the headline will mislead many people because when we hear “insurance” in relation to health, we assume health insurance. But long-term care insurance and life insurance are both relevant to health, and both of these have a major issue now with asymmetric information.

Many people are declaring that health insurance is over once everyone gets sequenced. I don’t think that’s necessarily true. The minority of the population that has a highly penetrant Mendelian disease may be in trouble without legal protection. But most disease variance is not going to be due to Mendelian disorders. Rather, people have risks based on family history and polygenic scores and lifestyle. And, a substantial proportion of disease and illness remains and will remain random.
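For those unfamiliar with what a polygenic score is, here is a minimal sketch of the usual additive form: a weighted sum of risk-allele counts across many variants. The effect sizes and genotypes below are invented for illustration and do not correspond to any real trait or study.

```python
def polygenic_score(dosages, effect_sizes):
    """Additive score: sum over variants of (allele count * per-allele effect)."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical per-allele effects for four variants (made-up numbers).
effect_sizes = [0.20, -0.10, 0.05, 0.15]

# Allele counts (0, 1, or 2 copies) for two hypothetical people.
person_a = [2, 1, 0, 1]
person_b = [0, 2, 1, 0]

score_a = polygenic_score(person_a, effect_sizes)
score_b = polygenic_score(person_b, effect_sizes)

print("person A score:", round(score_a, 3))
print("person B score:", round(score_b, 3))
```

The point of the contrast with Mendelian disease is that no single term dominates the sum. Each variant nudges the score slightly, so the result shifts a probability rather than delivering a verdict.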

With all that said, it’s not going to be a pretty picture when pockets of the insurance industry collapse. With greater knowledge comes the reckoning that we as a society have to make about the values we hold to be true.

May 11, 2017

When conquered pre-Greece took captive her rude Hellene conqueror

Filed under: Genetics,Genomics,Greece,History,Migration — Razib Khan @ 12:22 am


When I was a child in the 1980s I was captivated by Michael Wood’s documentary In Search of the Trojan War (he also wrote a book with the same name). I had read a fair amount of Greek mythology, prose translations of the Iliad, as well as ancient history. The contrast between the Classical Greeks and the strangeness of their mythology was always something that was on the surface of my mind. The reality that Bronze Age Greeks were very different from Classical Greeks resolved this issue to some extent.

Though Classical Greeks were very different from us, to some extent Western civilization began with them, and they are very familiar to us. Rebecca Goldstein’s Plato at the Googleplex was predicated on the thesis that the ancient Greek philosopher had something to tell us, and that if he were alive today he would be a prominent public speaker.

I’m going to dodge the issue of Julian Jaynes’ bicameral mind, and just assert that people of the Bronze Age were fundamentally different from us. And that difference is preserved in aspects of Greek mythology. Though it is fashionable, and correct, to assert that Homer’s world was not that of Mycenaeans, but the barbarian period of the Greek Dark Age, it is not entirely true. Homer clearly preserved traditions where citadels such as Mycenae and Pylos were preeminent, and details such as the boar’s tusk helmets are also present in the Iliad.

But aesthetic details or geopolitics are not what struck me about Greek mythology; rather, it was events such as the sacrifice of Iphigenia. Like Abraham’s near sacrifice of his son, this plot element strikes moderns as cruel, barbaric, and unthinking. And while the Classical Greeks did not have our conception of human rights, they had on the whole turned against human sacrifice (and the Romans suppressed the practice when they conquered the Celts). But it seems to have occurred in earlier periods.

The rupture between the world of the Classical Greeks and the strange edifices of Mycenaean Greece was such that scholars were shocked that the Linear B tablets of the Bronze Age were written in Greek when they were finally deciphered. In fact many of the names and deities on these tablets would be familiar to us today; the name Alexander and the goddess Athena are both attested in Mycenaean tablets.

Preceding the Mycenaeans, who emerge in the period between 1600 and 1400 BCE, are the Minoans, who seem to have developed organically in the Aegean in the 3rd millennium. This culture had relations with Egypt and the Near East, their own system of writing, and deeply influenced the motifs of the successor Mycenaean Greek civilization. The aesthetic similarities between Mycenaeans and Minoans are one reason that many were surprised that the former were Greek, because the Minoan language was likely not.

Mycenaean civilization seems to have been a highly militarized and stratified society. There is a reason that this is sometimes referred to as the “age of citadels.” Allusions to the Greeks, or Achaeans, in the diplomatic missives of the Egyptians and Hittites suggest that the lords of the Hellenes were reaver kings. In 1177 B.C. Eric Cline repeats the contention that a fair portion of the “sea peoples” who ravaged Egypt in the late Bronze Age were actually Greeks.

So when did these Greeks arrive on the shores of Hellas? In The Coming of the Greeks Robert Drews argued that the Greeks were part of a broader movement of mobile charioteers who toppled antique polities and turned them into their own. The Hittites and Mitanni were two examples of Indo-European ruling elites who took over a much more advanced civilizational superstructure and made it their own. While the Hittites and other Indo-Europeans, such as the Luwians and Armenians, slowly absorbed the non-Indo-European substrate of Anatolia, the Indo-Aryan Mitanni elite were linguistically absorbed by their non-Indo-European Hurrian subjects. Indo-Aryan elements persisted only in their names, their gods, and tellingly, in a treatise on training horses for charioteers.

Drews’ thesis is that the Greek language percolated down from the warlords of the citadels and their retinues over the Bronze Age, with the relics who did not speak Greek persisting into the Classical period as the Pelasgians. Set against this is the thesis of Colin Renfrew that Greek was one of the first Indo-European languages, as Indo-European languages began in Anatolia.

The most recent genetic data suggest to me that both theses are likely to be wrong. The data are presented in two preprints, The Population Genomics Of Archaeological Transition In West Iberia and The Genomic History Of Southeastern Europe. The two papers cover lots of different topics. But I want to focus on one aspect: gene flow from steppe populations into Southern Europe.

We know that in the centuries after 2900 BCE there was a massive eruption of individuals from the steppe fringe of Eastern Europe, and Northern Europe from Ireland to Poland was genetically transformed. Though there was some assimilation of indigenous elements, it looks as if the majority element in Northern Europe was descended from migrants.

For various reasons this was always less plausible for Southern Europe. The first reason is that Southern Europeans shared a lot of genetic similarities to Sardinians, who resembled Neolithic farmers. Admixture models generally suggested that in the peninsulas of Southern Europe the steppe-like ancestry was the minority component, not the majority, as was the case in Northern Europe.

These data confirm it. The Bronze Age in Portugal saw a shift toward steppe-inflected populations, but it was not a large shift. There seems to have been later gene flow too. But by and large the Iberian populations exhibit some continuity with late Neolithic populations. This is not the case in Northern Europe.

In The Genomic History Of Southeastern Europe the authors note that steppe-like ancestry could be found sporadically during early periods, but that there was a notable increase in the Bronze Age, and later individuals in the Bronze Age had a higher fraction. Nevertheless, by and large it looks as if the steppe-like gene flow in the southerly Balkans (focusing on Bulgarian samples) was modest in comparison to the northern regions of Europe. Unfortunately I do not see any Greek Bronze Age samples, but it seems likely that steppe-like influence came into these groups after they arrived in Bulgaria, which is more northerly.

Down to the present day a non-Indo-European language, Basque, is spoken in Spain. Paleo-Sardinian survived down to the Classical period, and it too was not Indo-European. Similarly, non-Indo-European Pelasgian communities continued down to the period of city-states in Greece.

These long periods of coexistence point to the demographic equality (or even superiority) of the non-Indo-European populations. The dry climate of the Mediterranean peninsulas is not as suitable for cattle-based agro-pastoralism. This may have limited the spread and dominance of Indo-Europeans. Additionally, the Mediterranean peninsulas were likely touched by Indo-European migrations relatively late. Much of the early zeal for expansion may have already dissipated by then. The high frequency of likely Indo-European R1b lineages among the Basques is curious, and may point to the spreading of male patronage networks, and their assimilation into non-Indo-European substrates where necessary. R1b is also found in Sardinia, and in high frequencies in much of Italy.

The interaction and synthesis between native and newcomer was likely intensive in the Mediterranean. For example, of the gods of the Greek pantheon only Zeus is indubitably of Indo-European origin. Some, such as Artemis, have clear Near Eastern antecedents. But other Greek gods may come down from the pre-Greek inhabitants of what became Greece.

Ultimately these copious interactions and transformations should not be a great surprise. The sunny lands of the Mediterranean attracted Northern European tribes during Classical antiquity. The Cimbri invasion of Italy, Galatians in Thrace and Anatolia, the folk wandering of Vandals and Goths into Iberia, are all instances of population movements southward. These likely moved the needle ever so slightly toward convergence between Northern and Southern Europe in terms of genetic content.

In relation to the more general spread of Indo-Europeans, I believe there are a few areas like Northern Europe, where replacement was preponderant (e.g., the Tarim basin). But I also believe there were many more which presented a Southern European model of synthesis and accommodation.

May 9, 2017

The Beaker is breaking!

The link is up, The Beaker Phenomenon And The Genomic Transformation Of Northwest Europe, but the paper is still processing.

I’ll update the post when I can read the paper.

May 6, 2017

Synergistic epistasis as a solution for human existence

Filed under: epistasis,Evolution,Evolutionary Genetics,Genetics,Genomics — Razib Khan @ 12:16 am

Epistasis is one of those terms in biology which has multiple meanings, to the point that even biologists can get turned around (see this 2008 review, Epistasis — the essential role of gene interactions in the structure and evolution of genetic systems, for a little background). Most generically epistasis is the interaction of genes in terms of producing an outcome. But historically its meaning is derived from the fact that early geneticists noticed that crosses between individuals segregating for a Mendelian characteristic (e.g., smooth vs. wrinkled peas) produced results conditional on the genotype of a secondary locus.

Molecular biologists tend to focus on a classical, and often mechanistic view, whereby epistasis can be conceptualized as biophysical interactions across loci. But population geneticists utilize a statistical or evolutionary definition, where epistasis describes the extent of deviation from additivity and linearity, with the “phenotype” often being fitness. This goes back to early debates between R. A. Fisher and Sewall Wright. Fisher believed that in the long run epistasis was not particularly important. Wright eventually put epistasis at the heart of his enigmatic shifting balance theory, though according to Will Provine in Sewall Wright and Evolutionary Biology even he had a difficult time understanding the model he was proposing (e.g., Wright couldn’t remember what the different axes on his charts actually meant all the time).

These different definitions can cause problems for students. A few years ago I was a teaching assistant for a genetics course, and the professor, a molecular biologist, asked a question about epistasis. The only answer on the key was predicated on a classical/mechanistic understanding. But some of the students were obviously giving the definition from an evolutionary perspective (e.g., they were bringing up non-additivity and fitness)! Luckily I noticed this early on and the professor approved the alternative answer, so that graders would not mark down those using a non-molecular answer.

My interest in epistasis was fed to a great extent in the middle 2000s by my reading of Epistasis and the Evolutionary Process. Unfortunately not too many people read this book. I believe this is so because when I just went to look at the Amazon page it told me that “Customers who viewed this item also viewed” Robert Drews’ The End of the Bronze Age. As it happened I read this book at about the same time as Epistasis and the Evolutionary Process…and to my knowledge I’m the only person who has a very deep interest in statistical epistasis and Mycenaean Greece (if there is someone else out there, do tell).

In any case, when I was first focused on this topic genomics was in its infancy. Papers with 50,000 SNPs in humans were all the rage, and the HapMap paper had literally just been published. A lot has changed.

So I was interested to see this come out in Science, Negative selection in humans and fruit flies involves synergistic epistasis (preprint version). Since the authors are looking at humans and Drosophila and because it’s 2017 I assumed that genomic methods would loom large, and they do.

And as always on the first read through some of the terminology got confusing (various types of statistical epistasis keep getting renamed every few years it seems to me, and it’s hard to keep track of everything). So I went to Google. And because it’s 2017 a citation of the paper and further elucidation popped up in Google Books in Crumbling Genome: The Impact of Deleterious Mutations on Humans. Weirdly, or not, the book has not been published yet. Since the author is the second to last author on the above paper it makes sense that it would be cited in any case.

So what’s happening in this paper? Basically they are looking for reduced variance in the number of really bad mutations, because a particular type of epistasis amplifies their deleterious impact (fitness is almost always really hard to measure, so you want to look at proxy variables).

De novo mutations are rare; they estimate that about 7 per person are in functional regions of the genome (I think this may be high actually), and that their distribution should be Poisson. This distribution just tells you that the mean number of mutations and the variance of the number of mutations should be the same (e.g., if the mean is 5, the variance should be 5).
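To make the mean-equals-variance property concrete, here is a minimal simulation sketch. The sampler (Knuth's multiplication method), the seed, and the sample size are my own illustrative choices, not anything from the paper; the rate of 7 simply echoes the estimate quoted above.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Keep multiplying uniforms until the running product falls below e^-lam.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


rng = random.Random(2017)  # arbitrary fixed seed so the run is repeatable
counts = [sample_poisson(7.0, rng) for _ in range(20000)]  # 20,000 hypothetical people
mean, variance = mean_and_variance(counts)
print(f"mean = {mean:.2f}, variance = {variance:.2f}, ratio = {variance / mean:.3f}")
```

With no selection acting, the variance-to-mean ratio should sit close to 1, which is the null expectation the paper compares against.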

Epistasis refers (usually) to interactions across loci. That is, different genes at different locations in the genome. Synergistic epistasis means that the total cumulative fitness after each successive mutation drops faster than the sum of the negative impact of each mutation. In other words, the negative impact is greater than the sum of its parts. In contrast, antagonistic epistasis produces a situation where new mutations on the tail of the distributions cause a lower decrement in fitness than you’d expect through the sum of its parts (diminishing returns on mutational load when it comes to fitness decrements).
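A toy fitness function can illustrate the three cases. This is only a sketch under an assumed functional form, log fitness = -cost * n**shape, which I picked for convenience; the `cost` and `shape` values are hypothetical and "independent" here means each mutation costs the same on the log scale.

```python
import math


def log_cost(n, cost=0.05, shape=1.0):
    """Total log-fitness lost with n mutations (assumed form: cost * n**shape)."""
    return cost * n ** shape


def fitness(n, cost=0.05, shape=1.0):
    """Relative fitness of a genome carrying n deleterious mutations."""
    return math.exp(-log_cost(n, cost, shape))


def marginal_cost(n, cost=0.05, shape=1.0):
    """Extra log-fitness lost by adding the n-th mutation to a genome with n - 1."""
    return log_cost(n, cost, shape) - log_cost(n - 1, cost, shape)


# shape = 1: every mutation costs the same (no epistasis)
# shape > 1: each new mutation hurts more than the last (synergistic)
# shape < 1: each new mutation hurts less than the last (antagonistic)
for label, shape in [("independent", 1.0), ("synergistic", 2.0), ("antagonistic", 0.5)]:
    row = ", ".join(f"{marginal_cost(n, shape=shape):.3f}" for n in (1, 5, 10))
    print(f"{label:>12}: cost of mutation #1, #5, #10 = {row}")
```

The point of the comparison is the trend across the row: flat for independent effects, rising under synergistic epistasis, and falling under antagonistic epistasis.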

These two dynamics have an effect on the linkage disequilibrium (LD) statistic. This measures the association of two different alleles at two different loci. When populations are recently admixed (e.g., Brazilians) you have a lot of LD because racial ancestry results in lots of distinctive alleles being associated with each other across genomic segments in haplotypes. It takes many generations for recombination to break apart these associations so that allelic state at one locus can’t be used to predict the odds of the state at what was an associated locus. What synergistic epistasis does is disassociate deleterious mutations. In contrast, antagonistic epistasis results in increased association of deleterious mutations.
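For readers who have not worked with LD directly, the textbook two-locus version can be sketched as follows. The haplotype frequencies and recombination rate are made up for illustration, and the decay formula assumes random mating with no selection or drift; the paper itself works with mutation counts across many loci, not this two-locus toy.

```python
def ld_stats(hap_freqs):
    """Return (D, r^2) from haplotype frequencies keyed 'AB', 'Ab', 'aB', 'ab'."""
    p_a = hap_freqs["AB"] + hap_freqs["Ab"]  # frequency of allele A
    p_b = hap_freqs["AB"] + hap_freqs["aB"]  # frequency of allele B
    d = hap_freqs["AB"] - p_a * p_b          # excess of AB over independence
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


def ld_after(d0, recomb_rate, generations):
    """Expected D after recombination erodes it: D_t = D_0 * (1 - c)^t."""
    return d0 * (1 - recomb_rate) ** generations


# Hypothetical recently admixed population: A tends to travel with B.
haplotypes = {"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4}
d, r2 = ld_stats(haplotypes)
d_later = ld_after(d, recomb_rate=0.01, generations=100)
print(f"D = {d:.3f}, r^2 = {r2:.3f}, D after 100 generations = {d_later:.4f}")
```

A positive D means the two alleles co-occur more often than chance; synergistic epistasis pushes D between deleterious alleles negative, antagonistic epistasis pushes it positive.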

Why? Because of selection. If a greater number of mutations means huge fitness hits, then there will be strong selection against individuals who randomly segregate out with higher mutational loads. This means that the variance of the mutational load is going to be lower than the value of the mean.

How do they figure out mutational load? They focus on the distribution of LoF mutations. These are extremely deleterious mutations which are the most likely to be a major problem for function and therefore a huge fitness hit. What they found was that the distribution of LoF mutations exhibited a variance which was 90-95% of a null Poisson distribution. In other words, there was stronger selection against high mutation counts, as one would predict due to synergistic epistasis.
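The logic can be checked with a small exact calculation. This is a sketch under my own assumptions, not the authors' method: mutation counts start out Poisson, one round of viability selection reweights each count class by a hypothetical fitness exp(-cost * n**shape), and the parameter values are arbitrary.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability of exactly n events, computed on the log scale."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def variance_to_mean_after_selection(lam, cost, shape, max_n=200):
    """Variance/mean of mutation counts among survivors of one round of selection.

    Assumed fitness: w(n) = exp(-cost * n**shape); counts above max_n are ignored
    because their probability is negligible for the rates used here.
    """
    weights = [poisson_pmf(n, lam) * math.exp(-cost * n ** shape)
               for n in range(max_n + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(n * p for n, p in enumerate(probs))
    variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return variance / mean


ratio_independent = variance_to_mean_after_selection(7.0, cost=0.1, shape=1.0)
ratio_synergistic = variance_to_mean_after_selection(7.0, cost=0.02, shape=2.0)
ratio_antagonistic = variance_to_mean_after_selection(7.0, cost=0.3, shape=0.5)
print(f"independent:  {ratio_independent:.3f}")
print(f"synergistic:  {ratio_synergistic:.3f}")
print(f"antagonistic: {ratio_antagonistic:.3f}")
```

Independent selection leaves the survivors Poisson (ratio of 1), synergistic epistasis pulls the ratio below 1, and antagonistic epistasis pushes it above 1. The 90-95% figure reported in the paper is the observed counterpart of the synergistic case.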

They conclude:

Thus, the average human should carry at least seven de novo deleterious mutations. If natural selection acts on each mutation independently, the resulting mutation load and loss in average fitness are inconsistent with the existence of the human population (1 − e^(−7) > 0.99). To resolve this paradox, it is sufficient to assume that the fitness landscape is flat only outside the zone where all the genotypes actually present are contained, so that selection within the population proceeds as if epistasis were absent (20, 25). However, our findings suggest that synergistic epistasis affects even the part of the fitness landscape that corresponds to genotypes that are actually present in the population.
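The arithmetic in the parenthesis is worth spelling out. As I read it, this is the classical mutation load result for independently acting mutations, where mean fitness at equilibrium is e^(−U) and U is the number of new deleterious mutations per genome per generation; the variable names below are mine.

```python
import math

U = 7.0                      # new deleterious mutations per person, from the quote
mean_fitness = math.exp(-U)  # expected mean fitness if mutations act independently
load = 1.0 - mean_fitness    # fraction of fitness lost relative to a mutation-free genome

print(f"mean fitness = {mean_fitness:.6f}")
print(f"mutation load = {load:.4f}")
```

A load above 0.99 would mean fewer than one in a hundred offspring is viable relative to a mutation-free ideal, which is why the authors call the independent model inconsistent with the existence of the human population.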

Overall this is fascinating, because evolutionary genetic questions which were still theoretical a little over ten years ago are now being explored with genomic methods. This is part of why I say genomics did not fundamentally revolutionize how we understand evolution. There were plenty of models and theories. Now we are testing them extremely robustly and thoroughly.

Addendum: Reading this paper reinforces to me how difficult it is to keep up with the literature, and how important it is to know the literature in a very narrow area to get the most out of a paper. Really the citations are essential reading for someone like me who just “drops” into a topic after a long time away….

Citation: Science, Negative selection in humans and fruit flies involves synergistic epistasis.

May 4, 2017

Africa’s great demographic transformation

Filed under: Bantu Expansion,Genetics,Genomics — Razib Khan @ 9:56 pm

Stonehenge has been a preoccupation for moderns since the Victorian period. It was built over 5,000 years ago, and its usage in some fashion continued down to about 2,500 years ago. For a long while it had been associated with the Celts, but more recently there has been some suspicion that its roots must be pre-Celtic.

And that is almost certainly true. The original site of Stonehenge had a wooden structure. But during the arrival of the Bell Beaker culture it was extensively rebuilt, and eventually stone monoliths were erected in the fashion we are used to seeing today.

Bernard Cornwell’s novel Stonehenge deals with this period. There is no major focus on physical conflict between the native populations, and the Bell Beaker groups. Rather, the plot centers around the cultural tumult and innovation that was triggered by the arrival of the newcomers.

In Stonehenge the Bell Beakers occupied more marginal, out of the way, territory. The novel presumed that ultimately there would be cultural fusion between the two groups, as there was a lot of interpersonal interaction among the characters of the two groups. We now know that the reality was likely one of near total replacement. From an abstract on the Bell Beakers to be presented shortly:

British individuals associated with Beakers are genetically indistinguishable from continental individuals associated with the same material culture and genetically nearly completely discontinuous with the previously resident population.

This is not entirely surprising. Ancient Ireland seems to have been characterized by genetic discontinuity with the arrival of the Bell Beakers.

Ancient DNA is not magic. But it can literally put some flesh on the bones of cultural shifts that archaeologists have seen in the material culture. One key element here is that the predominant ancestry across the British Isles today derives from migrations that date to the early Bronze Age.* I do not know if this has any relevance as to the arrival of the Celtic languages to Britain and Ireland, but I suspect it does.

This was percolating in my mind because there’s a new paper which attempts to explore in more detail the Bantu expansions which occurred between 1000 BCE and 500 CE. It’s pretty incredible that from Gabon to Cape Town Africans speak one language family, with similarities at least as close as that of the Romance language family.

But then is it incredible? Indo-European languages span the North Sea to the Bay of Bengal. The Bantu expansion in some ways serves as a template for the argument in First Farmers, as an agricultural revolution triggered a demographic expansion which did not stop until they reached their geographic limits.

The paper in Science, which is open access, Dispersals and genetic adaptation of Bantu-speaking populations in Africa and North America, focuses on two issues. First, the demographic history and phylogenomics of the Bantu populations. Second, using population genomic methods it explores the dynamics of natural selection in these peoples. They utilize an extensive SNP data set, with more than 500,000 markers in their core analyses.

In general I think there are lots of interesting results in this paper. But the one angle I was unsatisfied by was their purported increase in coverage. As you can see it’s highly localized to a few countries. This is probably common sense since much of Africa is not accessible due to political issues (e.g., sampling in the Democratic Republic of Congo is treacherous right now). But one always has to be careful of the limitations of the data when making inferences. Though they have samples from the southwest (Angola, Namibia), the African Great Lakes region around Uganda, and in South Africa, huge zones between are missing. And, they are highly oversampled in and around Gabon.

With all that said, I think with a variety of methods they probably have confirmed a major aspect of Bantu migration. I’ll quote:

Two hypotheses have been proposed concerning the dispersal of Bantu-speaking populations across sub-Saharan Africa (2–4). According to the “early-split” hypothesis, the western and eastern branches split early, within the Bantu heartland, into separate migration routes. By contrast, the “late-split” model suggests an initial spread southward from the Bantu homeland into the equatorial rainforest (i.e., Gabon/Angola), followed by expansions toward the rest of the subcontinent. We tested these hypotheses by determining whether eBSPs and seBSPs were genetically closer to wBSPs from the southern part, relative to wBSPs from the northern part, of western central Africa….

…Although additional sampling of African populations may further refine these patterns, our results, together with previous genetic data supporting the late-split model (2, 3), indicate that BSPs [Bantu-speaking peoples] first moved southward through the rainforest before migrating toward eastern and southern Africa, where they admixed with local populations. This model is further supported by linguistics (15) and archaeoclimate data (16), suggesting that a climatic crisis ~2500 years ago fragmented the rainforest into patches and facilitated the early movements of BSPs farther southward from their original homeland.

That being said, their sample limitations produce interesting assertions. E.g., “The GLOBETROTTER method estimated that eBSPs resulted from two consecutive admixture events (P < 0.05) occurring 1000 to 1500 years ago and 150 to 400 years ago between a wBSP (~75% contribution) and an Afroasiatic-speaking population from Ethiopia (~10% contribution).” GLOBETROTTER is powerful, but too often people use it in a manner where they assume that the inferences it generates from the data it has are the truth, as opposed to the closest GLOBETROTTER can get to the truth with the tools it’s given.

In this case I would contend that because there aren’t any Nilotic samples it leaves a major hole in their power to be able to accurately infer what really happened. The presence of pastoralist Nilotic people in close proximity to Bantu agriculturalists has been one of the major dynamics which define the East African landscape. The admixture into eastern Bantu agriculturalists therefore is almost certainly from Nilotic peoples, though there has been Afro-Asiatic (Cushitic) influence as far south as Tanzania, evident in enigmatic peoples such as the Sandawe.

The point here is that just because the GLOBETROTTER method inferred gene flow from a population in the sample set, it does not mean that the gene flow was necessarily from that population. The sampling of the region is sparse, so obviously this is only a first approximation. To some extent I assume the authors assume the readers will connect the dots, but often this sort of thing gets lost in translation, and then it gets into the media….

Though it is difficult to make out in the admixture plot above, there are subtle differences among the eastern Bantu groups. The Luhya, who are from Kenya, do not show evidence of the blue component which is clearly Eurasian, while the Bakiga from Rwanda do. But even in the Bakiga the ratio of the violet element (which seems to be associated with an indigenous African component distinct from that of the Bantu) to the blue Eurasian element is far higher than in the Afro-Asiatic populations in their data set (this does not mean they don’t have Eurasian ancestry, since admixture plots aren’t perfect proxies).

Because of the nature of the sampling and the utilization of admixture to frame their results I do feel that we don’t get a good sense of the variation among the Bantu across their full range. Granted, the between population genetic distance is actually quite low across this zone, on the order of 0.01, because of the recent shared ancestry. Africans may have much greater total diversity than Eurasians in their genomes, but their between population distance is actually not much different or even lower than Eurasians because of the recent demographic expansions. But did the Bantu expand into empty lands? The Khoisan, Pygmy and Nilotic (I’m sure that’s what it is) contribution to the Bantus across their range is clear, but that’s because we have close enough reference populations to model this contribution. What about areas like Tanzania? Or Mozambique? Were they empty? I suspect the issue here is that we don’t have any non-Bantu indigenous groups as they’ve all been absorbed.

But it is in the selection component that they offer a possible way to ascertain non-Bantu ancestry from ghost populations in the future. They found lots and lots of selection around immune genes. This is not surprising. There were local diseases which they had to adapt to. Therefore, “the HLA region in wBSPs showed a strong excess of ancestry from rainforest hunter-gatherers, at 38%, 6.74 SD higher than the genome-wide average of 16%…..”
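To unpack the numbers in that quote, here is a minimal arithmetic sketch. It assumes "SD" means the standard deviation of local ancestry proportions across the genome, which is my reading rather than something the quote states; the variable names are mine.

```python
# Figures taken from the quoted sentence of the paper.
hla_ancestry = 0.38       # rainforest hunter-gatherer ancestry at the HLA region
genome_wide_mean = 0.16   # genome-wide average of the same ancestry
z_score = 6.74            # "6.74 SD higher"

# Assumption: z = (local - mean) / SD, so the implied SD is the excess
# ancestry divided by the reported number of standard deviations.
excess = hla_ancestry - genome_wide_mean
implied_sd = excess / z_score

print(f"excess ancestry at HLA: {excess:.2f}")
print(f"implied SD of local ancestry: {implied_sd:.4f}")
```

In other words, under that reading local ancestry varies by only a few percentage points from region to region, which is why a jump of 22 points at HLA stands out.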

In places like Mozambique it would be interesting to see whether the genomic regions known to be under selection, or enriched for indigenous ancestry in areas where indigenous populations still exist, exhibit a higher Fst against other groups. That is, the Mozambique ghost populations should leave an inordinate impact on regions of the genome associated with immunological function.
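A rough sketch of what that comparison could look like, using Hudson's Fst estimator (computed as a ratio of averages across SNPs). This is not the paper's method, and since I can't download the data, the allele frequencies below are invented toy values purely for illustration; the function and variable names are hypothetical.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst across a set of SNPs, as a ratio of averages.

    p1, p2: per-SNP allele frequencies in populations 1 and 2
    n1, n2: number of sampled chromosomes in each population
    """
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for sampling noise.
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        den += a * (1 - b) + b * (1 - a)
    return num / den if den > 0 else float("nan")


# Toy, made-up frequencies: a hypothetical Mozambican Bantu sample versus
# a hypothetical western Bantu sample, 100 chromosomes each.
immune_moz = [0.10, 0.80, 0.35]
immune_west = [0.40, 0.45, 0.70]
background_moz = [0.20, 0.55, 0.60]
background_west = [0.22, 0.50, 0.63]

fst_immune = hudson_fst(immune_moz, immune_west, 100, 100)
fst_background = hudson_fst(background_moz, background_west, 100, 100)

print(f"Fst, immune regions:    {fst_immune:.3f}")
print(f"Fst, genome background: {fst_background:.3f}")
```

The prediction is simply that the immune-region value sits well above the background. Note that the estimator can come out slightly negative when two samples are nearly identical, which is sampling noise and not a real negative distance.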

Which brings me back to Stonehenge. We do have ancient genomes. But not that many. Especially further back. Apparently the names of rivers and mountains often have very deep histories. For example, the river Humber has a name which may date back to pre-Celtic times (consider the Mississippi river, which has an American Indian origin). These serve as shadows of cultures long gone and replaced. The Bantu expansion is close enough to the margins of history that we don’t have so much time interposed between it and concrete records. We can skein out its outlines with more rigor and surety. And the patterns we see among the Bantus can give us a sense of how past demographic-cultural expansions may have occurred.

* The papers coming out of the PoBI project suggest that a significant minority of the ancestry in eastern England is Anglo-Saxon. But only there.

Addendum: I can’t find the data to download and test some things myself.

The coming of the Milesians: abstract of “The Bell Beaker Paper” (tBBp)

Filed under: Ancient Europe,Bell Beakers,Genetics,History — Razib Khan @ 10:18 am

I get asked about this all the time, and promised I’d post something as soon as I heard anything, so here is a foretaste, Western Europe during the third millennium BCE: A genetic characterization of the Bell Beaker Complex:

The Bell Beaker Complex (BBC) was the first widely distributed archaeological phenomenon of western Europe, arising after 2800 BCE probably in Iberia and spreading to the north and east before disappearing at the latest by 1800 BCE. An open question is the extent to which the cultural elements associated with the BBC spread through movement of ideas or people. We present new genome-wide DNA data from 196 Neolithic and Bronze Age Europeans – the largest report of genome-wide data in a single study to date – and merge it with published data to form a dataset with 109 BBC individuals that provides a genomic characterization of the BBC across its geographic and temporal range. In contrast to people of the Corded Ware Complex who were partly contemporaries of the BBC in central and eastern Europe and who brought steppe ancestry into central Europe through mass migration and replacement of local populations, we show that the initial spread of the BBC into central Europe from the Iberian Peninsula was not mediated by a large-scale migration but rather through communication of ideas. However, the further spread of the BBC beyond central Europe did involve mass movement of people. Focusing on Britain, which includes 81 of our new samples in a time transect from 3900-1300 BCE, we show that the arrival of the BBC around 2400 BCE was mediated by migration from the continent: British individuals associated with Beakers are genetically indistinguishable from continental individuals associated with the same material culture and genetically nearly completely discontinuous with the previously resident population. Such discontinuity persists through to samples from the Bronze Age, documenting a demographic turnover at the onset of the Bronze Age that was crucial to understand the formation of the present-day British gene pool. 
The arrival of the BBC in Britain can thus be viewed as the western continuation of the massive movement of people that brought the Corded Ware Complex and steppe ancestry into central Europe a few hundred years before.

Ancient DNA has revolutionized our understanding of the past. In a fundamental manner many archaeologists were wrong in assuming that the dominant dynamic of the spread of culture was that of the diffusion of ideas, as opposed to the movement of peoples. But to interpret these results it is clear that archaeological knowledge must be brought to bear, albeit updated with new prior assumptions.

It would not be entirely surprising if the originators of a cultural complex transmitted it to another group, and then that culture “hitchhiked” on the demographic expansion of the receiving group. A good example would be Roman Catholic Christianity. The Iberians spread it to the New World, along with substantial demographic movement. But the religion itself did not spread to Iberia through migration, but rather through cultural shift.
