Razib Khan One-stop-shopping for all of my content

April 10, 2018

European hunter-gatherers were mostly replaced, but not totally. And they were neither black nor white

Filed under: European genetics,European Genomics — Razib Khan @ 8:22 pm

Peter Frost over at his blog has a long post on the transition to agriculture and pastoralism in Northern Europe.

He tagged me on Twitter, so presumably, he’s soliciting my opinion/response.

The post starts off with a quick reference to the attempt to leverage massive replacement in Northern Europe eight to four thousand years ago in the interests of contemporary politics. I’m not going to address that because I’m not very interested in how these topics relate, and I won’t post (or will delete) comments that engage with that. I will focus on the science.

First, I tried to leave a comment on his weblog and blogger ate it. So I’m just going to put a post here in the interests of open exchange. I also think many readers here have some of the same opinions as Peter, or suspicions, so it might be best to clear things up.

I don’t think Peter’s argument can really be understood without reading his 2006 paper, European hair and eye color: A case of frequency-dependent sexual selection? My opinion of this hypothesis is that it’s probably wrong. I’m more skeptical than I was when I first read the paper, because we now understand more about the settlement of Europe during the late Pleistocene and early Holocene. But there is still a small window for it to be correct, as one can see in Peter’s post.

The argument hinges a lot on the pigmentation profiles of proto-European groups based on predictions from algorithms which use modern Europeans as a training set. These predictions are in the papers themselves, so Peter isn’t doing anything that the authors didn’t do. But, I have come to the conclusion that they’re probably not trustworthy. These ancient populations were very different from modern Europeans, and their genetic architecture for pigmentation may have been different (modern Europeans are a compound of several groups).

Though Mesolithic Western European hunter-gatherers were probably darker in complexion than modern Europeans, I believe it is likely that they were not nearly as dark as pigmentation prediction algorithms suggest. Second, it is true that alleles correlated with blonde hair in Europeans within the KITLG locus are found in Siberia nearly 20,000 years ago. But it is not true that “Ancient DNA from Afontova Gora has shown that people had blond hair in mid-Siberia as early as 18,000 years ago.”

What has been found is that Europeans who carry the derived variant at rs12821256 are more likely to have blonde hair. Heterozygotes are twice as likely, and homozygotes four times as likely, at least against the population base rate. The frequency in Scandinavia of the derived variant is ~20%. Many blonde people don’t have the derived variant. And not all people who have the derived variant have blonde hair.
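To make the base-rate point concrete, here is a minimal sketch. It takes the multipliers above literally (2x and 4x the population base rate) and assumes Hardy-Weinberg proportions at a 20% allele frequency. The base rates in the loop are hypothetical values chosen only for illustration, and the function names are mine, not from any paper.

```python
# Hypothetical illustration: a relative risk against a base rate is not a
# deterministic genotype-to-phenotype rule.

# Multipliers from the text: heterozygotes 2x, homozygotes 4x the base rate.
RISK_MULTIPLIER = {1: 2.0, 2: 4.0}


def hardy_weinberg(derived_freq):
    """Return (zero-copy, one-copy, two-copy) genotype frequencies.

    Assumes random mating, which is an idealization.
    """
    q = derived_freq
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)


def blond_probability(base_rate, copies):
    """Chance of blond hair for a carrier of 1 or 2 copies, capped at 1.0."""
    return min(1.0, RISK_MULTIPLIER[copies] * base_rate)


if __name__ == "__main__":
    none, het, hom = hardy_weinberg(0.20)
    print(f"genotypes at 20% frequency: {none:.2f} / {het:.2f} / {hom:.2f}")
    # The base rates below are made up; nobody knows the right value for an
    # ancient Siberian population.
    for base_rate in (0.02, 0.10, 0.30):
        print(base_rate,
              blond_probability(base_rate, 1),
              blond_probability(base_rate, 2))
```

The same genotype gives very different answers depending on the base rate you plug in, and the base rate of an ancient population is exactly what we don't know.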

Of my three children two are heterozygotes for the derived variant (they carry one copy). Probably not coincidentally these two have lighter hair than the third. But neither is really blonde, though perhaps they are blond(ish) during certain times of the year. More accurately their hair is probably sandy brown. Why? I’m their father, and as a normally complected South Asian, I give them a host of alleles at other loci which make them different from the typical European genetic architecture of pigmentation.

As I said earlier Peter can’t really be blamed for making these inferences because they are in the scientific literature themselves. But just because they’re there doesn’t make them true (though I do think Peter should be careful about extrapolating from odds ratios against a particular base rate probability to some deterministic relationship).

A final issue is the idea that the alleles that define modern Northern European pigmentation were present in Scandinavian and Eastern European hunter-gatherers. This is correct. But again, modern prediction algorithms are trained on groups with modern genetic backgrounds. In mixed populations, the largest effect QTLs explain only half the variance in pigmentation. The rest of it is accounted for by “genomic ancestry”, which basically means there are loci associated with ancestral groups that haven’t been discovered yet. But a second and more important issue is that the frequency of some of the alleles in modern Northern European groups is different from what you find in the ancient ones. The ancestral variant on SLC24A5 is almost impossible to find in Northern Europe in indigenous people today (in Europeans the ancestral variant is most often found in Spain, due to admixture with Africa during the Moorish period). I don’t need to review the literature, but there is evidence for a fair amount of selection on these loci within the last 4,000 years. Even SHG and EHG still segregated ancestral variants at higher frequencies than modern Europeans.

The second major theme in the blog post has to do with hunter-gatherer ancestry. There’s a section on haplogroup U where Peter suggests that its disappearance is due to selection, not a replacement. U is associated with hunter-gatherer ancestry. This may be true, but mtDNA and Y need to be interpreted cautiously in any case (both R1b and R1a are far more common than one might predict from autosomal distributions of the ancestry of populations in which they were originally found).

Then there is the argument that bottlenecks/founder effects and natural selection might have skewed our estimates. I don’t really get the former argument at all:

Founder effects may be another causal factor. When bands of hunter-gatherers are given the opportunity to adopt farming, most of them turn up their noses and only a few will make the change. Because those few bands are not perfectly representative of the hunter-gatherer gene pool, and because their numbers may increase many times over (thanks to the increase in food supply) the resulting founder effects will be substantial.

These are verbal models, and unpersuasive to anyone who has looked at the data and generated results. Mesolithic hunter-gatherers were a genetically homogeneous lot to begin with. They didn’t have all this variance to sample from. There was a later increase in hunter-gatherer ancestry into European farmers from demographic reservoirs, but the argument about founder effect doesn’t work because the two groups are so different that playing around with biasing the sample from which one mixes does not change the overall result. Replace hunter-gatherer and farmer with “Ashkenazi Jew” and “Chinese.” The latter two groups have some variance, but a bottleneck on one isn’t going to change one’s estimate of admixture in a daughter population.
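A toy simulation shows why the bottleneck doesn't matter much. This is a sketch under loud assumptions, not how the published analyses were done: allele frequencies in the two sources are drawn independently (a crude stand-in for "very different" groups), the daughter population is a simple linear mix, and the admixture fraction is estimated by least squares. All parameter values are hypothetical.

```python
import random


def estimate_admixture(daughter, source_a, source_b):
    """Least-squares estimate of the source_a fraction in the daughter."""
    num = sum((d - b) * (a - b)
              for d, a, b in zip(daughter, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    return num / den


def bottleneck(freqs, n_chromosomes, rng):
    """Resample each allele frequency from a small founding group."""
    return [sum(rng.random() < p for _ in range(n_chromosomes)) / n_chromosomes
            for p in freqs]


def simulate(true_fraction=0.3, n_loci=2000, n_founders=40, seed=1):
    """Mix a bottlenecked subset of A with B, then estimate against full A."""
    rng = random.Random(seed)
    # Reference panels for the two source populations (hypothetical values).
    ref_a = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
    ref_b = [rng.uniform(0.05, 0.95) for _ in range(n_loci)]
    # Only a small, unrepresentative band of A actually contributes.
    founders_a = bottleneck(ref_a, n_founders, rng)
    daughter = [true_fraction * fa + (1.0 - true_fraction) * b
                for fa, b in zip(founders_a, ref_b)]
    # The estimate still uses the unbottlenecked reference for A.
    return estimate_admixture(daughter, ref_a, ref_b)


if __name__ == "__main__":
    print(f"true fraction 0.30, estimated {simulate():.3f}")
```

In this toy setup the founder noise averages out across loci because the difference between the sources is so much larger than the drift within one of them.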

The issue about selection suffers from the problem that the magnitude would have to be too large and extensive across the whole genome to reshape hunter-gatherers in this manner to be plausible. One might imagine a case where gene flow and selection on parts of the genome from the donor group inflates the donor group proportion…but I don’t think that’s Peter’s point? Theoretically, a model of admixture followed by sweeps around one population’s ancestry component is possible, but I don’t think we see evidence of that in the ancient DNA.

In any case, though the verbal argument seems reasonable on first blush, the models and dynamics don’t work out.

Peter ends:

Some of the confusion in this debate may arise from the assumption that “late hunter-gatherers” formed a single group in Europe. In fact, there were at least three such groups (WHGs, SHGs, EHGs), whose genetic profiles significantly differed from each other and whose fates were likewise different. WHGs were an evolutionary dead end. They were replaced. The same cannot be said for the hunter-fisher-gatherers of Scandinavia and the Baltic, who were able to achieve high population densities by exploiting marine resources (Price 1991). With them we see more genetic continuity than rupture, and it is possible that some genetic characteristics formerly ascribed solely to “Anatolian” farmers were in fact of SHG origin.

The people who are making the assertions that Peter is rebutting are not confused as to the nature of the populations which they named and which they modeled. Peter can download the data and replicate the analyses himself. WHG, SHG and EHG seem to exist on some sort of continuum, with post-“Villabruna cluster” ancestry at one end of the spectrum and post-Ancestral North Eurasian (ANE) ancestry at the other. WHG is mostly descended from ancestors of the Villabruna cluster, who share a common ancestry derived from late Pleistocene West Eurasians with Anatolian farmers (the latter of whom admixed with Basal Eurasians). EHG is a mix of the same Villabruna people (or at least their eastern fringe), but with a preponderance of ANE-like ancestry. SHG is between these two groups.

It also seems that European hunter-gatherers sometime in the late Pleistocene and/or early Holocene received a small but detectable pulse of East Asian ancestry. Also, commonly shared haplotypes with West Asians on SLC24A5 (SHG and EHG) and with East Asians on EDAR (SHG) indicate some gene flow with other places (though I believe SHG has no detectable East Asian ancestry).

Finally, there is much discussion of a late occupation of Northeast Europe by farmers. Since I predicted this 10 years ago I don’t have much objection to this section…except I don’t think that it supports his other points at all. That is, the persistence of hunter-gatherer populations around the Baltic does not mean that hunter-gatherers were more similar to farmers than we might think, nor does it reject the likelihood of total replacement in many areas of Europe to the south.

The overall conclusion here is two-fold:

  1. The assertions about pigmentation are not necessarily wrong, but the data support them far more weakly than one might infer from the post. Additionally, modern Europeans show lots of evidence of recent selection and allele frequency change at several of these loci.
  2. The assertions about very large misestimations of inferred mixing proportions are probably wrong.

December 1, 2012

Northern Europeans and Native Americans are not more closely related than previously thought

A new press release is circulating on the paper which I blogged a few months ago, Ancient Admixture in Human History. Unlike the paper, the title of the press release is misleading, and unfortunately I notice that people are circulating it, and probably misunderstanding what is going on. Here’s the title and first paragraph:

Native Americans and Northern Europeans More Closely Related Than Previously Thought

Released: 11/30/2012 2:00 PM EST
Source: Genetics Society of America

Newswise — BETHESDA, MD – November 30, 2012 — Using genetic analyses, scientists have discovered that Northern European populations—including British, Scandinavians, French, and some Eastern Europeans—descend from a mixture of two very different ancestral populations, and one of these populations is related to Native Americans. This discovery helps fill gaps in scientific understanding of both Native American and Northern European ancestry, while providing an explanation for some genetic similarities among what would otherwise seem to be very divergent groups. This research was published in the November 2012 issue of the Genetics Society of America’s journal GENETICS

 

The reality is that Native Americans and Northern Europeans are not more “closely related” genetically than they were before this paper. There has been no great change to standard genetic distance measures or phylogeographic understanding of human genetic variation. A measure of relatedness is to a great extent a summary of historical and genealogical processes, and as such it collapses a great deal of disparate elements together into one description. What the paper in Genetics outlined was the excavation of specific historically contingent processes which result in the summaries of relatedness which we are presented with, whether they be principal component analysis, Fst, or model-based clustering.

What I’m getting at can be easily illustrated by a concrete example. To the left is a 23andMe chromosome 1 “ancestry painting” of two individuals. On the left is me, and the right is a friend. The orange represents “Asian ancestry,” and the blue represents “European” ancestry. We are both ~50% of both ancestral components. This is a correct summary of our ancestry, as far as it goes. But you need some more information. My friend has a Chinese father and a European mother. In contrast, I am South Asian, and the end product of an ancient admixture event. You can’t tell that from a simple recitation of ancestral quanta. But it is clear when you look at the distribution of ancestry on the chromosomes. My components have been mixed and matched by recombination, because there have been many generations between the original admixture and myself. In contrast, my friend has not had any recombination events between his ancestral components, because he is the first generation of that combination.
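The contrast can be sketched in a few lines. This is a crude model, not what 23andMe does: I assume ancestry switch points fall along the chromosome as a Poisson process with a rate equal to the number of generations since admixture, and that each segment's ancestry is drawn independently. The chromosome length (about 2.8 Morgans) and the 100 generations are assumed values for illustration.

```python
import random


def mosaic_haplotype(generations, length_morgans, fraction_a, rng):
    """Return [(segment_length, ancestry), ...] for one chromosome copy."""
    segments = []
    position = 0.0
    while True:
        # Distance to the next possible switch point, in Morgans.
        step = rng.expovariate(generations)
        ancestry = "A" if rng.random() < fraction_a else "B"
        if position + step >= length_morgans:
            # Final segment runs to the end of the chromosome.
            segments.append((length_morgans - position, ancestry))
            return segments
        segments.append((step, ancestry))
        position += step


def first_generation_pair(length_morgans):
    """One unbroken copy from each parent: the first-generation case."""
    return [[(length_morgans, "A")], [(length_morgans, "B")]]


def count_switches(segments):
    """Number of places where ancestry changes along one copy."""
    return sum(1 for (_, x), (_, y) in zip(segments, segments[1:]) if x != y)


def ancestry_fraction(haplotypes, label="A"):
    """Share of total chromosome length carrying the given ancestry."""
    total = sum(length for hap in haplotypes for length, _ in hap)
    matching = sum(length for hap in haplotypes
                   for length, anc in hap if anc == label)
    return matching / total


if __name__ == "__main__":
    rng = random.Random(7)
    friend = first_generation_pair(2.8)
    me = [mosaic_haplotype(100, 2.8, 0.5, rng) for _ in range(2)]
    for name, pair in (("friend", friend), ("me", me)):
        switches = sum(count_switches(h) for h in pair)
        print(name, round(ancestry_fraction(pair), 2), switches)
```

Both individuals come out near 50/50 in the summary, but only one of them has ancestry switches along the chromosome, which is the information the summary throws away.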

So what the paper publicized in the press release does is present methods to reconstruct exactly how patterns of relatedness came to be, rather than reiterating well understood patterns of relatedness. With the rise of whole-genome sequencing and more powerful computational resources to reconstruct genealogies we’ll be seeing much more of this to come in the future, so it is important that people are not misled as to the details of the implications.

September 7, 2012

Across the sea of grass: how Northern Europeans got to be ~10% Northeast Asian

The Pith: You’re Asian. Yes, you!

A conclusion to an important paper, Nick Patterson, Priya Moorjani, Yontao Luo, Swapan Mallick, Nadin Rohland, Yiping Zhan, Teri Genschoreck, Teresa Webster, and David Reich:

In particular, we have presented evidence suggesting that the genetic history of Europe from around 5000 B.C. includes:

1. The arrival of Neolithic farmers probably from the Middle East.

2. Nearly complete replacement of the indigenous Mesolithic southern European populations by Neolithic migrants, and admixture between the Neolithic farmers and the indigenous Europeans in the north.

3. Substantial population movement into Spain occurring around the same time as the archaeologically attested Bell-Beaker phenomenon (HARRISON, 1980).

4. Subsequent mating between peoples of neighboring regions, resulting in isolation-by-distance (LAO et al., 2008; NOVEMBRE et al., 2008). This tended to smooth out population structure that existed 4,000 years ago.

Further, the populations of Sardinia and the Basque country today have been substantially less influenced by these events.

 

It’s in Genetics, Ancient Admixture in Human History. Reading through it I can see why it wasn’t published in Nature or Science: methods are of the essence. The authors review five population genetic statistics of phylogenetic and evolutionary genetic import, before moving onto the novel results. ...

August 21, 2012

The Sardinian meter

I cropped the image above from the paper Inference of Population Structure using Dense Haplotype Data. The main reason was to emphasize the distinctiveness of the Sardinian cluster, on the bottom right. As you can see this population exhibits a lot of coancestry across individuals. This isn’t too surprising: Sardinia is an island, and islands are often genetically distinctive. Gene flow prevents populations from diverging through random genetic drift, but water is a major impediment to gradual isolation by distance dynamics. The original Sardinians are naturally going to diverge from mainlanders over time, and begin to share the same set of common ancestors in the recent past, because their space of reasonable mating possibilities is constrained. The other population that looks similar in the heat map above is that of the Orkneys, off the north coast of Scotland (the Orkneys have a much smaller population than Sardinia, but they are also much closer to the mainland).

This is on my mind because Dienekes has a long post where he explores the D-statistic results of various European populations, using Sardinians ...

July 15, 2010

Really fine grained genetic maps of Europe

Filed under: European genetics,European Genomics,Genomics,History,Inbreeding — Razib Khan @ 12:41 am

A few years ago you started seeing the crest of studies which basically took several hundred individuals (or thousands) from a range of locations, and then extracted out the two largest components of genetic variation from the hundreds of thousands of variants. The clusters which fell out of the genetic data, with each point being an individual’s position, were transposed onto a geographical map. The figure to the left (from this paper) has been widely circulated. You don’t have to be a deep thinker to understand why things shake out this way; people are more closely related to those near than those far because gene flow ties populations together, and its power decreases as a function of distance.

Of course the world isn’t flat, and history perturbs regularities. Jews for example often don’t shake out where they “should” geographically, because of their historical mobility contingent upon random and often capricious geopolitical or social pressures. The Hazara of Afghanistan have their ethnogenesis in the melange of peoples who were thrown together after the Mongol conquest of Central Asia and Iran in the 13th century, and the subsequent collapse of the Ilkhan dynasty. Though the Hazara have mixed with their Persian, Tajik and Pashtun neighbors, they still retain a strong stamp of Mongolian ancestry which means that they are at some remove on the “genetic map” from their geographical neighbors.


So when interpreting these sorts of results you have two extreme dynamics operative. On the one hand you have an equilibrium state where gene flow is mediated through continuous but small flows of migration; women moving between villages, younger sons venturing out of the village in search of better opportunities. Then you have the random (or perhaps modeled as a Poisson distribution) “shocks” which are attributed to world-historical (or region-historical) events which leave an outsized and often perplexing stamp and distort the genetic map from the geographic one. Sometimes the two are not in balance. In much of the New World and Australasia the native populations were genetically replaced by settlers from the outside. Thousands of years of genetic variation accumulated and shaped by localized gene flow events were wiped clean off the map by the demographic tsunami.

Obviously that’s an extreme scenario. The macroscale does not always render the microscale irrelevant in such a fashion. A new short paper in The European Journal of Human Genetics gives us an example. Genes predict village of origin in rural Europe:

The genetic structure of human populations is important in population genetics, forensics and medicine. Using genome-wide scans and individuals with all four grandparents born in the same settlement, we here demonstrate remarkable geographical structure across 8–30 km in three different parts of rural Europe. After excluding close kin and inbreeding, village of origin could still be predicted correctly on the basis of genetic data for 89–100% of individuals.

Here’s the ubiquitous PC chart, except on the scale of villages:

[Figure: principal component plot of individuals, by village]

As noted above they excluded close relatives, out to second cousins. They judge that the genetic time depth back to common ancestry is ~120 years. Remember that if their grandparents are from this village they obviously are going to be somewhat inbred, from the perspective of an American whose ancestors are from different nations. But for most of history the European case was the typical one, not the American one where people from different continents mingled.

Here’s part of the discussion which I think needs highlighting:

To explore how many markers are required to recover these fine scale patterns of structure, we ranked SNPs by FST among villages and repeated the PCA for the most differentiated subsets of 30 000, 10 000, 3000 and 300 SNPs in each population. In all three populations, 10 000 or more high FST SNPs recovered an essentially identical picture to that using the full data set, and even 3000 SNPs preserved considerable separation between the villages (not shown). Using only the most discriminating 300 SNPs, little structure could be observed between the two Croatian villages; however, in Scotland and Italy one of the three settlements included in each location remained completely differentiated from the other two (not shown). We note that these results are only indicative of the minimum number of SNPs required to separate these populations, as by necessity SNPs have been selected intrinsically on the basis of FST within the same data set, rather than extrinsically from other data.

The slightly lower differentiation of the Croatian villages is not surprising given the fact that they are physically the closest of those considered here, being 8 km apart, with only low hills separating them. In contrast, the settlements in the Scottish Isles and Italy are separated by 15–30 km of sea in the former case, and of 3000 m mountains in the latter, although there are deep connecting valleys.
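The ranking step in the quoted passage can be sketched as follows. The excerpt doesn't say which FST estimator the authors used, so this uses the simplest textbook form (variance in allele frequency among villages divided by p(1 - p)), with equal weight per village and no sample-size correction. The SNP names and frequencies in the usage lines are made up.

```python
def wright_fst(village_freqs):
    """Variance-based FST for one SNP across villages (equal weights)."""
    n = len(village_freqs)
    mean = sum(village_freqs) / n
    if mean <= 0.0 or mean >= 1.0:
        # A SNP fixed everywhere carries no information on structure.
        return 0.0
    variance = sum((p - mean) ** 2 for p in village_freqs) / n
    return variance / (mean * (1.0 - mean))


def top_snps_by_fst(freq_table, k):
    """Return the k SNP ids with the highest FST.

    freq_table maps a SNP id to its list of per-village allele frequencies.
    """
    ranked = sorted(freq_table,
                    key=lambda snp: wright_fst(freq_table[snp]),
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical allele frequencies in two villages.
    table = {
        "snp_a": [0.5, 0.5],
        "snp_b": [0.1, 0.9],
        "snp_c": [0.4, 0.6],
    }
    print(top_snps_by_fst(table, 2))
```

As the authors themselves note, picking SNPs by FST from the same data you then test on is circular, so a subset chosen this way gives only a lower bound on how many markers you'd need in practice.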

First, we get a sense of the range of informative markers necessary to discern population structure well in much of the Old World. For continental races (e.g., Europeans vs. East Asians) you need on the order of 10-100 markers to distinguish them with a high degree of confidence (closer to the low bound than the high). It looks like in the case of village vs. village differences, it will be on the order of 100-1000 markers. I suspect in Iraq or the Caucasus you’ll need fewer than 300 markers, because genetic differentiation is higher over a shorter distance due to inbreeding, ethnic diversity, and geography (more the former in Iraq, more the latter in the Caucasus). In contrast, in regions where geography is conducive to transport and local norms enforce exogamy I wouldn’t be surprised if you need more like a thousand markers.

Second, observe the importance of topographical detail. I have observed before that Sardinia is a genetic outlier in Europe. That’s not because Sardinians interbred with native elves of that island. Rather, a water barrier serves as a major check on continuous gene flow mediated by banal contacts (e.g., going to the market and meeting a person from the neighboring village). Islands become worlds unto themselves. Though they are affected by the exogenous shocks, they are less subject to the continuous gene flow at the equilibrium because the water serves as a barrier. Similarly mountains can produce genetic barriers as well, because they make travel rather difficult. In Consanguinity, Inbreeding, and Genetic Drift in Italy L. L. Cavalli-Sforza documents in detail through Roman Catholic Church records what a big impact modern roads had on inbreeding coefficients, which plunged in the 19th century. Distortions of the genetic map tell us about variations in elevation in the third dimension on the geographic map!

The utility of this sort of data collection and analysis in the modern world is an empirical question. On the one hand many Europeans are relatively less inclined to move in comparison to Americans. And yet the breaking down of borders with the European Union and the likely need for a more productive economic sector on that continent because of changing demographics point to greater mobility, migration and mixing, which would make these sorts of studies of only near-term use. Of more interest to me are going to be fine-grained analyses of social groups, for example the Indian caste system. Last fall in the Reich et al. paper the authors seemed to be indicating the likelihood of a lot of between-population variance among these groups. It doesn’t matter if a particular Bania sub-caste from Gujarat is scattered across the world, from Kenya to England to the United States. They may all still marry amongst a set of individuals who hail from the same original few villages.

Good times.

Citation: O’Dushlaine, C., McQuillan, R., Weale, M., Crouch, D., Johansson, Aulchenko, Y., Franklin, C., Polašek, O., Fuchsberger, C., Corvin, A., Hicks, A., Vitart, V., Hayward, C., Wild, S., Meitinger, T., van Duijn, C., Gyllensten, U., Wright, A., Campbell, H., Pramstaller, P., Rudan, I., & Wilson, J. (2010). Genes predict village of origin in rural Europe. European Journal of Human Genetics. DOI: 10.1038/ejhg.2010.92
