About five years ago Kari Stefansson published an interesting paper, A common inversion under selection in Europeans. The basic thrust of the results was that a particular genomic region in Europeans exhibited a pattern of variation whereby there was one variant which was inverted in relation to the modal type. They labelled them “H2” and “H1” respectively. The region in question spans ~900 kilobases on chromosome 17 and has within it the MAPT gene, which is implicated in several neurological diseases. Stefansson et al. argued that H2 and H1 were long coexistent lineages, prevented from recombining due to the molecular genetic constraints of the chromosomal inversion, and each preserved within several human populations by balancing selection dynamics. That is, natural selection exhibited dynamics whereby neither variant could replace the other because their fitness was optimized at intermediate frequencies. In the human population as a whole H2 is far less common and seems to have less genetic variation. In the Icelandic population they also found that H2 seemed to be correlated with greater fertility, suggesting that natural selection was currently operating upon it (any trait correlated with fertility is naturally a more “fit” one).
A new paper puts the focus once more on this region, but takes a broader view by moving away from looking at Icelanders as a test population and surveying a wider range of peoples, as well as comparing the genetic variation in this region across primate species. The Distribution and Most Recent Common Ancestor of the 17q21 Inversion in Humans:
The polymorphic inversion on 17q21, sometimes called the microtubular associated protein tau (MAPT) inversion, is an ∼900 kb inversion found primarily in Europeans and Southwest Asians. We have identified 21 SNPs that act as markers of the inverted, i.e., H2, haplotype. The inversion is found at the highest frequencies in Southwest Asia and Southern Europe (frequencies of ∼30%); elsewhere in Europe, frequencies vary from < 5%, in Finns, to 28%, in Orcadians. The H2 inversion haplotype also occurs at low frequencies in Africa, Central Asia, East Asia, and the Americas, though the East Asian and Amerindian alleles may be due to recent gene flow from Europe. Molecular evolution analyses indicate that the H2 haplotype originally arose in Africa or Southwest Asia. Though the H2 inversion has many fixed differences across the ∼900 kb, short tandem repeat polymorphism data indicate a very recent date for the most recent common ancestor, with dates ranging from 13,600 to 108,400 years, depending on assumptions and estimation methods. This estimate range is much more recent than the 3 million year age estimated by Stefansson et al. in 2005.
Note that the differences between H1 and H2 are not simply ones of particular marker SNPs; particular variants of H1 carry versions of MAPT which exhibit far greater transcriptional activity than those on H2. And there are significant differences between the two genomic variants when it comes to correlations with disease susceptibility, even if the underlying mechanistic relations have not been elucidated.
In any case, one of the more intriguing aspects of this paper is that they looked at 66 human populations (a mix of ALFRED and HGDP) and our nearest evolutionary relatives among the apes. The sample size for apes was only 15, and the results seemed a bit muddled (or perhaps the prose in this region was a bit unclear). They identified H1 and H2 by runs of particular alleles, a sequence of genetic variants diagnostic of H1 or H2. In some regions the various apes seemed to resemble H1 and in others H2. Interestingly, on sites where H1 is polymorphic the ape samples seem to resemble H2, implying that the genetic background against which H2 arose was rather ancient (since the divergence from apes is an ancient event). And, of alleles where H1 was polymorphic and H2 had an allele which was in H1, in four out of five cases H2 was ancestral.
The results from human populations are easier to visualize because there’s a map associated:
Really this isn’t a “European” variant. Here’s the relevant text:
The inversion haplotype is found at highest frequency in Mediterranean regions of Southwest Asia and Europe (31.6% in Druze, 31% in Samaritans, 23.5% in Palestinians, 26% in Bedouins, 23.9% in French Basques, 32.2% in Spanish Basques, 20.9% in Catalans, 27.7% in Greeks, 37.5% in Sardinians, 31.9% in Toscani, and 36.8% in Roman Jews) and at moderately high levels in Northern Africa (13.3% in the Mozabite). It is also found at a high frequency in Ashkenazi Jews (25.6%), which we have shown to group with the Southwest Asians….Elsewhere in Europe, we see that the frequency is high in Western, Central, and Southeast Europe (18.9% in French, 15% in Danes, 17.7% in the Irish, 28% in Orcadians, 21.4% in European Americans, 23.9% in Hungarians, and 15.7% in the Adygei) and much lower in Eastern and Northern Europe (9.8% in the Chuvash, 6% in the Archangel Russians, 9.4% in the Vologda Russians, and 4.3% in the Finns) and on the Arabian Peninsula (11.9% in the Yemenite Jews and 9.4% in Kuwaitis).
Since they mention it, I thought I would quickly post a map of the spread of farming in Europe. Darker represents earlier dates for the dominance of agriculture within a particular region. H2 is found in other Eurasian populations as well, though at lower frequencies, from ~10% in the Arabian peninsula and in Pakistan to ~3% in South India. It is absent in East Asia, and there are strong suspicions that its presence in the Amerindian samples is due to recent admixture (this is something that regularly crops up with these HGDP samples). But, importantly, H2 is also present at low frequencies in many African populations, including the Pygmies (though seemingly almost absent from West Africa). If H2 is very ancient (as Stefansson et al. argue) then its origins lie in Africa, and it was introduced to Eurasia by the Out of Africa expansion which saw the replacement of archaic H. sapiens by anatomically modern H. sapiens from Africa. Its later higher frequency in parts of western Eurasia could be due to demographic parameters such as random genetic drift through a bottleneck, or localized natural selection, or a combination. If H2 arose in the Middle East its presence in Africa could be explained by back-migration. I was immediately skeptical of this model because H2 is extant at frequencies of 5% among the Mbuti Pygmies. The Mbuti are relatively isolated genetically from the Bantu farmers who have come to dominate their region. If there was any group which represented the ancient genetic variation of Central Africa, it is likely the Mbuti. There are suggestive patterns in the data of this paper which point to an African origin for H2:
We identified an H1 haplotype (blue stripes) that differs from the H2 haplotype (red stripes) only at the inversion marker sites and is therefore the likely haplotype on which the inversion initially arose. This haplotype is found throughout the world at an average frequency of 7.8%. It is most frequent in Africa ranging from 6.9% in the Mbuti Pygmies to 25% in the Biaka Pygmies with an average frequency of 14.8%. It is much less frequent in Southwest Asia, ranging from 4.8%–9.2% with an average frequency of 6.5%. These data support an African origin of the inversion, but are not sufficient to rule out a Southwest Asian origin.
Haplotypes, sequences of genetic variants, can be related to each other on a phylogenetic tree. There are haplotypes which have more derived variants, and those which have more ancestral variants. It seems that the African H2 variants are more likely to be the ones which arose from the genetic background of H1. So in this model the high frequency of H2 in the Middle East is not due to time of residence, but a function of random processes or natural selection.
But perhaps the most interesting find in this paper is their result that H2 is relatively recently derived in relation to H1, as opposed to having diverged 3 million years ago as implied by Stefansson et al. They looked at the variation on short tandem repeats and, using a molecular clock method, inferred the point of coalescence back to an ancestral lineage. Here’s what they found:
Assuming an average generation time of 25 years, this puts the MRCA at 16,400–32,800 years ago. However, if we assume that the African haplotype is the ancestral haplotype, we get an estimate of 2167.4–4334.7 generations. With 25 years per generation, this puts the MRCA at 54,200–108,400 years ago.
This recent date for the MRCA is also supported by our SNP data. Of the 90 SNPs typed, only four were variable on the H2 chromosomes, whereas 68 of the 90 are variable on the H1 chromosomes. This lack of polymorphism on the H2 chromosomes in comparison to H1 chromosomes would suggest that the H2 inversion is younger than the H1 orientation.
The first number assumes that the Middle Eastern variant is the ancestral one, while the second the African. Unlike the authors I suspect that the African variant is probably ancestral to a relatively high degree of confidence because of the Mbuti data point. This would place the emergence of H2 from the H1 background around the time of the Out of Africa migration. Humans would have exhibited polymorphism on this locus before they emigrated. Since H2 is not found in West Africa it may even reflect population substructure within Africa from before the Out of Africa migration (Eurasians are derived from northeast Africans).
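The dates in the quoted passage are simply generation counts multiplied by an assumed generation time. Here is a minimal sketch of that arithmetic, using the 25-year generation time and the 2167.4–4334.7 generation range the authors state; the function name is my own, not anything from the paper:

```python
def generations_to_years(generations, years_per_generation=25):
    """Convert a coalescence estimate in generations to years.

    The 25-year default is the generation time assumed in the quoted text.
    """
    return generations * years_per_generation


# Generation range quoted for the case where the African haplotype is ancestral.
low_generations, high_generations = 2167.4, 4334.7

low_years = generations_to_years(low_generations)
high_years = generations_to_years(high_generations)

# Rounded to the nearest hundred years, as the quoted figures appear to be.
print(round(low_years, -2), round(high_years, -2))
```

Rounded to the nearest hundred, these reproduce the quoted 54,200–108,400 year range. Note how sensitive the dates are to the generation time: a shorter assumed generation would pull every estimate proportionally closer to the present.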
And yet remember the earlier data using non-human primates which implied that perhaps H2 was the more ancient variant? From the discussion you can see how the authors resolve the conflict here:
Given the global distribution described here combined with the data of Zody et al., we propose a model in which the H2 orientation is the NHP [non-human primate] ancestral orientation; however, the H1 orientation is ancestral in humans. Under this theory, sometime after the divergence of Pan and Homo the region inverted to the H1 orientation in the Homo line. The H1 then rose to fixation. Then, in modern humans the inversion occurred once again, leading to the H2 chromosomes found in humans. Zody et al. showed that the region is susceptible to inversion, so it is not impossible to imagine an inversion occurring twice on the Homo line
The authors point out that some have suggested that the H2 inversion might have jumped into west Eurasian populations from archaic H. sapiens, in particular Neandertals. This would explain the lack of recombination, as two distinct breeding populations would naturally not recombine their genetic material. The authors seem skeptical of this suggestion, and I am again more skeptical than they because I assume an African origin for H2 to a higher degree of confidence than they do. That being said, they note that further reconstruction of the Neandertal genome would likely solve this dispute.
Finally they touch upon the question of neutral vs. adaptive dynamics. That is, can the frequencies of H2 vs. H1 be explained by a combination of various demographic parameters such as random genetic drift and subsequent admixture between isolated populations, or by natural selection, whereby traits on H2 entailed a higher fitness for H2, so that it rose in frequency across disparate populations? Naturally the two do not necessarily exclude each other. A simple neutral model would explain the lack of H2 in East Asia through genetic drift: as populations go through serial bottlenecks, most genetic variation is lost and a few lineages predominate. So H2 went extinct in East Asia via this model. In the Middle East H2 rose in frequency through random forces and then was spread to Europe through the migration of Neolithic farmers.
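To make the neutral scenario concrete, here is a toy Wright-Fisher sketch of drift in a small population. Everything in it is an illustrative assumption on my part (the haploid model, the population size, the starting frequency, the function names); none of it is taken from the paper:

```python
import random


def drift(p0, pop_size, generations, seed=0):
    """Neutral Wright-Fisher drift for one variant in a haploid population.

    Each generation, every one of the pop_size chromosomes independently
    inherits the variant with probability equal to its current frequency.
    """
    rng = random.Random(seed)
    p = p0
    for _ in range(generations):
        copies = sum(rng.random() < p for _ in range(pop_size))
        p = copies / pop_size
        if p in (0.0, 1.0):  # lost or fixed: no further change is possible
            break
    return p


def fraction_lost(p0, pop_size, generations, replicates=200):
    """Share of independent replicate populations in which the variant is lost."""
    lost = sum(
        drift(p0, pop_size, generations, seed=i) == 0.0
        for i in range(replicates)
    )
    return lost / replicates


# Hypothetical bottleneck: a variant at 5% in a population of 50 chromosomes.
print(fraction_lost(p0=0.05, pop_size=50, generations=200))
```

Under pure drift a variant starting at frequency p0 is eventually lost with probability 1 − p0, so a rare variant carried through a tight bottleneck usually disappears. That is all the neutral story needs to account for the absence of H2 in East Asia.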
As the authors have already complexified the history of this genomic region, proposing two inversions to render explicable the particular patterns on H1 and H2 and their relation to non-human primates, I think there is no need to hew too closely to the principle of parsimony. It is suggestive to me that H2 is found at high frequencies in the Middle East, the region where agriculture arose first, and can be seen to correlate with regions where agriculturalists later settled. It may be that genes on H2 are useful for agriculturalists, at least as a form of balancing selection whereby the fitness of H2 decreases as its frequency rises, converging upon an equilibrium proportion with H1. These may be behavioral; recall that MAPT is implicated in neurological function, and it differs along the two lineages. Additionally, H2 may have spread to Europe with agriculture and the agriculturalists. The very low frequency of H2 among Finns is in line with my suggestion that northeastern Europe is the refugium for the pre-Neolithic genetic substrate of the continent. The low frequency among the Finns may be a function of their low rates of admixture with farmers whose original genetic signal was from the Middle East, as well as the fact that the Finns adopted a fully agricultural lifestyle relatively late, so selective pressures for H2 were weak until relatively recently.
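The kind of balancing selection I have in mind can be sketched with a toy model in which H2's fitness declines linearly as it becomes more common. The haploid model, the equilibrium value of 0.3 (picked only because it is near the ~30% frequencies quoted above) and the selection strength are all assumptions of mine for illustration, not estimates from any data:

```python
def next_frequency(p, p_eq=0.3, s=0.5):
    """One generation of haploid selection with frequency-dependent fitness.

    H2 has fitness above 1 when rarer than p_eq and below 1 when commoner;
    H1 has fitness fixed at 1. Both p_eq and s are hypothetical values.
    """
    w_h2 = 1.0 + s * (p_eq - p)
    w_h1 = 1.0
    mean_fitness = p * w_h2 + (1.0 - p) * w_h1
    return p * w_h2 / mean_fitness


def iterate(p0, generations, **kwargs):
    """Apply next_frequency repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, **kwargs)
    return p


# Start H2 rare and start it common: both runs head toward p_eq.
print(iterate(0.05, 500), iterate(0.9, 500))
```

Whether H2 starts rare or common it converges on the same intermediate frequency, which is why selection of this form would hold both lineages in a population rather than letting one replace the other.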
I’ll let the authors finish:
We have shown here that the 17q21 inversion is found at its highest frequencies in the Mediterranean region in Southern Europe, Southwest Asia, and North Africa. We have also shown that the MRCA of the inversion is much younger than the estimated date of divergence for the H1 and H2 haplotypes. Though we cannot rule out selection acting at the region, we think that both the restricted global distribution and the recent MRCA fit with a neutral model coinciding with an origin in Africa or Southwest Asia followed by demographic events occurring during the migration out of Africa into Southwest Asia and/or the Neolithic expansion out of Southwest Asia into Europe.
Citation: Donnelly, M., Paschou, P., Grigorenko, E., Gurwitz, D., Mehdi, S., Kajuna, S., Barta, C., Kungulilo, S., Karoma, N., & Lu, R. (2010). The Distribution and Most Recent Common Ancestor of the 17q21 Inversion in Humans. The American Journal of Human Genetics, 86 (2), 161-171. DOI: 10.1016/j.ajhg.2010.01.007