Razib Khan One-stop-shopping for all of my content

April 4, 2017

How Tibetans can function at high altitudes

Filed under: Altitude Adaptation,Evolution,Genetics,Genomics,Human Evolution,Tibetans — Razib Khan @ 11:10 am

About seven years ago I wrote two posts about how Tibetans manage to function at very high altitudes. And it’s not just physiological functioning, that is, fitness straightforwardly understood. High altitudes can cause a sharp reduction in reproductive fitness because women can not carry pregnancies to term. In other words, high altitude is a very strong selection pressure. You adapt, or you die off.

For me there have been two things of note since those original papers came out. First, one of those loci seems to have been introgressed from a Denisovan genetic background. I want to be careful here, because the initial admixture event may not have been into the Tibetans proper, but into earlier hunter-gatherers descended from Out of Africa groups, who were assimilated into the Tibetans as they expanded 5-10,000 years ago. Second, it turns out that dogs have also been targeted for selection on EPAS1 (the locus with the “Denisovan” introgression) for altitude adaptation.

This shows that in mammals at least there are a few genes which show up again and again. The fact that EPAS1 and EGLN1 were hits on relatively small sample sizes also reinforces their powerful effect. When the EPAS1 results initially came out they were highlighted as the strongest and fastest instance of natural selection in human evolutionary history. One can quibble about the details of whether this was literally true, but no one could deny that it was a powerful selective event.

A new paper in PNAS, Genetic signatures of high-altitude adaptation in Tibetans, revisits the earlier results with a much larger sample size (the research group is in China) comparing Han Chinese and Tibetans. They confirm the earlier results, but they also find other loci which seem to be likely targets of selection in Tibetans. Below is the list:

| SNP | A1 | A2 | Frequency of A1 (Tibetan) | Frequency of A1 (EAS, Han) | P value | FST | Nearest gene |
| --- | --- | --- | --- | --- | --- | --- | --- |
| rs1801133 | A | G | 0.238 | 0.333 | 6.30E-09 | 0.021 | MTHFR |
| rs71673426 | C | T | 0.102 | 0.013 | 1.50E-08 | 0.1 | RAP1A |
| rs78720557 | A | T | 0.498 | 0.201 | 4.70E-08 | 0.191 | NEK7 |
| rs78561501 | A | G | 0.599 | 0.135 | 6.10E-15 | 0.414 | EGLN1 |
| rs116611511 | G | A | 0.447 | 0.003 | 3.60E-19 | 0.57 | EPAS1 |
| rs2584462 | G | A | 0.211 | 0.549 | 3.90E-09 | 0.203 | ADH7 |
| rs4498258 | T | A | 0.586 | 0.287 | 1.70E-08 | 0.171 | FGF10 |
| rs9275281 | G | A | 0.095 | 0.365 | 1.10E-10 | 0.162 | HLA-DQB1 |
| rs139129572 | GA | G | 0.316 | 0.449 | 5.80E-09 | 0.036 | HCAR2 |

P value indicates the P value from the MLMA-LOCO analysis. FST is the FST value between Tibetans and EASs. Nearest gene indicates the nearest annotated gene to the top differentiated SNP at each locus except EGLN1, which is known to be associated with high-altitude adaptation; rs139129572 is an insertion SNP with two alleles: GA and G. A1, allele 1; A2, allele 2.

Many of these genes are familiar. Observe the allele frequency differences between the Tibetans and other East Asians (mostly Han). The sample sizes are on the order of thousands, and the SNP-chip had nearly 300,000 markers. What they found was that the between population Fst of Han to Tibetan was ~0.01. So only 1% of the SNP variance in their data was partitioned between the two groups. These alleles are huge outliers.
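To make the "huge outliers" point concrete, here is a minimal sketch of a per-SNP Fst computed from two allele frequencies. It uses a Hudson-style ratio with no finite-sample correction, which is an assumption on my part and not the estimator the paper used, so the values it produces should not be expected to match the FST column in the table. The function name `fst_two_pops` is made up for illustration.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Ratio-style Fst for one biallelic SNP from two population allele frequencies.

    Numerator: squared allele frequency difference between the populations.
    Denominator: probability that two alleles, one drawn from each population, differ.
    No sample-size correction is applied (illustrative simplification).
    """
    numerator = (p1 - p2) ** 2
    denominator = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    if denominator == 0.0:
        # Both populations are fixed for the same allele: no differentiation.
        return 0.0
    return numerator / denominator


# A1 frequencies (Tibetan, Han) taken from the table above.
table_freqs = {
    "EPAS1": (0.447, 0.003),
    "EGLN1": (0.599, 0.135),
    "MTHFR": (0.238, 0.333),
}
genome_wide_fst = 0.01  # the approximate Han-Tibetan average quoted in the post

for gene, (p_tibetan, p_han) in table_freqs.items():
    fst = fst_two_pops(p_tibetan, p_han)
    print(f"{gene}: Fst = {fst:.3f} ({fst / genome_wide_fst:.0f}x the genome-wide average)")
```

Even with this crude estimator, the EPAS1 and EGLN1 SNPs come out at tens of times the genome-wide figure, while MTHFR sits much closer to the background.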

The authors used some sophisticated statistical methods to correct for exigencies of population structure, drift, admixture, etc., to converge upon these hits, but even through inspection the deviation on these alleles is clear. And as they note in the paper it isn’t clear all of these genes are selected simply for hypoxia adaptation. MTHFR, which is quite often a signal of selection, may have something to do with folate metabolism (higher altitudes have more UV). ADH7 is part of a set of genes which always seem to be under selection, and HLA is never a surprise.

Rather than get caught up in the details it is important to note here that expansion into novel habitats results in lots of changes in populations, so that two groups can diverge quite fast on functional characteristics. The PCA makes it clear that Tibetans and Han have very little West Eurasian admixture, and the Fst-based analysis puts their divergence on the order of 5,000 years before the present. The authors admit honestly that this is probably a lower bound value, but I also think it is quite likely that Tibetans, and probably Han too, are compound populations, and a simple bifurcation model from a common ancestral population is probably shaving away too many realistic edges. In plainer language, there has probably been gene flow between Han and Tibetans less than 5,000 years ago, and Tibetans themselves probably assimilated more deeply diverged populations in the highlands as they expanded as agriculturalists. Quite often, an estimate of a single divergence fits a complex history to too simple a model.
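As a rough sketch of how an Fst value can be turned into a split time, one textbook approximation for two populations diverging by drift alone is Fst ≈ 1 − exp(−T / 2Ne), which rearranges to T = −2Ne · ln(1 − Fst) generations. The effective population size and generation time below are placeholder assumptions chosen for illustration, not values taken from the paper, and the paper's own method may differ.

```python
import math


def divergence_generations(fst: float, effective_size: float) -> float:
    """Generations since a clean split, assuming pure drift and no gene flow.

    Inverts Fst ~= 1 - exp(-T / (2 * Ne)). Both the model and the inputs
    are simplifying assumptions for illustration.
    """
    if not 0.0 <= fst < 1.0:
        raise ValueError("fst must be in [0, 1)")
    return -2.0 * effective_size * math.log(1.0 - fst)


HYPOTHETICAL_NE = 10_000         # assumed effective population size
HYPOTHETICAL_GENERATION_YEARS = 25  # assumed years per generation

generations = divergence_generations(0.01, HYPOTHETICAL_NE)
years = generations * HYPOTHETICAL_GENERATION_YEARS
print(f"{generations:.0f} generations, roughly {years:.0f} years")
```

The formula assumes one clean bifurcation. Any gene flow after the split pulls Fst down and so makes the inferred T too recent, which is one way to see why a single-split estimate behaves like a lower bound.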

The take home: understanding population history is probably important to get a better sense of the dynamics of adaptation.

Citation: Jian Yang, Zi-Bing Jin, Jie Chen, Xiu-Feng Huang, Xiao-Man Li, Yuan-Bo Liang, Jian-Yang Mao, Xin Chen, Zhili Zheng, Andrew Bakshi, Dong-Dong Zheng, Mei-Qin Zheng, Naomi R. Wray, Peter M. Visscher, Fan Lu, and Jia Qu, Genetic signatures of high-altitude adaptation in Tibetans, PNAS 2017; published ahead of print April 3, 2017, doi:10.1073/pnas.1617042114

October 20, 2010

Genetic watersheds on the Great Himalayas


One of the great geological landmarks on earth is the Himalayas. Not only are the Himalayas of importance in the domain of physical geography, but they are important in human geography as well. Just as South Asians and non-South Asians agree that the valley of the Indus and its tributaries bound the west of the Indian cultural world, so the Himalayas bound it on the north. Unlike many pre-modern constructions, such as the eastern boundary of Europe, the northern limit of South Asia is relatively clear and distinct. It is stark on a relief map; the flat Gangetic plain gives way to mountains. And it is stark on a cultural map; the languages of northern India give way to those of the world of Tibet. The religion of northern India gives way to the Buddhism of Tibet. In terms of human geography I believe that one can argue that the Himalayan fringe around South Asia exhibits the greatest change of ancestrally informative gene frequencies over the smallest distance when you exclude those regions separated by water barriers. Unlike the Sahara, the transect from northern India to Chinese Tibet at any given point along the border is permanently inhabited, albeit sparsely at the heights.

And yet despite the geographical barriers people and ideas did move across the Himalaya. The cultural influences upon Tibet from India are obvious. The script of Tibet is derived from India, while its form of Buddhism is the direct descendant of the last efflorescence of that religion in northern India. But while culture moved north, I do not see much evidence genetically that South Asians have been significant as an influence. This is somewhat shocking when you realize these two facts: the population of the Tibetan Autonomous Region is on the order of 5-6 million, while that of northern South Asia is around 1 billion (including Pakistan and Bangladesh). A 200-fold difference. And yet there is evidence of admixture between the two groups exactly where you’d expect: in Nepal. Below is a figure from a recent paper which shows how South and East Asian populations relate to each other. I’ve highlighted the Nepali groups, which span the two larger classes:


From the above figure it’s clear that there is considerable admixture among the Indo-European populations of Nepal with a Tibetan element. The Magar are a tribe which is representative of Tibet, with little South Asian genetic input presumably. The Newar are the Nepalese hybrids par excellence. To a great extent they can be viewed as the indigenous peoples of the Kathmandu region at the heart of modern Nepal. Their language is of Tibetan affinity, and yet it is heavily overlain with an Indo-Aryan aspect, and seems to have within it an ancient Austro-Asiatic substrate. Though predominantly Hindu today, the Newar have a substantial Buddhist minority whose roots may go back to the original Mahayana traditions which were once prominent in northern India. The Brahmin and Chetri groups are upper caste communities who claim provenance from the north Indian plain. Some of these upper caste groups in Nepal are of recent vintage, having fled the Islamic conquests of the Gangetic plain within the last 1,000 years. And yet even they have obvious Tibetan admixture. This should make one cautious about the excessive claims to genetic purity which South Asian caste groups make.

But admixture of a Tibetan or East Asian component in South Asia is not limited to Nepal. I have reedited a figure from a 2006 paper on Indian Americans which shows the inferred components of ancestry of various language-groups. It is clear that the northeastern groups, Bengalis, Assamese, and Oriya, have an affinity to East Asians. This is not just ancient east Eurasian ancestry, the “Ancestral South Indians” hypothesized in Reich et al. The South Indian groups (which I have excised from the figure) do not exhibit the same level of elevation of the ancestral quantum dominant among the Han Chinese in the bar plot. In fact the Reich et al. paper also reported evidence of an eastern ancestral element in some of the Munda-speaking groups of northeast India. This stands to reason as the Munda are a South Asian branch of the Austro-Asiatic family of Southeast Asia. But much of it may also be more recent, as groups such as the Ahom of Assam and the Chakma of Bangladesh seem to have arrived from Burma of late.

So we see that genes do flow around the margins of South Asia, and into it. And yet Tibet seems oddly insulated. Why? Because of adaptation. Like water, it seems in this case genes tend to flow downhill, not up, and the reason is likely the fitness differentials between lowland and highland populations along the slope of the Great Himalayas. A new paper in PNAS explores the issue by examining genetic variation among Indians, Tibetans, and worldwide populations, in relation to hypoxia implicated loci. EGLN1 involvement in high-altitude adaptation revealed through genetic analysis of extreme constitution types defined in Ayurveda:

It is being realized that identification of subgroups within normal controls corresponding to contrasting disease susceptibility is likely to lead to more effective predictive marker discovery. We have previously used the Ayurvedic concept of Prakriti, which relates to phenotypic differences in normal individuals, including response to external environment as well as susceptibility to diseases, to explore molecular differences between three contrasting Prakriti types: Vata, Pitta, and Kapha. EGLN1 was one among 251 differentially expressed genes between the Prakriti types. In the present study, we report a link between high-altitude adaptation and common variations rs479200 (C/T) and rs480902 (T/C) in the EGLN1 gene. Furthermore, the TT genotype of rs479200, which was more frequent in Kapha types and correlated with higher expression of EGLN1, was associated with patients suffering from high-altitude pulmonary edema, whereas it was present at a significantly lower frequency in Pitta and nearly absent in natives of high altitude. Analysis of Human Genome Diversity Panel-Centre d’Etude du Polymorphisme Humain (HGDP-CEPH) and Indian Genome Variation Consortium panels showed that disparate genetic lineages at high altitudes share the same ancestral allele (T) of rs480902 that is overrepresented in Pitta and positively correlated with altitude globally (P< 0.001), including in India. Thus, EGLN1 polymorphisms are associated with high-altitude adaptation, and a genotype rare in highlanders but overrepresented in a subgroup of normal lowlanders discernable by Ayurveda may confer increased risk for high-altitude pulmonary edema.

The paper itself is a follow up to a previous work attempting to see if there was a sense to the classification of constitutions found within Ayurvedic medicine. Like Chinese medicine this is a non-Western tradition which has different philosophical roots and axioms (Galenic medicine might be analogous). But in theory all medical traditions emerged to battle illness, so their target was unitary, the ailments which plague the human body. Therefore one might suppose that in fact there would be some sense in any long-standing medical tradition which has any empirical grounding, because human biology is relatively invariant. It is the relative clause which is of interest for the purposes of this paper, because the authors show how the classifications of Ayurvedic medicine seem to comport with the recent genetic evidence of high altitude adaptation! Specifically they found that particular Ayurvedic classes of individuals who seem to have negative reactions to high altitude exposure in the form of hypoxia tend to be carriers of particular EGLN1 genotypes.

I will at this point observe that since I don’t know much about Ayurveda I won’t address or cover that issue in detail. The paper is Open Access so you can read it yourself. So let’s move to the genetics. EGLN1 should be familiar to you by now. It’s cropped up repeatedly over the past year in studies of high altitude adaptation. It is a locus which seems to be a target of selection in both the peoples of the Andes and Tibet. Additionally, it has a peculiar aspect where the ancestral variant, the one found most frequently within Africa, seems to be the target of selection for altitude adaptation outside of Africa.

The slideshow below is an overview of the primary figures within this paper.

What do we take away from this? Well, one aspect which I think is important to emphasize is that genetic background matters, and there’s much we don’t know. In the conclusion the authors note that the altitude adaptation papers which I alluded to above were not published when the manuscript was being written, so they were not privy to the repeated and rather robust evidence that EGLN1 has been the target of natural selection, and that variation on the locus is correlated with variation in adaptation to higher altitudes. The widespread coverage of populations in this paper seems to almost obscure as much as highlight. What has African variation to do with this after all? Additionally one must always remember that a correlation at one given marker on a gene does not entail functional causation. We saw this with the markers which seemed to predict the odds of an individual of European ancestry having blue eyes; it turned out that the markers themselves were simply strongly associated with another SNP which was probably the real functional root behind the difference in phenotype.

Due to the replication of EGLN1 in both Andeans and Tibetans I am moderately confident that variation on this gene does have something to do with high altitude adaptation. What I am curious about is the fact that the ancestral alleles in many cases seem to be driven up in frequency. Is there an interaction between the genetic background of non-Africans and the SNPs in question which makes it beneficial toward altitude adaptation? Was there an initial relaxation of function as human populations moved out of Africa, which was slammed back on at high altitudes? There does seem to be a correlation within South Asian populations between hypoxia and high altitudes and particular variants on EGLN1. Focusing just on this region we can draw some reasonable inferences, but taking a bigger picture view and encompassing the whole world we’re confronted with a rather more confused, and perhaps more interesting, picture.

Back to the specific issue of the lack of South Asian imprint on the genes of Tibetan peoples, I think one can chalk this up to the fact that humans are animals, and so we’re constrained by geography and biology. Tibetans can operate efficiently at lower altitudes, and so have mixed with South Asians in these regions. In contrast, South Asians can not operate at higher altitudes, and so the impact on Tibetans was purely cultural, and not genetic. More broadly this may also point to long term geopolitical implications: the Han Chinese demographic domination of Tibet is always going to be a matter of water flowing uphill. Unless of course we flesh out the genetic architecture of these traits well enough that the Chinese government knows exactly which individuals among the 1.2 billion Han population would be most biologically prepared to reside in the Tibetan Autonomous Region, and so can proactively recruit them to settle in Lhasa and other strategic locations.

Citation: Shilpi Aggarwal, Sapna Negi, Pankaj Jha, Prashant K. Singh, Tsering Stobdan, M. A. Qadar Pasha, Saurabh Ghosh, Anurag Agrawal, Indian Genome Variation Consortium, Bhavana Prasher, & Mitali Mukerji (2010). EGLN1 involvement in high-altitude adaptation revealed through genetic analysis of extreme constitution types defined in Ayurveda PNAS : 10.1073/pnas.1006108107

Image Credit: Wikimedia Commons

September 14, 2010

The silver age of altitude adaptation

With all the justified concern about “missing heritability”, the age of human genomics hasn’t been a total bust. As I have observed before, in 2005’s excellent book Mutants the evolutionary geneticist Armand M. Leroi asserted that we really didn’t have a good understanding of normal variation of human pigmentation. At the time I think it was a defensible claim, but within three years I’d say that most of the mystery had been cleared up. Though there are still some holes to be plugged, and details to be elucidated, the genetic architecture of pigmentation is now understood more or less. By the fall of 2006 Richard Sturm penned a review titled A golden age of human pigmentation genetics, an age which I think in some ways was probably closed with his 2009 review Molecular genetics of human pigmentation diversity. It’s not surprising that many of the traits that 23andMe tells you about have to do with your pigmentation. Of course there’s some limited utility in this; one assumes that most individuals don’t gain much benefit from the knowledge that they have an “85% chance of having brown eyes,” though it may be useful in terms of offspring prediction (I would say I have an 85% chance of having brown eyes, but since I’m not European the genetic background isn’t right to make that probability assertion).

But as the golden age of pigmentation genetics comes to a close and the low hanging fruit is stripped bare, where next? I wonder if it may be altitude adaptations. Like pigmentation, altitude genetics has been around for a while, but it seems there’s a recent cresting of papers in the area, focusing in particular on the three canonical high altitude peoples, the Tibetans, Andeans, and the Ethiopians. Last spring two major groups came out with papers on the genetics of Tibetan altitude adaptation, and its evolutionary history, using somewhat different techniques. A new paper in PLoS Genetics builds upon that work (verifying two of the loci as targets of selection in Tibetans implicated in the previous papers), and adds Andean populations to the mix to assess the possibilities of convergent adaptations. Identifying Signatures of Natural Selection in Tibetan and Andean Populations Using Dense Genome Scan Data:

High-altitude hypoxia is caused by decreased barometric pressure at high altitude, and results in severe physiological stress to the human body. Three human populations have resided at high altitude for millennia including Andeans on the Andean Altiplano, Tibetans on the Himalayan plateau, and Ethiopian highlanders on the Semian Plateau. Each of these populations exhibits a unique suite of physiological changes to the decreased oxygen available at altitude. However, we are just beginning to understand the genetic changes responsible for the observed physiology. The aim of the current study was to identify gene regions that may be involved in adaptation to high altitude in both Andeans and Tibetans. Genomic regions showing evidence of recent positive selection were identified in these two high-altitude human groups separately. We found compelling evidence of positive selection in HIF pathway genes, in the globin cluster located on chromosome 11, and in several chromosomal regions for Andeans and Tibetans. Our results suggest that key HIF regulatory and targeted genes are responsible for adaptation to altitude and implicate several distinct chromosomal regions. The candidate genes and gene regions identified in Andeans and Tibetans are largely distinct from one another. However, one HIF pathway gene, EGLN1, shows evidence of directional selection in both high-altitude populations.

In this paper the authors looked at around 50 Andeans (Quechua and Aymara speakers) and 50 Tibetans, and compared them to various outgroups. In addition to the European and Asian HapMap populations they also looked at some Amerindian populations. The map below shows the geographical scope of their sampling (the right inset are the Amerindian lowland groups):


The ancestral relationships of the two highland groups sampled in relation to the lowlanders were relatively straightforward. Panel A and B show PCA plots for the Andeans and Tibetans, while C and D show frappe bar plots. The only things notable for me are that the Quechua speakers seem to show residual European ancestry which the Aymara do not, and that the Colombian indigenous groups seem to have more affinity with Mesoamerican populations than with the other South American samples. I can give no insight as to the latter. As for the former, if it is not just a quirk of non-representativeness one may be seeing the effect of a higher number of Spanish men marrying into the nobility of the Quechua-speaking highlands than further south in the lands of the Aymara (though Potosi was in Bolivia, so this may not be plausible).

We already have some evolutionary expectations of how these groups came to have these adaptations to their high altitude environments. It seems that the physiological processes for the three groups are somewhat different, and this has been a source of curiosity for geneticists for a long time. It stands to reason that if the physiology is somewhat varied, the genetics should be too, and that seems to be a broadly correct assumption. In this paper they took two general approaches, looking at the total genome, and focusing on specific candidate regions. From what I can tell they did not find much that was novel using the first technique, but they did clarify the relationship between Tibetans and Andeans in terms of their genetic adaptations a bit by looking at specific genes. As noted in the author summary it looks as if the two populations do have somewhat different genetic architectures. Many of the genes which seem to have been targets of selection do not overlap, and for those that do there seem to be different localized selection events, so that the haplotypes being driven by positive selection differ.

They used a compound of techniques to detect possible regions of natural selection:

- locus specific branch length (LSBL)

- the log of ratio of heterozygosities (lnRH)

- a modified Tajima’s D statistic

- whole genome long range haplotype (WGLRH)

LSBL is an elaboration on Fst, so it is finding between-population differences in allele frequency. Recall that at any given locus you don’t expect much between-population difference, so if there is a great deal of ecological adaptation you may see a lot of variance as a function of geography. Heterozygosity is simply a measure of the fraction of loci where the two gene copies are in different states. It’s just a way to measure genetic variation (though there are others). The Tajima’s D statistic is a test for whether the locus seems to deviate from neutral expectations. A significant deviation means that there may have been a bottleneck, a selective sweep, or balancing selection. Finally, the last test looks for sets of correlated markers within the genome. If there is a long haplotype, a sequence of markers, at high frequency then it may be that you’re witnessing a genomic region which is in the middle of, or has just undergone, a selective sweep.
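The first two statistics are simple enough to sketch in a few lines. What follows is a minimal illustration, not the paper's pipeline: LSBL is computed from three pairwise Fst values as the branch leading to one population, and lnRH as the log ratio of mean expected heterozygosity in a window of SNPs. The function names are made up, and the standardization against the genome-wide distribution that a real scan would apply is left out.

```python
import math
from typing import Sequence


def lsbl(fst_ab: float, fst_ac: float, fst_bc: float) -> float:
    """Locus-specific branch length for population A.

    Treats the three pairwise Fst values as distances on a three-population
    tree and returns the length of the branch leading to A.
    """
    return (fst_ab + fst_ac - fst_bc) / 2.0


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity at a biallelic SNP with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def ln_rh(freqs_pop1: Sequence[float], freqs_pop2: Sequence[float]) -> float:
    """Log ratio of mean heterozygosity across the same SNPs in two populations.

    Strongly negative values mean population 1 has lost variation in the
    window relative to population 2 (unstandardized, for illustration).
    """
    h1 = sum(expected_heterozygosity(p) for p in freqs_pop1) / len(freqs_pop1)
    h2 = sum(expected_heterozygosity(p) for p in freqs_pop2) / len(freqs_pop2)
    if h1 <= 0.0 or h2 <= 0.0:
        raise ValueError("window has no variation in at least one population")
    return math.log(h1 / h2)


# Hypothetical numbers: A is strongly differentiated from B and C,
# while B and C are close to each other, so A's branch is long.
print(lsbl(fst_ab=0.40, fst_ac=0.50, fst_bc=0.10))

# Hypothetical window where population 1 has lost most of its variation.
print(ln_rh([0.05, 0.10, 0.02], [0.50, 0.40, 0.30]))
```

A long branch for the highland population combined with a strongly negative lnRH in the same region is the sort of agreement across tests that the next paragraph argues for.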

Why four different tests? Because one given test is not dispositive of natural selection. As noted with Tajima’s D, there are demographic processes of a stochastic nature which can produce false positives, so it is best not to live or die by one technique alone.
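For reference, here is a minimal sketch of the classical Tajima’s D, computed from a small set of haplotypes coded as 0 (ancestral) and 1 (derived). This is the standard textbook statistic; the paper used a modified version, which is not reproduced here, and the toy haplotypes are invented for illustration.

```python
import math
from typing import Sequence


def tajimas_d(haplotypes: Sequence[Sequence[int]]) -> float:
    """Classical Tajima's D for biallelic sites coded 0/1.

    Compares pi (mean pairwise differences) with Watterson's estimate
    S / a1, scaled by the standard deviation expected under neutrality.
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sampled haplotypes")
    sites = list(zip(*haplotypes))
    # Keep only sites where both alleles are present in the sample.
    seg_sites = [s for s in sites if 0 < sum(s) < n]
    num_seg = len(seg_sites)
    if num_seg == 0:
        return math.nan  # D is undefined without segregating sites

    n_pairs = n * (n - 1) / 2
    # A site with k derived copies contributes k * (n - k) differing pairs.
    pi = sum(sum(s) * (n - sum(s)) for s in seg_sites) / n_pairs

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * num_seg + e2 * num_seg * (num_seg - 1)
    return (pi - num_seg / a1) / math.sqrt(variance)


# Toy data: every variant is a singleton (excess of rare alleles).
rare_variants = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Toy data: two sites at intermediate frequency (two deep lineages).
balanced_variants = [[1, 1], [1, 1], [0, 0], [0, 0]]

print(tajimas_d(rare_variants))      # negative
print(tajimas_d(balanced_variants))  # positive
```

The sign alone shows the problem the paragraph above describes: an excess of rare variants gives a negative D whether it comes from a sweep or from population expansion after a bottleneck, so the statistic by itself cannot separate selection from demography.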

Here is figure 4, which shows the differences in allele frequencies on the EGLN1 gene:


We’ve seen EGLN1 before. In the figure above the left panels show the Andean derived SNPs, and the right panels the Tibetan ones. Note the differences in frequency in A and B. The red denotes statistically significant values for a statistic in panels C & D. Both Andeans and Tibetans show indications of selection, but the details in the patterns vary when you zoom in on the gene. The very last panel has an arrow which points to the SNPs in each population where the between population variance is maximized. Interestingly the ancestral allele seems to have risen in frequency here in the high altitude populations, as black denotes ancestral and red derived in the first and last panels.

Let me jump to their conclusion:

In summary, we performed a genome scan on high- and low-altitude human populations to identify selection-nominated candidate genes and gene regions in two long-resident high-altitude populations, Andeans and Tibetans. Several chromosomal regions show evidence of positive directional selection. These regions are unique to either Andeans or Tibetans, suggesting a lack of evolutionary convergence between these two highland populations. However, evidence of convergent evolution between Andeans and Tibetans is suggested based on the signal detected for the HIF regulatory gene EGLN1. In addition to EGLN1, a second HIF regulatory gene, EPAS1, as well as two HIF targeted genes, PRKAA1 and NOS2A, have been identified as selection-nominated candidate genes in Tibetans (EPAS1) or Andeans (PRKAA1, NOS2A). PRKAA1 and NOS2A play major roles in physiological processes essential to human reproductive success…Thus, in addition to demonstrating the likely targets of natural selection and the operation of evolutionary processes, genome studies also have the clear potential for elucidating key pathways responsible for major causes of human morbidity and mortality. Based on the findings of this study, it will be important to confirm the results with genotype-phenotype association studies that link genotype to a specific high-altitude phenotype.

I wanted to show the alphabet soup of genes in case you’re a geneticist with an interest in any of these loci. I’ve seen these before in previous papers; I assume the key that got this published in PLoS Genetics is the deep comparative dimension, as the researchers explored the lack or existence of evolutionary convergence between these two populations. Should the finding be surprising? I don’t think so. High altitudes are extreme environments, and the literature is filled with references to problems which emerge even in these populations because of the nature of their adaptations. There are likely deleterious side effects, especially if one of last spring’s papers on Tibetans is correct and they’re relatively recent settlers of the highlands. But you never know until you play the game, so it is good to confirm.

A further exploration of the genetic architecture and nature of adaptations, especially when the research is extended to Ethiopians, may give us a further window into contingency in evolutionary history. These three occurrences are basically three independent experiments. In this paper they indicate that some of the variants being subject to natural selection may have been in the ancestral population, so standing variation. Others are new mutations, unique and novel. Though there are different pathways to the final expression of the phenotype, which in the details of implementation (physiology) still differ across the groups, there are also genes which in this comparison seem to be implicated in both Tibetans and Andeans as having been subject to selection. How constrained is the sample space subject to possible selection and the implied G-matrix? How contingent are the evolutionary pathways that different populations take to attain the state of adaptive fitness in similar ecologies? These are the sort of long term questions which I think will be possibly answered as the tentative silver age of altitude adaptation gives way to the golden age.

Citation: Bigham A, Bauchet M, Pinto D, Mao X, & Akey JM (2010). Identifying Signatures of Natural Selection in Tibetan and Andean Populations Using Dense Genome Scan Data PLoS Genetics

Image Credit: Micah MacAllen

Note: I am aware that classically the silver age follows the golden age, instead of precedes it. But we live in Whiggish times indeed!
