Razib Khan One-stop-shopping for all of my content

October 17, 2018

The Insight Show Notes — Season 2, Episode 4: Finnish Genetics

Filed under: science — Razib Khan @ 11:02 am

Midsummer in Finland

This week on The Insight (Apple Podcasts, Stitcher and Google Play) we discussed the prehistory and genetics of the Uralic peoples, with a particular focus on the people of Finland, who are among their most numerous exemplars.

We mentioned that the Uralic languages have a northern distribution, extending from north-central Siberia to northern Europe.

See for yourself:

We mention two recent papers of interest:

We discussed the past 20 years of debate on the origin of the TAT-C/N1c Y chromosomal haplogroup. This male lineage is found at high frequencies all across the northern fringe of Eurasia, and in particular among Uralic populations.

Here is an early paper on the topic: Genetic relationships of Asians and Northern Europeans, revealed by Y-chromosomal DNA analysis. If you want to know the origin of the name “TAT-C”, listen to the podcast! Spencer tells you.

There was a lot of discussion about Uralic culture, for example the Kalevala and blood sausage. The eastern Baltic was also one region to which farmers from Anatolia never migrated. See The genetic prehistory of the Baltic Sea region.

Interested in learning where your ancestors came from? Check out Regional Ancestry by Insitome to discover various regional migration stories and more!


The Insight Show Notes — Season 2, Episode 4: Finnish Genetics was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

The expansion of the polar people

Filed under: Finland,History,science — Razib Khan @ 10:58 am

Sami in the far north of Europe

Since the development of agriculture 12,000 years ago, the cultural and genetic landscape of our world has been transformed by the emergence of peasants as the dominant demographic. For most of recorded history, the average human was a peasant: a laboring tiller of the soil.

There were of course exceptions. Some peoples took up pastoralism. Others, such as fishermen, specialized in extracting resources from the sea. And of course, there were hunter-gatherers who continued to practice a lifestyle as old as the human race itself.

Muskox in the Taimyr Peninsula

Though we often think of hunter-gatherers in a tropical context, the reality is that some of the most successful practitioners of this lifestyle have flourished in and around the Arctic. Not only have they flourished, but they have vastly expanded! For instance, the Thule culture of North America famously replaced the Norse agriculturalists of Greenland in the 15th century.

But perhaps the most spectacular expansion of non-agriculturalists in the north has been that of the Uralic peoples. A paper titled “Genes reveal traces of common recent demographic history for most of the Uralic-speaking populations” has an excellent map which illustrates the geographic span of this language family:

Citation: Tambets, Kristiina, et al. “Genes reveal traces of common recent demographic history for most of the Uralic-speaking populations.” Genome biology 19.1 (2018): 139.

Over twenty years ago researchers noted that one particular Y haplogroup lineage, N1c, was very common among Uralic peoples. Notice the overlap in distribution between this lineage and the Uralic populations below.

Distribution of N1c

The question then emerges: did the Uralic peoples come from the east, into northern Europe, or were they indigenous to northern Europe and expanded eastward? Examining patterns of genetic diversity indicates that this Y chromosomal lineage emerged in Siberia and later spread to northern Europe. Why? Because diversity accumulates in regions where the lineage has been present the longest.
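The diversity argument can be made concrete with a small calculation. The sketch below computes Nei's gene diversity for two sets of sampled sub-lineages; the labels and counts are invented purely for illustration and do not come from any of the papers cited here.

```python
from collections import Counter

def gene_diversity(haplotypes):
    """Nei's unbiased gene diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)

# Hypothetical samples: many sub-lineages where the lineage is old,
# mostly one sub-lineage where it arrived recently.
source_region = ["h1", "h2", "h3", "h4", "h5", "h1", "h2", "h6"]
recent_region = ["h1", "h1", "h1", "h1", "h2", "h1", "h1", "h1"]

d_source = gene_diversity(source_region)
d_recent = gene_diversity(recent_region)
print(f"source region: {d_source:.3f}, recently settled region: {d_recent:.3f}")
```

Under these made-up counts the long-occupied region scores much higher, which is the pattern used to argue for a place of origin.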

Citation: Lamnidis, Thiseas Christos, et al. “Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe.” bioRxiv (2018): 285437.

New research from ancient DNA has clarified the timing of the arrival of these Siberians: see Ancient Fennoscandian genomes reveal origin and spread of Siberian ancestry in Europe.

What we do know from modern genetic variation is that the Uralic people, including the Finns, seem to have recent Siberian affinities. In contrast, most other Northern Europeans lack these affinities, which makes the Uralic peoples all the more distinct. This Siberian affinity is strongest in the Sami hunter-gatherers of the far north.

Samples from a population in the Kola Peninsula of northern Russia dating to about 3,500 years ago yielded individuals who were even more Siberian than the Sami, as you can see in the admixture plot to the left. In particular, the Siberian ancestry of the Finnic people seems to be similar to that of the Nganasan people of the Taimyr Peninsula in Russia.

Looking at patterns within the genomes of these ancient people, researchers have concluded that they are the product of mixing between Siberians and indigenous European hunter-gatherers, which began to occur ~4,000 years ago. This aligns with other work that suggests that the Comb Ceramic culture, the dominant Mesolithic hunter-gatherer society of northeast Europe before the expansion of agriculture, lacked Siberian ancestry.

Nenet Samoyed people

Where does this leave us? If we use genetics as a guide, it seems that around 4,000 years ago a migration of Arctic hunter-gatherers swept out of the northern fringe of Siberia to the west. These people were likely related to the easternmost of modern Uralic peoples: the Samoyed tribes. The Y chromosomes of western Uralic peoples, such as the Sami and Finns, carry the hallmarks of ancestry similar to the Samoyeds. But their mitochondrial lineages are almost wholly similar to those of their European neighbors. Therefore, it seems that the spread of Uralic languages westward was due to the migration of males.

One of the implications of these conclusions is that the Uralic languages may have arrived in the Baltic after the Indo-European languages! In much of Estonia and southern Finland, the Corded Ware culture, presumed to be associated with Indo-Europeans, predates 2000 BC by centuries.

Though we often imagine that history and culture move in a singular direction, toward agriculture, the Uralic people may be an exception. If it is correct that hunter-gatherer Siberian men moved into large areas of northeastern Europe, and culturally assimilated more numerous peoples, some of whom were agriculturalists, it may indicate that the trajectory of history is more winding and complex than we may imagine.


The expansion of the polar people was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

October 11, 2018

Why PCA and genetics are a match made in heaven

Filed under: Evolution,Genetics,science — Razib Khan @ 8:13 pm
Insitome customers and selected populations

The image above is not the work of a small child trying to sketch out a B-2 Stealth Bomber. Rather, it is a PCA plot, which shows the distribution of a subset of Insitome’s customers who have purchased the Regional Ancestry Insight — in terms of how they relate to each other genetically.

In green, I have added some British individuals, in red some Africans from Nigeria, and in blue individuals who are ethnically Chinese. The majority of our customers are of Northern European heritage, but a substantial minority are African-American or Asian-American and various mixes therein.

So why do we use Principal Component Analysis (PCA) in the first place? And how does it manage to match our intuitions about relatedness through abstruse mathematical formulae?

Why we use PCA in genetics

Real genetic variation…a little bit

Consider this slice of diversity to the left. Six individuals, top to bottom, genotyped on a small number of genetic positions, left to right. You should recognize the letters, as they are the DNA bases A, C, G, and T. You can see that there are variations between the positions across individuals. Now imagine attempting to gain insight from looking at thousands of individuals (rows) across hundreds of thousands of markers (columns).

Raw genetic data is basically just a huge text file. When you are concerned with the variation on a single position, you can view the results for individuals or populations in a table and expect most people to immediately understand the implications. Europeans who are lactose tolerant have a variant on a particular marker. If you are TT or CT you can digest milk sugar, lactose, as an adult. If you are CC, you can’t. There are only a few things to keep track of: the person, and their genotype.

Summarizing variation on a single marker, a single variable, isn’t necessary because the human mind can process all that information directly. In contrast, lots of simultaneous variables are impossible to understand just by looking at a table. PCA is just one of many excellent ways to extract signal from the noise.

The plot to the left was generated from ~30,000 markers on a few hundred individuals from eight populations. This is not a large dataset today. The time it took to run the function which generated the raw PCA result output was the period between me pressing “enter” on the keyboard and me looking at the computer screen.

And yet, despite the modesty of this dataset, can you imagine me looking at 30,000 variables across 200 samples and obtaining any understanding? Perhaps if I devoted my life to the project!

What about the math?

Mathematically, PCA takes the voluminous raw data, which is totally incomprehensible to the human mind, and summarizes it into a set of independent equations, which is what makes it so essential to the analytical toolkit. The data is actually a “matrix.” PCA transforms it with a series of distinct equations which together can define the total variation of the underlying data.

A matrix of genotypes
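To see what “the data is a matrix” means in practice, here is a minimal sketch that turns genotype calls into a numeric matrix. The genotypes and the choice of reference allele at each marker are hypothetical, made up only for this example, and counting non-reference alleles is one common encoding rather than the only one.

```python
import numpy as np

# Hypothetical genotypes: rows are individuals, columns are markers.
genotypes = [
    ["CC", "AG", "TT"],
    ["CT", "AA", "TT"],
    ["TT", "AG", "GT"],
    ["CT", "GG", "GG"],
]
# Assumed reference allele for each marker (an arbitrary illustrative choice).
reference = ["C", "A", "T"]

def to_allele_counts(genotypes, reference):
    """Count non-reference alleles (0, 1 or 2) for every genotype call."""
    return np.array(
        [
            [sum(allele != ref for allele in call) for call, ref in zip(row, reference)]
            for row in genotypes
        ],
        dtype=float,
    )

matrix = to_allele_counts(genotypes, reference)
# PCA works on variation around the average, so each column is centered.
centered = matrix - matrix.mean(axis=0)
print(matrix)
```

Each row is now a person and each column a marker, which is the form PCA operates on.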

These equations, or more properly dimensions, are arrayed in order of proportion of variation in the data explained. On a conventional PCA plot, you see the first two dimensions, which explain the largest and second largest proportion of the variation, as the x and y-axes. But there are many more dimensions you can break the data apart by, though quite often for genetic analysis the largest ones are sufficient to smoke out the population structure that you are interested in. The values of individuals in each dimension that drops out of the data can then be placed onto a coordinate system, which is much easier to digest than a table of raw variation.
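The ordering of dimensions by variation explained can be sketched in a few lines. This example runs PCA through a singular value decomposition on simulated data for two populations; the population sizes, marker count, and allele frequencies are all assumptions chosen for illustration, not the Insitome dataset or pipeline.

```python
import numpy as np

def pca(matrix, n_components=2):
    """Return per-individual coordinates and the proportion of variance explained."""
    centered = matrix - matrix.mean(axis=0)
    # Singular values come back sorted from largest to smallest,
    # so the first columns are the dimensions explaining the most variation.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    coords = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return coords, explained[:n_components]

# Simulated data: two hypothetical populations, 20 individuals each, 500 markers.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=500)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, size=500), 0.05, 0.95)
pop_a = rng.binomial(2, freq_a, size=(20, 500))
pop_b = rng.binomial(2, freq_b, size=(20, 500))
matrix = np.vstack([pop_a, pop_b]).astype(float)

coords, explained = pca(matrix)
print("proportion of variance explained:", explained)
```

Plotting the two columns of `coords` against each other gives the conventional PCA plot, with each simulated population forming its own cluster along the first axis.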

The branching of human populations

But how can a mathematical framework make biological variation comprehensible through maps so well — especially with regards to genetic differences between populations? The answer to this is straightforward: human evolutionary history has a pattern, and that pattern leaves its stamp on the genome. PCA is just a pattern extraction method.

The raw material of variation are mutations, and the pattern of mutations in any human genome is defined by a pedigree back to common ancestors. People who tend to share common ancestors share mutations — and mutations are the raw material for the genetic variation that PCA summarizes.

When used in evolutionary genetics, PCA should ideally recapitulate the phylogenetic tree. Assuming that sample sizes are balanced, the first principal component of variation in worldwide human datasets is invariably a dimension that separates Africans from non-Africans.

Why? Because this is the earliest separation between large lineages, and so the two lineages have had the most time to accumulate distinct and unique mutations. The second dimension is usually one that defines the difference between people from the eastern portion of Eurasia and those from the western portion of Eurasia. Again, this is an important phylogenetic distinction because these two groups seem to have diverged soon after their ancestors left Africa.

And so on. PCA is not the only way to visualize the data. If you ran a computer program that counted up raw similarities and differences between individuals at each genetic position, you would notice that some individuals are more similar to each other than to others, and some groups more similar to other groups, and this too would reflect the phylogenetic history. If you had more time and wanted to dig deeper, you could construct various models of population history, and see how well the data fit those models.

PCA is not the only way to understand genetic variation, and PCA itself is not the genetic variation but a way to represent it. But it is a fast method that starts with few assumptions and lends itself to easy graphical representation. It’s no coincidence that it remains popular to this day.
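The “count up raw similarities and differences” approach might look something like the sketch below. The three individuals and their genotype calls are hypothetical, and the distance used here (average number of mismatched alleles per marker) is just one simple choice among many.

```python
from collections import Counter
from itertools import combinations

# Hypothetical unphased genotype calls at four markers.
samples = {
    "ind1": ["CC", "AG", "TT", "GG"],
    "ind2": ["CC", "AG", "TT", "GA"],
    "ind3": ["TT", "AA", "GG", "AA"],
}

def allele_mismatches(call_a, call_b):
    """Number of alleles (0, 1 or 2) not shared between two unphased calls."""
    shared = sum((Counter(call_a) & Counter(call_b)).values())
    return 2 - shared

def pairwise_distance(calls_a, calls_b):
    """Average per-marker allele mismatch between two individuals."""
    total = sum(allele_mismatches(a, b) for a, b in zip(calls_a, calls_b))
    return total / len(calls_a)

distances = {
    (a, b): pairwise_distance(samples[a], samples[b])
    for a, b in combinations(samples, 2)
}
for pair, dist in distances.items():
    print(pair, dist)
```

In this toy example ind1 and ind2 are close to each other and both are far from ind3, the kind of grouping that would also show up as clusters on a PCA plot.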


Why PCA and genetics are a match made in heaven was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

September 26, 2018

India is eternal but Indians are not

Filed under: History,India,science — Razib Khan @ 10:18 pm

This week’s episode of The Insight dug deeply into the current scientific understanding of the genetic origins of the peoples of the Indian subcontinent. Recent publications and media coverage have caught the science in midstream, as scholars have to deal with the clamor for new information in the face of the need to be careful and cautious when presenting new results.

Steppe Chariot

The show notes linked extensively to the scientific literature which documents the interface between cutting-edge genomics, modern population genetics and computation, and finally the abstruse lab science of ancient DNA. Or, just go to the preprint, The Genomic Formation of South and Central Asia.

The general outline of what we know so far is straightforward. Over the past 10,000 years, the Indian subcontinent has been a great vortex, sucking in peoples from various corners of Eurasia. The overwhelming proportion of the ancestry of any given person in the Indian subcontinent, from Punjab to Tamil Nadu, from the Arabian Sea to the Bay of Bengal, binds together the heritage of three peoples. First, the longstanding residents of South Asia who were descended from the original migrants out of Africa. Second, farmers and pastoralists from the hills of western Iran. And finally, Indo-Aryan peoples who arrived in chariots and drove their cattle before them.

Meenakshi Temple, South India

As noted on the podcast, the slippery and sometimes sloppy usage of labels can mislead as much as illuminate. The term “Indian” can refer to many things, whether it’s a geographic landmass or a people. More esoteric but still widely used terms such as “Indo-Aryan” are properly linguistic, but they have gained ethnic connotations. A shorthand that communicates, and sometimes distorts.

In some of the scholarly literature, and on the podcast, you may hear terms such as “Iranian farmer” without context. By this, we do not mean the farmers of modern Iran, but the people nearly 10,000 years ago who lived in what became Iran, and began to herd goats and grow wheat. These people then migrated eastward, eventually to India. Of the great farming cultures of the Middle East that arose with agriculture, these were the easternmost extension.

Obviously, the same caveat applies to the “steppe ancestry”, which is associated with likely Indo-European peoples, from the early Yamnaya to the successor Corded Ware, Andronovo and Sintashta cultures. The fact is that there were different peoples on the steppe before these cultures arose, and there were, and are, people on the steppe after they left the stage of history. But, in the context of Indian history what we mean by “steppe ancestry” are these particular cultures, and the genetic imprint we see on the steppe between the Volga and the Aral Sea, and later among the peoples of India after 2000 BC. The term is not generic, but specific.

Indra atop his mount, an elephant

The latest genetic work aligns with earlier theories that the Indo-Aryans arrived in India after the decline of the Indus Valley Civilization. All signs point to their connection to peoples on the Eurasian steppe, whose origins are themselves a melange of West Asian, European and Siberian. This has led some commentators to suggest that the Indo-Aryans were “alien invaders.”

In sharp contrast, Indian nationalists have long been keen to point out that the earliest texts written down from the oral epics of the Indian Aryans do not seem to record a memory of a land outside of South Asia. In the Vedas, the oldest of the memories of the Indo-Aryan tribes, the Thunder God Indra sits atop an elephant, an Indian beast if there ever was one.

Though the origin of the Indo-Aryans was likely outside of the subcontinent, it is important to remember that their cultural and historical identity as we understand it today seems to have been forged in the Indian subcontinent. The Vedas themselves bear the imprint of non-Aryan words, indicating that by the time the warlike and pastoralist tribes began to fashion the seminal epics which defined their identity, they had already become of the soil of the subcontinent in a deep sense.

Diversification of the Dravidian languages 4,500 years ago

One of the major dichotomies in the prehistory of South Asia on the edge of history, from the arrival of Alexander the Great in the north to the Sangam period flourishing in the south, is between Indo-Aryan and Dravidian. Often, the Indo-Aryans are posited to be newcomers, while the Dravidians are aboriginals. But new research in linguistics and archaeology is pointing to the conclusion that Dravidian languages themselves diversified in the period after 2,500 BC. In other words, not very much earlier than when the Indo-Aryans arrived in the subcontinent.

Though the Dravidian populations of the south often lack the ancestry from the Eurasian steppe, so common among Brahmins, in particular, they invariably show signs of being descended from the ancient Iranian farmers. Like Indo-Aryan speaking peoples, the Dravidians are themselves likely a fusion of newcomers from the north and west, and indigenous hunter-gatherers. The linguistic evidence, along with the start of the South Indian Neolithic in 2,500 BC, indicates that Dravidian-speaking peoples blazed the path for the Indo-Aryans that came after them.

What genetics has told us over the past generation is that most of the world’s populations are mixes between very different groups of people. 10,000 years ago no one lived in the world who looked much like modern Indians. Or Northern Europeans. Or, likely, Southeast Asians. And so on.

Underneath all the statistics, the new science and old history, the final truth is that in the game of precedence and indigeneity, no one really comes out ahead. It’s been a long and complicated dance between many different peoples, and everyone’s ancestry leads to both outsiders and insiders.


India is eternal but Indians are not was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

The Insight Show Notes — Season 2, Episode 3: ANI, ASI, IVC and The Genetics of India

Filed under: Genetics,History,India,science — Razib Khan @ 3:49 pm
A scene from an ancient Indian epic

This week on The Insight (Apple Podcasts, Stitcher and Google Play) we discussed how the genetics of 25% of the world’s population, the people of South Asia, came to be. It’s a journey of thousands of years.

We cited the preprint, The Genomic Formation of South and Central Asia.

Additionally, we cite a chapter in David Reich’s Who We Are and How We Got Here, where he discusses the genetics of India, and how it’s analogous to Europe.

A cover story from India Today, 4500-year-old DNA from Rakhigarhi reveals evidence that will unsettle Hindutva nationalists, was also referenced. Please read with caution! The research has not been published, and there are likely going to be changes based on new results (actually, probably certainly from what I have heard)….

There was a discussion of some technical, but important, statistical genetic tests to infer admixture. The paper in Genetics, Ancient Admixture in Human History, outlines these methods in detail. The three and four population tests, as well as LD decay estimates of admixture time, are all discussed in this paper. All are alluded to or discussed in the podcast.
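For readers who want a feel for the three and four population tests, here is a rough sketch of the underlying f3 and f4 statistics. It uses naive averages over made-up allele frequencies and deliberately leaves out the finite-sample corrections and significance testing that the paper describes, so treat it as an illustration of the idea rather than the published method.

```python
def f3(target, source1, source2):
    """Naive f3(target; source1, source2): mean of (c - a) * (c - b) over markers."""
    terms = [(c - a) * (c - b) for c, a, b in zip(target, source1, source2)]
    return sum(terms) / len(terms)

def f4(pop_a, pop_b, pop_c, pop_d):
    """Naive f4(A, B; C, D): mean of (a - b) * (c - d) over markers."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pop_a, pop_b, pop_c, pop_d)]
    return sum(terms) / len(terms)

# Hypothetical allele frequencies at five markers in two source populations.
pop_a = [0.10, 0.80, 0.30, 0.90, 0.20]
pop_b = [0.70, 0.20, 0.90, 0.30, 0.80]
# A hypothetical population formed as an even mix of the two sources.
mixed = [0.5 * a + 0.5 * b for a, b in zip(pop_a, pop_b)]

f3_mixed = f3(mixed, pop_a, pop_b)
print(f"f3(mixed; A, B) = {f3_mixed:.3f}")
```

A negative f3 for the target, as in this toy case, is the signal that the target population is a mixture of groups related to the two sources.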

Linguistic families in South Asia

There was extensive discussion of the various language families in India, in particular, Indo-Aryan, Dravidian, and Munda. We discussed the results of a recent paper, A Bayesian phylogenetic study of the Dravidian language family, which indicates a recent expansion of this language family in South Asia. Also, a new preprint on Munda, The genetic legacy of continental scale admixture in Indian Austroasiatic speakers, suggests that the Munda emerged around the same time as the Dravidians.

A lot of ethnographic terms were thrown around without deeper exploration. If you want to follow up, look into the Elamites of ancient Iran, the Indo-European Sintashta culture, and the Bactria-Margiana culture of Central Asia.

We talked about ANI and ASI. The 2009 paper, Reconstructing Indian Population History, introduced these terms and constructs. The Kalash and Pulayar people of Pakistan and southern India respectively were mentioned as modern-day exemplars of ANI and ASI.

Distribution of R1a1a

The distribution of R1a1a in India and Eastern Europe was also discussed, and how it is associated with expansions from the steppe. Also, caste and its antiquity were discussed, in particular, that modern boundaries between groups seem to have emerged around 2,000 years ago, after several thousand years of admixture between disparate Indian groups. The promise of disease gene discovery in South Asia is a preprint that explores the relevance of this endogamy today for health risks.

Linguistic isolates Burusho and Nihali were mentioned. And, the development of the “Yankee” identity, which Razib analogized to Indo-Aryans!


The Insight Show Notes — Season 2, Episode 3: ANI, ASI, IVC and The Genetics of India was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

September 19, 2018

The Insight Show Notes — Season 2, Episode 2: The Greatest Human Journey

Filed under: Genetics,hawaii,Podcast,science — Razib Khan @ 8:10 pm

This week on The Insight (Apple Podcasts, Stitcher and Google Play) we touched upon arguably one of the greatest journeys of humankind, the expansion of the Polynesians across the Pacific.

Bishop Museum

Spencer discussed his visit to the Bishop Museum in Hawaii.

We discussed broadly the interesting confluence of biology, geology, and history one can see in Hawaii. The book The Monkey’s Voyage: How Improbable Journeys Shaped the History of Life discusses the biogeographic characteristics of many islands, including Hawaii.

We discussed the context of Polynesian languages and culture as part of the broader zone of Austronesian language and culture.

The extent of Austronesian languages

Austronesian societies spread over the last 6,000 years from Taiwan to the far west in Madagascar, and far east in Easter Island. The expansion into Polynesia was prefigured by the expansion of the Lapita culture between 1500 BC and 500 BC.

The Lapita culture is defined by its unique pottery. But curiously the usage of pottery disappeared among the Polynesians, the likely later descendants of the Lapita people. Razib mentioned how there is some evidence that cultural bottlenecks and small populations can result in loss of skills such as pottery.

On the other hand, Spencer pointed out that the Polynesians also did not practice rice agriculture, unlike other Austronesian societies. Instead, they expanded with a cultural toolkit of taro, which likely was adopted from the peoples of Near Oceania, New Guinea, and Melanesia.

Sweet Potato

Additionally, Spencer brought up the fact that the cultivation of sweet potatoes in Polynesia likely indicates contact between Polynesians and the peoples of South America. The genomic evidence that Polynesian sweet potatoes derive from South American ones is conflicted. Spencer mentioned that the word for “sweet potato” in Quechua, the language of highland Peru, is kumar. In Hawaiian, it is ku ala.

We mentioned in passing Thor Heyerdahl’s view that there was a South American migration to Polynesia. But the genetic, cultural, and archaeological evidence does not support this.

The Polynesian mtDNA motif was mentioned. With a high frequency in Polynesia, the mtDNA lineage seems to have spread from the west, in line with the idea of a migration to the east. In contrast, the Polynesian Y chromosomes show a mix of Asian and Melanesian heritage.

Much of the debate hinges on whether the expansion of Austronesians into the Pacific was via the “slow boat” or “express train” model. The slow boat model suggests widespread and gradual cultural and genetic mixture as the Austronesian expansion moved through Melanesia. The express train model implies a more rapid migration with far less interaction. Culturally the adoption of taro cultivation aligns with the slow boat thesis. As does the existence of Melanesian Y chromosomes across the range of Polynesians. But the overwhelming Asian nature of Polynesian mtDNA lineages fits the express train model.

One way that scholars have reconciled this is that there was a slow expansion of the Lapita people, but that they only assimilated Papuan and Melanesian men into their matrilineal communities. This broad framework was reinforced with the publication of genetic results from native Hawaiians, which showed a minority ancestry from a Papuan-like population.

But wait, there was a twist! Ancient DNA now shows that the Lapita people had almost no admixture with Melanesian people! Follow-up results from Vanuatu and Tonga confirm that the Lapita people had no admixture from Melanesians. Rather, in Vanuatu 2,500 years ago the Lapita people are replaced by an almost entirely Melanesian population, and the Melanesian ancestry begins to show up in Polynesians after this period. The conclusion then is there were multiple migrations into Polynesia!

Spencer and I concluded that the broad sketch is now established, but a lot of complicated details need to be worked out. Instead of express trains or slow boats, some researchers now wonder if Polynesia was more like a subway network.


The Insight Show Notes — Season 2, Episode 2: The Greatest Human Journey was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

Hawaii: a complicated journey to paradise

Filed under: anthropology,Genetics,hawaii,science — Razib Khan @ 7:11 pm
The extent of Austronesian Diaspora

Ask any American what they think when you say the word “Hawaii,” and certain words will no doubt reoccur from person to person. That’s because certain images, feelings, come to mind. A gentle breeze, beaches, and volcanoes. The 50th state has been the byword for paradise on the mainland. A certain sense of Hawaii is part of American popular culture.

But Hawaii is a real place with real people. It isn’t a dreamland. Rather, it is one of the most isolated large islands in the world. Over 2,500 miles from the nearest continent, there is only a single terrestrial mammal native to the islands: predictably, a bat!

Obviously, the island is crawling with mammals today. Nearly 1,000 years ago voyagers from the lands of the western Pacific landed on the Society Islands, which includes famed Tahiti, and then sailed northward to the Hawaiian archipelago. When the ancient Polynesians settled Hawaii they did not arrive alone. They brought with them pigs, chickens, and dogs. Naturally, rats tagged along as unwanted passengers.

Humans arrived in Hawaii in catamarans

But the settlement of Hawaii by humans was the end of a long journey which began thousands of years earlier in the mists of prehistory. Six thousand years ago a small group of stone-age seafarers, who we call Austronesians, journeyed south from Taiwan and settled the northern Philippines.

But they did not stop there. Over a period of thousands of years, these ancient mariners spread out over Southeast Asia, sometimes introducing intensive forms of rice agriculture and their distinctive language. But they did not stop there. For whatever reason, these were a people who wondered what was over the horizon, even if it was the deep blue ocean. They moved on west and east. Over 1,000 years ago their descendants reached the western Indian Ocean, mixing with the Bantu farmers of eastern Africa and occupying the island of Madagascar. In the other direction, Austronesians moved into Oceania, abandoning rice and adopting taro from Melanesians. Less than 1,000 years ago the Pacific expansion finally crested, as Polynesians settled New Zealand, off the coast of Australia, and Easter Island, 2,300 miles west of South America. And of course, they ventured north to Hawaii, an isolated, ecologically rich and unique jewel in the midst of the Pacific.

In Southeast Asia, the Austronesians merged with native populations of farmers which migrated out of southern China earlier. But as they moved west and east they encountered very different populations, whether it be African farmers and pastoralists, on the one hand, or Melanesians in the case of the ancestors of the Polynesians.

Citation: Kim SK, Gignoux CR, Wall JD, Lum-Jones A, Wang H, Haiman CA, et al. (2012) Population Genetic Structure and Origins of Native Hawaiians in the Multiethnic Cohort Study. PLoS ONE 7(11): e47881

And just as the people of Madagascar, despite speaking a language closest to those spoken in Borneo, have blended with nearby populations, Polynesians carry signatures of interactions with the peoples of Near Oceania, which includes New Guinea, Australia, and Melanesian islands in the western Pacific, such as the Solomon Islands and New Caledonia.

As genomics began to illuminate all the relationships between human populations, in 2012 a paper was published that surveyed the genomes of many native Hawaiians. The results were clear: the indigenous peoples of Hawaii had a dominant signature of ancestry shared with mainland Asian peoples, but also a minority component that had more affinities with the peoples of Near Oceania.

Lapita culture sites

This result was relevant to what traditionally had been termed the “express train vs. slow boat” models of the settlement of Polynesia. The “express train” hypothesis implies that the Austronesian Lapita culture rapidly pushed out of maritime Southeast Asia, with minimal interaction with local Papuans and other Melanesians. In contrast, the “slow boat” model meant that the expanding proto-Polynesians mixed with Papuans and Melanesians as they spread eastward more gradually, creating a fused culture which pushed onward into the far Pacific.

The results above, along with maternal and Y chromosomal lineages, seem to support the “slow boat” model. Not only are all Polynesians, including Hawaiians, descended from Southeast Asian farmers, but their ancestors also include the people who first pushed to the edge of the Pacific. These were the ancestors of Oceanians who settled New Guinea, Near Oceania, and Australia more than 40,000 years ago with the first “Out of Africa” migration.

Citation: Skoglund, P., Posth, C., Sirak, K., Spriggs, M., Valentin, F., Bedford, S., … & Fu, Q. (2016). Genomic insights into the peopling of the Southwest Pacific. Nature, 538(7626), 510.

So case closed? Not exactly. Science and history are often more complex than our elegant human imaginings. Over the past few years, the field of ancient DNA has come upon the scene to disturb hypotheses and provoke the development of new ones. Now researchers can see snapshots of the past with much crisper detail than would have been the case in the past.

Two papers have helped reshape our understanding of the peopling of Polynesia. First, a 2016 paper showed that samples of ancient Lapita people don’t show any admixture from Melanesians. This is in accordance with the “express train” model, which the genetic heritage of modern Polynesians presumably refuted!

An immediate solution to this conundrum is that the old models were too simple: there wasn’t just a single migration outward, but rather several, and Melanesian ancestry arrived later, within the last 2,000 years.

A paper published in 2018 added more nuance and clarity to what may have been going on. Today the islands of Vanuatu are considered to be Melanesian and are settled by people of predominantly Oceanian heritage. But ancient DNA from 3,000 years ago yielded individuals of nearly total Asian heritage. By about 2,000 years ago these people had been replaced by the ancestors of modern Melanesians, as later samples show overwhelming Oceanian heritage.

Poke is a melange of flavors and ingredients from the four corners of the world

Where does this leave us? Appropriately, a paper titled “Human Genetics: Busy Subway Networks in Remote Oceania?” was penned as a response to all this uncertainty and confusion. The title says it all, doesn’t it?

These findings may actually be consonant with recent archaeological results that eastern Polynesia and New Zealand were subject to a massive demographic expansion and radiation beginning around 1,000 years ago.

Modern Hawaii is a melange of peoples, reflected in its cuisines, such as Poke, which has been inflected and modified by new ingredients brought by immigrants from the mainland and Asia. And yet perhaps this was always so, as paradise was never as serene and eternal as we may dream in our imaginings. Rather, Hawaii and the Hawaiians were products of daring voyages generation after generation, and the waxing and waning of peoples and cultures, bringing together diverse and disparate threads of the human expansion out of Africa.


Hawaii: complicated a journey to paradise was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

August 28, 2018

The dual engines of modern science

Filed under: science — Razib Khan @ 9:38 am

A few years ago Armand Leroi wrote The Lagoon: How Aristotle Invented Science. Some people immediately made a critique that actually, science, as we understand it, is really the creation of early modern Europe. That Aristotle and his fellow Ancients, or physicians and astronomers of early medieval Islam, or the scholastics of the high Middle Ages, didn’t “really” do “science.”

I think most of us understand where this critique is coming from. But even if you grant the objection, if Aristotle were alive today, would he be a scientist? Of course he would go into science! And he would probably be a good one. Perhaps a great one. Why? Because he had the curiosity and the cognitive skills, and today there is a culture that would allow him to flourish. To me, the biggest difference between early modern Western science, as it emerged in the 17th and 18th centuries, and what came before, is that it was a cultural concert of thinkers, a vast constellation of minds and minions.

In contrast, much of ancient science was driven by singular geniuses.

This brings me to the massive replication effort that just got published in the journal Science:

There are lots of angles to this story. Mostly good. But Jonathan Haidt pointed out how important this makes collaboration and a culture of truth-seeking within the enterprise. Alexandra Elbakyan has stated that her scientific activism is driven by “communist ideals.” And though I dislike Communism, I do think there is something fundamentally communistic about science. In Uncontrolled Jim Manzi points out that within the world of science there are very strong norms about honesty. A major issue with scientific fraud is that scientists are trusting.

But then there is the von Neumann factor: geniuses can accelerate and open up whole landscapes of research. They do a “different kind of science.” It’s less culturally embedded, and less social and incremental. They are the sparks which fly in the darkness.

The moral of the story, if there’s any, is that modern science is a synthesis of these two aspects. There is the “industrial” aspect of scale, efficiency, and incrementalism. One step at a time into the darkness, cautious and continuous.

And then there are the startling breakthroughs. Sometimes those breakthroughs are genius and insight. Consider the story of the emergence of String Theory outlined in Lee Smolin’s The Trouble with Physics. Smolin is a skeptic of String Theory, but in the book, he describes how rapidly it took the scientific world by storm, just by force of its insight and elegance.

Then there are cases such as CRISPR, where several different groups seem to have “stumbled” onto it. The genius here is less in the humans than in what nature had invented. Nevertheless, in a few years, CRISPR radically transformed the possibilities in “genetic engineering.”

Going forward, big collaborative science will keep lumbering on. It will play the role that it has played for decades, driving translation, laying the seedbed for innovation. Normal science. But every now and then a spark will fly, and a new flame will explode. Genius still has a role to play.

August 15, 2018

The Insight Show Notes: Episode 32, So you want to be a geneticist…

Filed under: anthropology,Archaeology,Genetics,science — Razib Khan @ 5:45 pm
Drosophila

This week on The Insight (Apple Podcasts, Stitcher and Google Play) we talk to an “early career” geneticist, Austin Reynolds. A graduate of Indiana University and University of Texas-Austin, he is currently a post-doctoral fellow at University of California-Davis.

Alfred H. Sturtevant in his own “fly lab”

As a field, genetics is officially a bit over a century old, though Gregor Mendel made his key discoveries fifty years before. Since the year 2000 genetics has undergone a revolution driven by sequencing technology and more powerful computing. Around 2010, a different revolution began, which Austin has been a part of, involving the synthesis of archaeology and genetics with the field of ancient DNA.

The first ancient whole-genome analysis, Ancient human genome sequence of an extinct Palaeo-Eskimo. Also, the Neanderthal paper which revolutionized our understanding of our relation to this lineage.

An excellent review of the state of the current research, Ancient Human Genomics: The First Decade. And a preview of the future, Tales of Human Migration, Admixture, and Selection in Africa.

David Reich’s book Who We Are and How We Got Here is a good primer on ancient DNA and population genetics. Highly accessible to the lay audience without sacrificing any of the scientific content.

Loci associated with skin pigmentation identified in African populations.

Nuclear DNA sequences from the Middle Pleistocene Sima de los Huesos hominins.

On career issues, Track the fate of postdocs to help the next generation of scientists.


The Insight Show Notes: Episode 32, So you want to be a geneticist… was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

July 18, 2018

The Insight show notes: Episode 29, The Genetics of China, Han & Beyond

Filed under: China,Genetics,History,science — Razib Khan @ 3:39 pm

This week Razib and Spencer discussed the genetics and history of China on The Insight (iTunes, Stitcher and Google Play).

Chinese history looms large in the podcast, and there are many books one can read on the topic. In particular, John King Fairbank’s China: A New History is one of the most comprehensive treatments. To understand what’s going on in China today it’s probably good to have at least one survey book or course of its past under your belt!

For the purposes of this episode though, you can just check out a list of Chinese dynasties, if you just want a visual outline of the timeframe and period which Razib and Spencer covered in the podcast.

In relation to the genetics alluded to, for genome-wide patterns of relatedness across Chinese regions: Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation. This 2009 paper uses 350,000 markers from 10 provinces to perform exploratory analysis of genetic structure within China.

More recently, A comprehensive map of genetic variation in the world’s largest ethnic group — Han Chinese, is a preprint that utilizes whole-genome sequencing to assemble an even larger dataset.

For maternal mtDNA, Large-Scale mtDNA Screening Reveals a Surprising Matrilineal Complexity in East Asia and Its Implications to the Peopling of the Region. For Y chromosomes on the paternal side, Y Chromosomes of 40% Chinese Descend from Three Neolithic Super-Grandfathers.

To get a sense of how China’s population has grown genetically, see Robust and scalable inference of population history from hundreds of unphased whole-genomes. The figure to the left shows the “Out of Africa” bottleneck, and then demographic expansion in the last 50,000 years. “CHB” represents Chinese sampled in Beijing. Along with “GIH”, who are Gujaratis, and “CEU”, a Northern European American cohort from Utah, the Chinese exhibit explosive growth in the last 10,000 years.

There is extensive discussion of the environment and geography of China, and how it related to agricultural expansion and migration southward. The Retreat of the Elephants by Mark Elvin chronicles this process of the expansion of rice farming into the jungles of southern China through natural history and human geography.

Though most people are aware of the Mongols, fewer are cognizant of the interregnum between the Han and Sui-Tang, when many steppe nomads settled in China, Buddhism took root, and many elite Han lineages migrated from the north to the south. For those curious about this period, China Between Empires: The History of the Northern and Southern Dynasties is an excellent introduction accessible to all.

Finally, there was extensive discussion about the future of Chinese science. For a deeper exploration of that, see A Chinese Province Is Sequencing One Million of Its Residents’ Genomes and China Has Already Gene-Edited 86 People With CRISPR.


The Insight show notes: Episode 29, The Genetics of China, Han & Beyond was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

June 28, 2018

Have your exome sequenced for $29.99

Filed under: DNA,Genetics,sales,science — Razib Khan @ 9:11 pm

Just a reminder that for the rest of June, Helix DNA kits are included with the cost of an Insitome app. Buy Regional Ancestry, Metabolism, or Neanderthal, and start your lifelong DNA journey with Helix for just $29.99! A great gift idea.

That means for the cost of an app Helix will sequence more than 30 million markers in your genome. In contrast, rival genotyping companies only look at 500,000 to 1,000,000 markers.

Cost of use: FREE

The 30 million markers Helix sequences include your whole exome, the part of your genome which is involved in coding for proteins, and so impacts your appearance and function. The Helix system also includes markers outside of the exome to further map your genome more effectively.

Insitome’s apps, whether it be Regional Ancestry, Metabolism, or Neanderthal, are windows into the whole landscape of modern day personal genomics. Once Helix sequences you, the data is banked for later use.

When new apps are developed on the Helix platform, your future purchases will only include the cost of the app! Entering the ecosystem now means that you will never have to pay the initial cost of the sequencing kit.

What are you waiting for? Get the Helix DNA kit and jump into the ecosystem now!


Have your exome sequenced for $29.99 was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

June 27, 2018

Genetics of uniqueness

Filed under: Genetics,indigenous-people,science — Razib Khan @ 1:54 pm
Ati woman from the Philippines
Hui Chinese Muslim man

True genetic isolation is hard to pull off. Human populations tend to mix when they are in close proximity.

Consider the Hui people. These are Muslims who live across China and speak the local Chinese dialect of their locale. The Hui claim descent from Central Asians and Persians who arrived in China around 1,000 years ago. But the vast majority of their genomes are no different from the Han Chinese. Physically they are impossible to distinguish from Han Chinese unless you take note of their attire.

How can that be when they are so culturally different? For example, as Muslims the Hui do not eat pork and consider it unclean. In contrast, for the majority Han pork is a dietary staple.

Imagine that the Central Asian ancestors of the Hui arrived in China 1,000 years ago. The historical record suggests that is roughly correct. If each generation is 25 years long, that’s 40 generations. Since the population of Muslims is small in comparison to the native Chinese, we can ignore the effect on the latter, while focusing on the former. If on average 1 out of 20 marriages within the Hui community per generation was between a Han Chinese and a Muslim, after 1,000 years roughly 87% of the ancestry within the Muslim community would be traceable to Han Chinese ancestors, since only 0.95^40, or about 13%, of the original ancestry would remain. Even though in each generation the overwhelming majority of marriages were within the Muslim community, over time the genetic distinctiveness of the Muslims would diminish.
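The arithmetic here can be sketched in a few lines of Python. This is a minimal illustration under a deliberately simple model that the post does not spell out: each generation a fixed fraction of the community's ancestry (here 1 in 20) is replaced by Han ancestry, with no gene flow in the other direction. The function name and parameters are hypothetical.

```python
def outside_ancestry(intermarriage_rate: float, generations: int) -> float:
    """Fraction of ancestry from the majority population after repeated mixing.

    Assumes (for illustration only) that each generation a fraction
    `intermarriage_rate` of the community's ancestry is replaced by
    majority ancestry, with no flow in the other direction.
    """
    original_remaining = (1.0 - intermarriage_rate) ** generations
    return 1.0 - original_remaining


years = 1000
generation_length = 25
generations = years // generation_length  # 40 generations

share = outside_ancestry(1 / 20, generations)
print(f"{generations} generations -> {share:.1%} majority ancestry")
```

Under these assumptions the share comes out a little above 87%, close to the figure quoted in the text. Halving the intermarriage rate or the number of generations changes the answer substantially, which is a reminder that the result is sensitive to the inputs.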

The lesson is that even a small degree of intermarriage can even out the differences between groups. Similarly, in population genetics one individual moving between two groups per generation is enough to prevent them from becoming distinct. In small populations, which diverge fast, one individual is a substantial proportion of a population. In large populations the divergence is going to be much slower, so even one individual is enough.
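The “one migrant per generation” rule of thumb is usually traced to a classic approximation from Wright's island model, in which expected differentiation between groups at equilibrium (F_ST) is about 1 / (4Nm + 1). Here N is the population size and m the migration rate, so Nm is the number of migrants per generation. The sketch below assumes that idealized model, which the post does not name, and the function name is hypothetical.

```python
def expected_fst(migrants_per_generation: float) -> float:
    """Approximate equilibrium F_ST under Wright's island model.

    Uses F_ST ~ 1 / (4 * N * m + 1). Only the product N * m (migrants
    per generation) appears, not the population size on its own.
    """
    return 1.0 / (4.0 * migrants_per_generation + 1.0)


for nm in (0.0, 0.1, 1.0, 10.0):
    print(f"migrants per generation = {nm:>4}: F_ST ~ {expected_fst(nm):.2f}")
```

Because only the product of N and m matters in this approximation, the same single migrant has a similar homogenizing effect whether the populations are small or large, which is the point made above.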

An Andaman Islander

So how do populations remain genetically distinct if mixing and homogenization are so easy? The simplest way is geography. Consider the Andamanese. These slim and dark-skinned people are the natives of the Andaman Islands, in the middle of the Bay of Bengal. To the knowledge of archaeologists and historians these people have been hunter-gatherers since time immemorial, the only verified continuous such tradition in all of Asia.

The Andamanese likely arrived in the islands during the Pleistocene, when sea levels were lower, and the Andaman Islands were much more accessible from the Southeast Asian mainland. But over the past 10,000 years, as much of the world adopted agriculture, and population turnover occurred in South Asia and Southeast Asia, the Andaman Islands remained relatively untouched due to their isolation.

But it wasn’t just geography. Over the past 2,000 years the Indian Ocean has become a major thoroughfare of trade and travel. The Andaman Islands were on a route between India and Southeast Asia. Because of this fact they were often a convenient stopping point to refresh water supplies. But these traders never settled the islands. The local people had a habit of attacking any vessel which tarried too long in their waters.

Pygmies from Central Africa

Unlike many animals, humans have complex and evolving cultural practices. The Andaman Islanders discouraged contact with outsiders by maintaining a savage and hostile reputation.

But other groups have remained genetically distinct through symbiosis rather than separation.

The Pygmy peoples of Central Africa are distinguished from their neighbors by their small stature and hunter-gatherer lifestyle. But they invariably speak the languages of their neighbors. Anthropologists have observed that Pygmies and the farmers who they live nearest to seem to exist in some form of interdependence. Hunter-gatherers can obtain resources from the deep rainforest inaccessible to farmers, while the farmers offer the Pygmy people goods which they themselves could not produce.

African farmers and hunter-gatherers have lived in close proximity for over 2,000 years, and yet the Pygmies remain different physically and genetically from their neighbors. Some mixing has occurred, but the Pygmies are as much a separate caste as a different people. Their lifestyle is so different that farmers and Pygmies view each other as profoundly alien and peculiar, despite speaking the same language and occupying nearby geographical space.

Roma in Romania

Isolation then can be both a physical and psychological phenomenon. Some groups, such as the Andamanese, are physically separated from other humans. They add cultural adaptations which reinforce this separation. Others, such as the Pygmies, or the Roma of Europe, are culturally very distinct, and occupy a specific role in the social ecology of their region. In both cases the isolation is strong enough to result in genetic differences between populations of the majority and the isolate.

In many cases these populations are not so isolated in the modern age. In the Andaman Islands most of the tribes now interact with settlers from the Indian mainland. Only the people of North Sentinel Island remain truly isolated and cut off from the rest of the world. Meanwhile, the Pygmy people of Central Africa have been caught up in the massive civil wars that have wracked that region of the world since the 1990s. In other cases, as with the indigenous Negrito people of the Philippines, their biological and cultural assimilation into the dominant Austronesian mainstream is proceeding to such an extent that they may no longer be a distinctive people by the end of the 21st century.

For many peoples the 21st century will be the twilight of their solitude, as they merge into the world.


Genetics of uniqueness was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

June 20, 2018

Three drinks for the ages

Filed under: Alcohol,Coffee,Genetics,milk,science — Razib Khan @ 9:33 pm
Irish Coffee

The “Irish coffee” is a delicious concoction. Coffee, alcohol, and dairy. What more can you ask for? Man does not live on bread and water alone. Cafes and bars are thick on the ground in large cities, but also grace country roads. Coffee and alcohol are congenial to conviviality among settled peoples, while milk is the staff of life for many pastoralists, consumed raw or turned into cheese.

Of the three, coffee is new on the scene, discovered within the past 1,000 years. The consumption of milk, whether raw or as cheese, goes back to prehistory. But on the geological scale it is a recent cultural development. In contrast, the imbibing of alcohol in some form is probably as old as humanity itself, albeit not as a pint in the pub.

Alcohol is produced naturally by the fermentation process, a metabolic pathway which is far more ancient than the oxygen metabolism that has been dominant for the past few billion years. Humans are omnivores, and our ancestors consumed overripe fruit which had fermented to the point of producing alcohol as a byproduct. Meanwhile, “good bacteria” in our guts also produced alcohol.

This is not a bad thing. Alcohol is nutritious in that it provides calories.

Though in modern societies we “count our calories”, and the richness of a deep and dark beer is not always a selling point, for the vast majority of our species’ history those calories were a feature, not a bug.

Early civilization ran on beer. The Sumerians even had a goddess of beer, Ninkasi. The workers who built the pyramids of Old Kingdom Egypt were given rations of beer. In other words, the wonders of the ancient world were fueled by alcohol!

And this is not just forgotten history. Until very recently much of the world was awash in alcohol, whether it be beer, wine, or various distilled spirits. Public and private drunkenness was one of the major reasons behind the emergence of the American “temperance” movement. Though Prohibition was deemed a failure, American alcohol consumption has never recovered to its earlier highs.

One of the reasons that Americans, and many other peoples, drank so much is that alcoholic beverages not only provided calories, but were often more potable than ordinary water. Ancient humans in hunter-gatherer bands did not have to contend with cholera, but the first village societies, and those who lived in early modern cities, lacked modern sanitation. Safe drinking water was one of the major achievements of 20th century engineering, and obviated the role that alcohol had traditionally played in quenching the thirst of the common man.

But alcohol is not a matter just of history, biochemistry and engineering. Humans differ in their ability and capacity to metabolize alcohol due to variation in their genes, in particular ADH and ALDH2. The ADH genes produce enzymes which break down alcohol for processing by later biochemical steps, one of which is catalyzed by the product of the ALDH2 gene.

If you’ve ever seen someone with the flushed face characteristic of having had too much to drink, they may have a mutation on ALDH2 which means that they don’t process acetaldehyde very well. As the cells build up acetaldehyde, a host of physiological reactions kick in. Research has shown that those who exhibit these reactions are much less likely to be alcoholic.

In contrast, those with mutations on ADH tend to process alcohol very well indeed. But in the process they produce more acetaldehyde than the body can handle, resulting in physical discomfort. And similarly to the ALDH2 mutation these individuals are less likely to become alcoholic.

Genetic variation in the ability to process alcohol is a consequence of the long history of human omnivory. In contrast, the evolutionary history around our consumption of milk is much more straightforward and strange. For the vast majority of our species’ existence adults have not had the ability to digest milk sugar, lactose. This is a characteristic we share with all other mammals. The adaptive reason for this is likely that it encourages and forces weaning, so that mothers can bear other offspring.

And yet a minority of human adults today can digest milk. How? Why? The LCT gene produces the enzyme lactase, and mutations that affect the regulation of this gene allow humans in Europe, parts of Southern Asia, East Africa and the Near East to continue to drink milk into adulthood. Over the past 5,000 years unique mutations in Europe and South Asia, in Arabia, and in Africa, have all been strongly selected.

In Denmark the mutant allele is now at frequencies as high as 90%.

Ancient DNA tells us that the ability to digest milk sugar into adulthood did not arise with agriculture and sedentary lifestyles. It is not implausible that Neolithic people who domesticated goats and sheep fermented milk to produce cheeses, where the sugar was broken down to make it more palatable. But the adaptation to a predominantly dairy dependent lifestyle only emerged with full-blown pastoralism, over the past 4,000 years. The earliest pastoralists on the Bronze Age Eurasian steppe carried the lactase persistent genetic variant, but only at low frequencies.

Dairy is an essential part of the modern food pyramid, at least for the USDA. But perhaps it tells us more about our evolutionary present than the evolutionary past. So often we talk about evolution as a dynamic of the deep past. But with lactose tolerance we see evolution as a process which is just initiating.

Finally, there is coffee. Though variation in CYP1A2, a cytochrome P450 gene, affects how fast caffeine is metabolized, coffee is such a recent cultural invention that it is unlikely that there are any adaptive dynamics related to it on a genetic level. Rather, CYP1A2 is a locus which controls processes designed to cope with toxic chemicals by breaking them down. Caffeine in some ways is such a chemical, and those who metabolize it fast need to drink more coffee to feel its effects than those who have less efficient metabolization.

The effect of caffeine on humans is literally a product of inefficiencies in bodily detoxification.

Milk nourishes. Alcohol both nourishes and alters the mental state of those who imbibe it. In contrast, caffeine does not nourish, but stimulates. For the past few million years our species likely never interacted with caffeine, but we were pre-adapted because of our consumption of a wide range of plants which manufacture chemical defenses.

The legend of coffee dates back 1,000 years, when an Ethiopian goatherd saw one of his animals behave strangely after eating a coffee plant. Within the next five hundred years coffee beans were cultivated across the hillocks of the lands around the Red Sea, from Ethiopia to Yemen, and became part and parcel of Islamic culture. To this day the coffeehouse is a major social and cultural nexus in the Middle East, though colonialism has taken it far afield, from Java to Colombia.

By the Renaissance coffee had reached Europe, and the proliferation of coffeehouses, and their stimulative effects, may have triggered the early modern Enlightenment intellectual revolution. While alcohol softens and dims the outlines of the world around you, coffee is a stimulant which sharpens our perceptions and accelerates our cognitive pace.

Coffee, alcohol, and milk are such central aspects of modern culture that it is hard to imagine our existence without them. Though there is genetic variation in how we can process them, their relevance to our lives transcends biology, and extends to economics, history, anthropology, and in the case of wine, religion. Though they may not be the ambrosia of the gods, modern civilization arguably stands on the shoulders of these beverages.

Wondering if you are lactose tolerant based on your genetics? Check out Metabolism by Insitome.


Three drinks for the ages was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

May 25, 2018

Arise the coalescent!

Filed under: Biology,Evolution,Genetics,science — Razib Khan @ 12:14 pm
Citation: Modeling Human Population Separation History Using Physically Phased Genomes

Evolution is sometimes difficult to comprehend in terms of how it plays out in your mind’s eye. This is different from believing that evolution occurred. Evolutionary ideas were in the air when Charles Darwin and Alfred Russel Wallace both developed a theory of morphological change and speciation driven by adaptation in the middle of the 19th century. Their genius was introducing natural selection as the motive force underlying the change. But both of these thinkers lacked a true mechanism of heredity, so the formal extension of the field was hobbled.

With the emergence of genetics in the years after 1900, evolutionary science developed into a new and powerful form, what we now call the “Neo-Darwinian Synthesis.” This project combined the descriptive richness of natural history, the explanatory power of classical conceptual Darwinism, and the formal precision of population genetics.

The Neo-Darwinian Synthesis rests to a great extent on population genetic models. The most elementary of those models is that of the Hardy-Weinberg Equilibrium (HWE), which describes a large random-mating population not subject to selection or drift. Deviations from the conditions of these models allow us to understand the processes that are occurring in specific populations.
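For a gene with two variants, the Hardy-Weinberg expectation is simple enough to write down directly: if one allele has frequency p and the other q = 1 − p, the three genotypes are expected at frequencies p², 2pq, and q². The short Python sketch below illustrates this; the allele labels "A" and "a" and the function name are placeholders chosen for the example.

```python
def hardy_weinberg(p: float) -> dict[str, float]:
    """Expected genotype frequencies when allele A has frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p  # frequency of the other allele, a
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


expected = hardy_weinberg(0.7)
for genotype, freq in expected.items():
    print(f"{genotype}: {freq:.2f}")
```

Comparing observed genotype counts against expectations like these is one way a deviation from the model's conditions shows up in real data.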

In the lab, researchers use matings between organisms such as Drosophila that deviate from the assumptions underneath the models, and see what the outcomes are. Scientists mate together flies with similar or dissimilar traits, violating random mating. They select individuals based on their characteristics, or collapse reproductive pedigrees down to a family lineage to explore inbreeding, introducing selection and random genetic drift.

But laboratory research can be both time intensive and tedious. With the rise of powerful computing tools in the last half of the 20th century scientists realized that they could simulate outcomes of their models. Just like in an experiment, researchers could change the conditions, the parameters, and see the effect on the final outcome!

State of the art simulator, 1985

In the beginning, the power of simulations and computing seemed almost magical to researchers. No more time intensive sampling in the field, or expensive construction of laboratory facilities.

But over time, they began to realize that simulations also have their limitations. Computer memory and disk space cost money too, and scientists quickly found that the law of scarcity was not abolished. They couldn’t explore infinite possibilities because infinite took forever, even in a computer.

Imagine that you start with a few hundred individuals and simulate them randomly “mating.” You stipulate that their population grows 2% every generation. After 100 generations, your population size is about 7 times larger. The possible number of “mates” in your program is now about 7 times greater, and there are so many more possible interactions. Anyone who has tried to work with large files knows that computing resources are finite, and simulations running forward in time run into the limits of that finitude soon enough.
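The growth in this thought experiment is plain compound interest, and it is easy to check. The sketch below assumes strict compounding at 2% per generation and an arbitrary starting size of 300 (standing in for “a few hundred”); both helper functions are hypothetical names used only for this example.

```python
import math


def growth_factor(rate_per_generation: float, generations: int) -> float:
    """How many times larger the population is after compounding."""
    return (1.0 + rate_per_generation) ** generations


def generations_to_reach(multiple: float, rate_per_generation: float) -> int:
    """Smallest whole number of generations needed to reach `multiple`."""
    return math.ceil(math.log(multiple) / math.log(1.0 + rate_per_generation))


start = 300  # assumed starting size, not a figure from the text
factor = growth_factor(0.02, 100)
final_size = round(start * factor)

print(f"after 100 generations: x{factor:.2f}, about {final_size} individuals")
print(f"possible pairs at the start: {math.comb(start, 2)}")
print(f"possible pairs at the end:   {math.comb(final_size, 2)}")
print(f"generations to reach 10x:    {generations_to_reach(10, 0.02)}")
```

With strict compounding, 2% growth gives roughly a sevenfold increase after 100 generations and reaches tenfold after 117. The number of possible pairings grows much faster than the head count, since it scales with the square of the population size.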

But what if you moved back in time? Imagine you began with 10,000 individuals, and traced the ancestors of these 10,000 back across the generations. Genealogies can be complicated. But consider a single gene copy in your body, and compare it to another copy in another person. At some point in the distant past, the two copies share a common ancestor — they coalesce.

The coalescent sounds science fictional, but really it’s just a way to work backward from the genetic data you have now, to the past. You can create a tree of relationships back into the distant past, reversing direction with a genetic time machine. And the beauty of the coalescent from the perspective of 2018 is that computationally it is much more feasible to work back in time. With each step, you have fewer and fewer branches in the genealogy to model — back to a single common ancestor.
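To make the backward-in-time idea concrete, here is a minimal Python sketch of the standard textbook (Kingman) coalescent for a sample of gene copies. It assumes a constant-size, randomly mating population and measures time in units of 2N generations; those assumptions, along with the function names, are choices made for this illustration rather than details from the post.

```python
import random


def simulate_tmrca(sample_size: int, rng: random.Random) -> float:
    """Time to the most recent common ancestor, in units of 2N generations.

    While k lineages remain, the waiting time until two of them merge
    is exponential with rate k * (k - 1) / 2.
    """
    time = 0.0
    lineages = sample_size
    while lineages > 1:
        rate = lineages * (lineages - 1) / 2
        time += rng.expovariate(rate)
        lineages -= 1  # two lineages merge into one ancestor
    return time


def expected_tmrca(sample_size: int) -> float:
    """Theoretical mean of the quantity simulated above: 2 * (1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / sample_size)


rng = random.Random(2018)  # fixed seed so the run is repeatable
n = 50
runs = [simulate_tmrca(n, rng) for _ in range(10_000)]
mean_tmrca = sum(runs) / len(runs)

print(f"simulated mean TMRCA: {mean_tmrca:.2f}")
print(f"theoretical mean:     {expected_tmrca(n):.2f}")
```

Each simulated genealogy takes only n − 1 steps no matter how large the underlying population is, because the code tracks the sampled lineages rather than every individual. That is the computational saving described above.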

Instead of being overwhelmed by computational tasks, the coalescent converges upon the elegant simplicity of the last common ancestor, bringing together late 20th century mathematics, 21st century computing, and the original conceptual insight of Darwin and Wallace of common descent.

Explore your Regional Ancestry story today.


Arise the coalescent! was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

May 16, 2018

The tribe more diverse than all of Asia!

Filed under: Africa,Genetics,science — Razib Khan @ 1:17 pm
Bushman

In 2010, a paper that sequenced the whole genome of Archbishop Desmond Tutu revealed that two San Bushmen from different tribes in South Africa differ more from each other genetically than a European differs from an East Asian. In other words, two San Bushmen from different populations within South Africa are more genetically distinct than a Chinese person is from a British person.

A follow-up paper from 2014 revealed that over the course of human history, the San Bushmen in fact had the “largest population” of any modern group. This seems surprising and ridiculous. There are over 1 billion Han Chinese, and only 100,000 San Bushmen.

Citation: Khoisan hunter-gatherers have been the largest population throughout most of modern-human demographic history

How can this be?

The first thing to keep in mind is that we’re talking about genetic diversity. When you look at the genome of a San Bushman, it’s a lot more genetically diverse than that of a Han Chinese individual. A typical San Bushman has more than 4 million genetic variants (SNPs), while a typical Chinese person has just over 3 million. This difference reflects population history.

One of the major keys to solving this mystery is to remember that the “Out of Africa” migration imposed a bottleneck on all non-African populations. That means that 50 to 100 thousand years ago, a small group of humans were the ancestors of all groups outside of Africa. All non-African populations exist in the shadow of this bottleneck, from the over 1 billion Han Chinese to a few hundred tribesmen in the Amazon.

In contrast, the ancestors of Africans did not experience such a bottleneck.

The number of genetic variants that Africans carry did not decrease due to an “Out of Africa” bottleneck, and of all the people of Africa, the San Bushmen seem to have occupied a wide zone of southern Africa in their current state since time immemorial. This stability has left an imprint on their genome, which is more genetically diverse than that of any other human group.

If you could use a time machine to count the number of people in the groups of humans which gave rise to the San Bushmen, they would always be larger than the small migration “Out of Africa.” This is why a tribe of San Bushmen has more genetic diversity than billions of Asians!
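The effect of a bottleneck on diversity can be illustrated with the classic population-genetic result that heterozygosity decays by a factor of (1 - 1/(2N)) per generation in a population of effective size N. The sketch below ignores new mutations, and the population sizes and duration are hypothetical round numbers chosen for illustration, not estimates from the papers cited above.

```python
def heterozygosity_after(h0, n_eff, generations):
    """Expected heterozygosity after drift alone (no new mutation)."""
    return h0 * (1 - 1 / (2 * n_eff)) ** generations


H0 = 1.0                 # starting diversity, expressed as a fraction of itself
GENERATIONS = 2000       # hypothetical duration
BOTTLENECK_SIZE = 1000   # hypothetical small founding group
STABLE_SIZE = 20000      # hypothetical large, stable population

after_bottleneck = heterozygosity_after(H0, BOTTLENECK_SIZE, GENERATIONS)
after_stable = heterozygosity_after(H0, STABLE_SIZE, GENERATIONS)

print(f"small population retains {after_bottleneck:.1%} of its diversity")
print(f"large population retains {after_stable:.1%} of its diversity")
```

What matters for diversity is the population size over long stretches of the past, not the census count today, which is how a small group with a long history of being large can outrank a huge group descended from a small one.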

Interested in learning where your ancestors came from? Check out Regional Ancestry by Insitome to discover various regional migration stories and more!


The tribe more diverse than all of Asia! was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

May 9, 2018

The “X” in the sex chromosome

Filed under: Genetics,Genomics,mothers-day,science — Razib Khan @ 3:48 pm

There are ~3 billion base pairs in the human genome. Of that, ~5% is on the X chromosome. The X is fully functional, unlike the famously hamstrung Y. It harbors one of the longest genes in the human genome, DMD, at 2,300,000 base pairs. In contrast, the human Y chromosome only has 72 protein coding genes! (It’s perhaps no surprise that, aside from sex determination, many of these genes are involved in things such as spermatogenesis.)

And yet it is the Y chromosome which gets full treatment in popular science books. Like the C student who receives praise for a B-, the Y chromosome is given high marks simply for doing a few things here and there, most especially its role in driving the emergence of biological males. But the reality is that males would not be viable if it wasn’t for the X.

Can you see that it says 74?

Because the Y chromosome is so handicapped, filled with repetitive “junk DNA,” the heavy-lifting is shifted onto the single X that males carry. Though the Y is what makes males male, the X is what keeps males alive.

Anyone familiar with sex-linked characteristics knows this. Red-green color blindness is found in 8 percent of human males and 0.6 percent of human females. Many more women are carriers of color blindness than are color blind themselves.

The genes responsible for detection of some colors are found on the X chromosome, and are subject to high mutation rates. If a female has a broken copy she usually has a fallback in a functional second copy. She’s a carrier. In contrast, because males have only one X chromosome (inherited from their mother), they don’t have a backup. If a color-vision gene on the X chromosome is broken, then they’re out of luck when it comes to perceiving the full vibrancy of the world.
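The male and female figures fit together under a simple model. As a rough sketch, assume a single X-linked recessive variant and random mating (Hardy-Weinberg proportions); real color blindness involves more than one gene, so this is an approximation. Because males carry one X, the male frequency of the trait equals the frequency q of the broken copy.

```python
# Frequency of the broken copy, read directly off the male rate (one X each).
q = 0.08
p = 1 - q

# Females need two broken copies to be color blind.
affected_females = q ** 2
# Females with exactly one broken copy are carriers.
carrier_females = 2 * p * q

print(f"color blind males:   {q:.1%}")
print(f"color blind females: {affected_females:.2%}")
print(f"carrier females:     {carrier_females:.1%}")
```

Under these assumptions about 0.6 percent of women are color blind while nearly 15 percent are carriers, which matches the pattern described above.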

In other words, the male X chromosome does not possess recessive traits. Every trait is expressed according to the state of the single copy of the gene that determines it. Every mutation on the X chromosome can potentially produce a mutant that will be exposed to natural selection.

Neanderthal-modern human hybrid

This results in some interesting evolutionary quirks when it comes to how natural selection shapes the genome and drives adaptation within populations and speciation between them. Crosses between different species can leave hybrids infertile. In mammals this often happens in males because mutations on the X chromosome can interfere with proper reproductive development. Selection against the genes of other species then happens because males can’t produce offspring.

Studies of Neanderthal admixture confirm this — there is far less Neanderthal ancestry on the X chromosome than across the rest of the genome. There is strong selection against Neanderthal variants in males, because these genes work less well with the rest of the modern human genome.

A wife of Genghis Khan

But the X chromosome is not distinctive just in terms of natural selection. As two out of three X chromosomes in any population are found in females, its genetic history will be biased toward that sex. Differences between the X chromosome and the non-sex genome can tell us about differences in the histories of men and women.

For example, historically many more of the female ancestors of admixed people of the New World tended to be non-European, whether indigenous or African. As such, the genetic profile of the X chromosome in terms of similarity to worldwide variation would be different from the non-sex chromosomes, because those come equally from the father and mother. This is exactly what we see. There is less European ancestry on the X chromosome.
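The expected gap between the X chromosome and the rest of the genome follows from who carries which chromosomes. The sketch below uses the common long-run approximation that two thirds of X chromosomes trace to female ancestors and one third to male ancestors, while non-sex chromosomes trace equally to both. The founder proportions are hypothetical numbers picked only to show the direction of the effect.

```python
def autosomal_ancestry(male_fraction, female_fraction):
    """Expected ancestry on non-sex chromosomes: half from each sex."""
    return (male_fraction + female_fraction) / 2


def x_ancestry(male_fraction, female_fraction):
    """Expected ancestry on the X: one third male, two thirds female."""
    return (male_fraction + 2 * female_fraction) / 3


# Hypothetical founders: 80% of male ancestors European, 20% of female ancestors.
EUROPEAN_AMONG_MALES = 0.8
EUROPEAN_AMONG_FEMALES = 0.2

auto = autosomal_ancestry(EUROPEAN_AMONG_MALES, EUROPEAN_AMONG_FEMALES)
x_chrom = x_ancestry(EUROPEAN_AMONG_MALES, EUROPEAN_AMONG_FEMALES)

print(f"European ancestry, non-sex chromosomes: {auto:.0%}")
print(f"European ancestry, X chromosome:        {x_chrom:.0%}")
```

With these made-up inputs the non-sex chromosomes come out 50% European and the X only 40%, so a lower European share on the X is the signature of sex-biased admixture.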

More generally, mating systems such as polygyny (men having multiple female partners) result in far fewer males than females who contribute to future generations. Among Mongols during the era of Genghis Khan, a small number of males descended from Genghis and his Mongol horde had children with numerous women. Because X chromosomes tend to be found in women, more of whom are reproducing, they will be more diverse than non-sex chromosomes (where a few men contribute half the genes), while the Y chromosome will be the least diverse of all (where only a few men contribute genetic variation).
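Standard population-genetic formulas for effective population size make this ranking concrete. This is a sketch under textbook assumptions (random mating apart from the sex ratio, no selection); the breeding numbers are hypothetical. Note that under these formulas the X only overtakes the non-sex chromosomes when breeding females outnumber breeding males by more than about seven to one.

```python
def ne_autosome(n_males, n_females):
    """Effective size for non-sex chromosomes with unequal sex ratio."""
    return 4 * n_males * n_females / (n_males + n_females)


def ne_x(n_males, n_females):
    """Effective size for the X chromosome."""
    return 9 * n_males * n_females / (4 * n_males + 2 * n_females)


def ne_y(n_males):
    """Effective size for the Y chromosome (passed only through males)."""
    return n_males / 2


# Hypothetical breeding populations with the same total of 1,000 parents.
equal = {"autosome": ne_autosome(500, 500), "X": ne_x(500, 500), "Y": ne_y(500)}
polygyny = {"autosome": ne_autosome(100, 900), "X": ne_x(100, 900), "Y": ne_y(100)}

print("equal sex ratio:", equal)
print("strong polygyny:", polygyny)
```

With equal numbers of breeding men and women, the X has three quarters of the autosomal effective size. With nine breeding women per breeding man, the X edges ahead of the autosomes and the Y collapses to a small fraction of either.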

Men have only one X chromosome, but the one they have is genetically essential to them. X chromosomes are not exclusive to women, but for all males they are the singular legacy of their mothers. Because of this bias the X can shed light on the history of the women of our species, while the unique inheritance of the X chromosome may even extend to driving the emergence of our species.

Explore your Neanderthal story today.


The “X” in the sex chromosome was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

April 25, 2018

DNA, from genetics to genomics

Filed under: DNA,Genetics,Genomics,science — Razib Khan @ 11:59 am

In the early 1950s scientists established that the molecular structure of DNA was a double helix. They had discovered the physical substrate of heredity. With this discovery the field of molecular genetics was born (and eventually a Nobel Prize given!).

And yet we also know that Gregor Mendel discovered the laws of heredity, the “law of segregation” and the “law of independent assortment”, nearly a century before the discovery of DNA.

It was literally the product of a garden.

The mature field of genetics itself developed fifty years before the discovery of the structure of DNA, as a host of scientists stumbled upon Mendelian insights simultaneously. Most were biologists who worked with plants, flies, or even algebra — no need for a powerful microscope or structural models of molecules.

Though DNA has been the key to many of the discoveries of the past fifty years, it is important to remember that the field of genetics is predicated on an abstract understanding of how inheritance works across pedigrees, as opposed to the biophysical basis of that transmission. Before DNA, before chromosomes, what Mendel and his heirs understood is that inheritance occurs through a process where discrete units of heredity, “genes”, are passed down from generation to generation.

These genes usually come in two copies, ‘alleles,’ for many organisms.

Recessive expression patterns of a trait, where parents do not express a characteristic found in their offspring, become comprehensible when a Mendelian model is adopted. Prior to this many had an intuitive “blending” understanding of inheritance, where the characteristics of the parents mixed together to produce offspring. The ultimate problem with blending inheritance is that it had difficulty in explaining how variation persisted over time. A problem solved by the Mendelian insight that genetic variation never disappeared…it simply rearranged itself every generation!
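The recessive pattern is easy to reproduce by enumerating a cross. In this minimal sketch each parent is written as a two-letter genotype, with "A" standing for a dominant allele and "a" for a recessive one; the notation and function name are illustrative conventions rather than anything from the original experiments.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Return offspring genotype frequencies for a single-gene cross.

    Each parent passes on one of its two alleles with equal probability.
    """
    counts = Counter(
        "".join(sorted(allele1 + allele2))
        for allele1, allele2 in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Two carrier parents who show the dominant trait themselves.
offspring = cross("Aa", "Aa")
print(offspring)
```

Two parents who both look "dominant" still produce offspring with the recessive "aa" genotype a quarter of the time, and the "a" allele is never diluted away, only reshuffled.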

Genetics was born on the backs of Drosophila

Between the reemergence of Mendelian thought around 1900 and the discovery of DNA in the 1950s much research occurred in the field of genetics. The Neo-Darwinian Synthesis built upon the mathematical foundations of population genetics, which took the Mendelian framework and formalized and extended it, to create a model of evolutionary biology for the 20th century. Medical geneticists began to understand the patterns of inheritance of rare diseases in humans with the aim of preventing illness. Those researchers working with fruit flies discovered many of the phenomena which define modern genetics, such as recombination. Finally, biochemists established that heredity and nucleic acids were intimately connected.

Just as an understanding of the discrete basis of inheritance in a Mendelian framework opened up the systematic scientific study of heredity, so the understanding of the double helical structure of DNA paved the way for the molecular revolution of the second half of the 20th century, and the genomic revolution of the 21st. An understanding of DNA as the mode of inheritance allowed for the development of techniques that traced transmission of variation at the level of genes themselves, as opposed to expressed traits.

Illumina sequencing machine

And while in the 20th century we spoke of genetics, and specific genes, today we speak of genomes and the whole set of genes organisms possess. That revolution can not be understood without the knowledge of DNA as the mode of inheritance. If classical Mendelian genetics is pattern recognition across pedigrees, 21st century genomics is a synthesis of classical genetics, post-DNA era biophysics, and cutting-edge computing. Genomics is as much engineering as it is science; and “big data” as much as information theory.

The understanding of DNA created the world where genetics transformed itself from an esoteric science of probabilities, to a mass market product of possibilities.

Classical genetics tells you that your relatedness to your brother or sister is expected to be 0.50. Modern genomics might tell you that your relatedness to your brother or sister is shared across 46.24% of your genome. A fuzzy probability becomes a crisp reality. As a science, genetics can be imagined without DNA. It was born and matured decades before we understood the importance of the double helix, but as a part of our lives, one can’t imagine genetics without DNA.
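Why a measured value can differ from the expected 0.50 is easy to see in a toy simulation. The sketch below chops the genome into a fixed number of independently inherited segments, which is a crude stand-in for real recombination; the figure of 80 segments is an assumption made only for illustration. For each segment and each parent, two siblings receive the same parental copy half the time.

```python
import random
import statistics


def sibling_relatedness(n_segments, rng):
    """Fraction of parental copies two siblings share, segment by segment."""
    shared = 0
    for _ in range(n_segments):
        for _parent in range(2):       # mother's and father's contribution
            if rng.random() < 0.5:     # both siblings got the same copy
                shared += 1
    return shared / (2 * n_segments)


rng = random.Random(42)
N_SEGMENTS = 80  # hypothetical number of independent genome segments
values = [sibling_relatedness(N_SEGMENTS, rng) for _ in range(2000)]

print(f"mean relatedness: {statistics.mean(values):.3f}")
print(f"range observed:   {min(values):.3f} to {max(values):.3f}")
```

The average across many sibling pairs sits at 0.50, but any one pair lands a few percentage points above or below it, which is the kind of specific number a genomic test reports.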

Learn more about where your traits for food tolerance fall on the spectrum and explore your Metabolism story today.


DNA, from genetics to genomics was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

April 19, 2018

Brown fat, the bad kind

Filed under: Obesity,science — Razib Khan @ 2:50 pm

Unless you have been hiding under a rock you know that people of South Asian ancestry are at greater risk for metabolic disease than is the norm. More concretely we tend toward “skinny fat.”

My current BMI is 24. By normal calculators I’m normal weight (barely), because the cut-off is 25. But as South Asians we should be worried if we’re above 23.
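For anyone who wants to check their own number, BMI is weight in kilograms divided by height in meters squared. The small sketch below compares a result against the two cut-offs mentioned above; the example weight and height are made up, and the function names are only illustrative.

```python
STANDARD_CUTOFF = 25.0     # usual overweight threshold
SOUTH_ASIAN_CUTOFF = 23.0  # lower threshold mentioned above


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms over height in meters squared."""
    return weight_kg / height_m ** 2


def above_cutoff(value, cutoff):
    """True if the BMI value exceeds the given cut-off."""
    return value > cutoff


# Hypothetical person: 70 kg and 1.70 m tall.
example = bmi(70, 1.70)
print(f"BMI: {example:.1f}")
print("above standard cut-off:   ", above_cutoff(example, STANDARD_CUTOFF))
print("above South Asian cut-off:", above_cutoff(example, SOUTH_ASIAN_CUTOFF))
```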

There is the caveat that muscle is heavier, so one shouldn’t take BMI literally, as opposed to seriously. You know if you have too much visceral fat; you don’t need to weigh yourself. The phenomenon of brown guys with big bellies due to years of self-indulgence is a thing. And excess weight among South Asians who reach a certain affluence level seems a thing the world over.

So here’s a question: for those of you who have managed to keep the weight off and stay trim, how do you do it? Exercise? Diet? Both?

April 18, 2018

The braided estuary of human evolution

Filed under: Genetics,Human Evolution,paleontology,science — Razib Khan @ 2:22 pm

Metaphors matter because they evoke images, and images are often one of the best ways to understand something in a deep fashion. Consider Charles Darwin’s musing:

“It is interesting to contemplate a tangled bank…

He brought something memorable and familiar to make evocative the dynamics at play in his novel theory of evolutionary change through natural selection. The tangled bank has haunted us for over 150 years, though as a friendly apparition to be sure.

Other metaphors are less useful, and even downright destructive. The great chain of being hooks into deep human intuitions about our “special” place in nature and centrality in the universe. Where we were previously “made in the image of our Creator,” in the 19th century science confirmed people’s expectations that modern humans are the pinnacle of evolution and the end of a long process of change; from the slouching ape, to the shuffling caveman, and finally, to the upright and thinking man.

The earlier view of Neanderthals was typical and illustrative of where we once were. Originally relegated to a primitive dead-end of our family tree, Neanderthals were depicted as bestial half-men at best. As late as the 2000s many researchers, such as the influential paleoanthropologist Richard Klein, doubted that humans and Neanderthals could produce offspring. There was skepticism from these quarters that Neanderthals could speak, or that they even used fire!

With the confirmation through myriad genetic analyses published from 2010 onward that in fact humans outside of Africa carry 1–2% Neanderthal ancestry, a transformation occurred in our perceptions of our cousins…or rather, our ancestors.

Clearly our understanding of human evolution is conditioned by our cultural preconceptions, our biases. Evolutionary biologists have long warned of the tendency to see in the “tree of life” directionality or purpose, but in the public’s mind the purpose of the universe is manifested in our own lineage. All of the pitfalls that we’ve attempted to avoid when considering evolutionary biology became stark and endemic in the study of humanity.

Unfortunately, paleoanthropology often fed into this narrative because of the paucity of remains. There was very little data, and an empire of theory and supposition cropped up in its place. The prominence of superstar researchers and their associated singular remains, Raymond Dart and the Taung child, Richard Leakey and the Turkana boy, and Donald Johanson and Lucy, highlighted the almost artisanal quality of the field.

As a result of only a few individuals being able to analyze the material evidence for the evolution of our own species, we eventually assembled a relatively neat ascending tree, with a few stray side branches. Like modernist architecture, paleoanthropology constructed a spare and elegant scaffold within which to understand the emergence of what we call humanity. Our story was simple, singular, and implicitly progressive. All paths led to us.

But just as genetics has changed our understanding of the origins of our species, so paleoanthropology itself is undergoing a revolution of sorts because of the veritable flood of data. Remains.

At the end of 2013 I happened to be present when Lee Berger, a South African paleoanthropologist, presented work that reported on a deep cave where copious remains of a new hominin, Homo naledi, were being assembled and analyzed. Whereas previous researchers often focused on fragments, or the skeleton of a single individual, Berger explained that many remains were to be found in the cave system. This was going to be statistically sound science, because he had much more than one sample.

To assemble the team that was small and nimble enough to venture into the cave, he reached out to paleoanthropology researchers via social media. And once the data came in, he published it quickly, at the same time releasing the information to other researchers.

The implications for paleoanthropology as it was practiced were revolutionary in and of themselves, but the results were also ground-breaking. H. naledi stood at five feet or shorter. Their cranial capacities were 30% of those of modern humans. Meanwhile, their skeletal features were an assemblage of characteristics which seemed either very modern or very ancient. A simple role in a simple story did not present itself.

H. naledi reconstruction

This hominin confounded expectations. If the sample had been singular, no doubt there would have been skeptics. But Berger had the numbers, so the findings could not be denied. When the dates came back there was another shocker: H. naledi flourished a bit over 200,000 years ago. The reality, though, is that species invariably existed before and after the dates of particular remains. H. naledi almost certainly occupied the same landscape as early modern humans, who were developing within Africa 200-300 thousand years ago.

Meanwhile, far to the east, on the island of Flores, were the Hobbits: H. floresiensis, a diminutive hominin that flourished until modern humans arrived in the region more than 50,000 years ago.

H. floresiensis

The reason that naledi and the Hobbits are important is that they shatter our image of an ascending chain of evolution progressing from lower to higher. The reality that modern humans have genes from ancient Eurasian hominins, such as Denisovans and Neanderthals, also refutes a simple model whereby humans were born, they came, and they conquered. Humans were both the conquered and conqueror.

Hundreds of thousands of years ago our lineage was highly speciose, with many diverse branches. Modern genetic technology implies that human lineages branch and come back together again and again, like an eternal cycle. The proliferation of ancient remains that are startling in their novelty and shocking in their recency also suggests that the shift in human evolution from slouching, small-brained apes to tall, large-brained apes was not the only way to be human. After all, the largest-brained hominins of all were Neanderthals, who eventually merged back into the much more massive stream of African humans who are our primary forebears.

Maybe you have some Neanderthal or Denisovan in your DNA. Discover your story today with Neanderthal by Insitome.


The braided estuary of human evolution was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

April 11, 2018

The “g” in genes

Filed under: behavioral-science,Genetics,Psychology,science — Razib Khan @ 1:12 pm
What’s next?

Intelligence, or smarts, is one of those words which has many meanings. That’s why we say “street smart” or “book smart.” When psychologists speak of intelligence, however, they are usually referring to something more precise and specific. The image above is a sample of a question from the Raven’s Progressive Matrices test, which is “used in measuring abstract reasoning and regarded as a good non-verbal estimate of fluid intelligence.”

Fluid intelligence “is the capacity to reason and solve novel problems, independent of any knowledge from the past.”

When I was an undergraduate student, my physics professor would often assign problems on the exam which had no explicit corollaries in the problem sets or lecture. During one after-exam review session, one student brought this issue up, and the professor simply offered that given what we learned in the course, we should be able to “derive a method to solve a general class of problem.” I suspect from the distribution of scores that more than half the time a typical student couldn’t derive a method in the allotted period. I know this was often the case with me.

In relation to tests which measure one’s analytic skills or cognitive tasks like memory recall, researchers have found that outcomes are positively correlated. If you do well on one test of this sort, you tend to do well on another such test.

The variable which summarizes these correlations is termed the “general intelligence factor,” often just shortened to g.

When it comes to intelligence, this is what psychologists are really interested in — not the outcome on one specific test. General intelligence is the most distilled and reduced aspect of “book smarts” that psychologists have been able to construct.

So what good is it? More than half of the variation in academic achievement is predicted by variation in g. People in higher status and higher paying jobs tend to have higher general intelligence. And higher g also correlates with a longer lifespan. Because of these correlations it is no surprise that intelligence testing was originally used to identify children who were not performing as well as their peers, and see if they might benefit from special attention.

Not only does general intelligence correlate with many things in one’s life, there is also a correlation between parents and offspring. The most recent work suggests that about 50% of the variation in general intelligence in the population can be accounted for by variation of genes. That is, intelligence is 50% heritable.

Multivariate Gaussian distribution

The implication here is that though parents and children, or siblings, may exhibit a correlation, it is imperfect. The brilliant mathematician Carl Friedrich Gauss came from unexceptional parents, and his numerous descendants are not particularly exceptional. In contrast, the Bernoulli family were a literal mathematical dynasty.
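The imperfect correlation can be put in rough numbers with the standard quantitative-genetic prediction that offspring are expected to deviate from the population average by the heritability times the average deviation of their parents. This sketch assumes the 50% figure acts as additive ("narrow-sense") heritability, which is a simplification, and uses a hypothetical test scale with a mean of 100.

```python
def expected_offspring(midparent, population_mean, heritability):
    """Expected offspring value given the parents' average value."""
    return population_mean + heritability * (midparent - population_mean)


POPULATION_MEAN = 100.0  # hypothetical test scale
HERITABILITY = 0.5       # the 50% figure, assumed additive here

# Hypothetical parents who average 130 on this scale.
prediction = expected_offspring(130.0, POPULATION_MEAN, HERITABILITY)
print(f"expected offspring score: {prediction:.0f}")
```

Children of exceptional parents are expected to land about halfway back toward the average, and exceptional individuals can just as easily emerge from unexceptional families, so dynasties like the Bernoullis are the exception rather than the rule.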

For a complex trait which exhibits a distribution, there are many variables at play, and genes are just one of them. Because so many genes seem to control behavioral and cognitive traits, such as intelligence, until recently, we couldn’t pinpoint any specific region of the genome which impacted variation on these characteristics within the normal range.

With modern genomic methods, which survey variation across the whole genome across huge numbers of people, this is changing. For example, a new paper establishes links to variation in intelligence at over 500 genes! This is still a small number in the grand scheme of things, but whereas five years ago we didn’t know any genes associated with intelligence, today we know hundreds.

A “chip” which assesses thousands of genetic markers

Though indirect methods, such as comparing correlations within and across families, allow us to arrive at a 50% proportion for what is heritable in intelligence, known genomic variation only accounts for a few percent of this heritable component as of this writing. But within the year, it seems likely that the 10% barrier will be broken, and eventually we may know most of the genetic positions that account for the heritable component of intelligence within human populations.

Then the full story can begin to be told, because once we start to establish the boundaries of the genetic basis of intelligence, we can explore the environmental territory, which accounts for the other 50% of the variation in our intelligence.

Explore your Regional Ancestry story today.


The “g” in genes was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.
