Category Archives: Genetics

I noticed something interesting a few weeks ago in the supplements of the GenomeAsia 100K paper. Look at where the Toda are on the PCA. Now look at the Indus Valley samples I have…. I don’t have access to the Toda samples. But there’s a lot of evidence that this is a very unique … Continue reading The Todas are more like IVC people than anyone else

Sometimes people pass me data. Turns out Rajasthani Brahmins are quite different from UP Brahmins (more northwest-shifted). In this, they are like Pandits. In contrast, Bihar Babhans are just like UP Brahmins, who don’t seem to have much structure. Gujarati Brahmins are between South Indian Brahmins and North Indian Brahmins, and closer to the latter, … Continue reading The varieties of Brahmins (and others)

My previous post on Adivasis was not totally clear. So I’m going to try in shorter fragments and outline things so I’m more clear. I am not 100% correct with the model below (we’ll know more later), but this is my best current conception. 10,000 BC, end of the Ice Age, NW quadrant of the … Continue reading Adivasis are just like everyone else…sort of…but not

I got a few more samples with provenance. The Bengali Brahmins are shifted the way you would expect. The Bangladesh Kayastha (someone from a Hindu background) is in the cluster with generic Bangladeshis from Dhaka. The West Bengali Kayastha is far less East Asian. My current model is that the Kayasthas are basically … Continue reading Bangladesh and West Bengal genetics

There was some discussion online about variation among South Asians. I decided to compute a few pairwise Fst statistics (a measure of between-population variation) with some South Asian, European and East Asian populations (along with Iranians). I plot them below in two graphs. I also ran TreeMix. I don’t have any major conclusion, just draw your … Continue reading Genetic distances across the world

The Anglo-Saxon migration and the formation of the early English gene pool: The history of the British Isles and Ireland is characterized by multiple periods of major cultural change, including […]

Since David has not posted, here they are… The genetic history of the Southern Arc: A bridge between West Asia and Europe: By sequencing 727 ancient individuals from the Southern […]

In the year 2000, there was one single human genome. In 2010 there were fewer than 100 human genomes (you could look them up in a spreadsheet!). Today there are likely 1,000,000 human genomes. Good luck cataloging them all. Outside of the purview of our species, there are now efforts to sequence every animal on earth. And the sequencing revolution has not just changed our understanding of DNA, it has opened up the world of RNA to us, allowing scientists to track and trace gene expression in minute detail. Genomics is “eating biology.”

Whereas there was once a tiny data pond, today a substantial lake is swelling into a massive ocean. This is why we built GenRAIT: to help transition the burgeoning ecosystem of 21st-century genetics into the new age of genomics. Data offers the potential for insight and discovery. Data on life’s code, the genome, can potentially transform the future of human health outcomes. This makes “data” more than just a buzzword; it is the key to unlocking the potential for a better world. But the influx and quantity of genome data in our new era threaten to overwhelm the capability of scientists to manage, utilize, and harness it, putting the reality we wish to bring into being out of reach. We want to push beyond that impasse with GenRAIT, and unlock the potential future.

But what happens when the data is finally brought under control? Data without an end is without purpose. What might the genetic future look like? Why do we at GenRAIT care so much?

One generation ago sequencing one’s genome was “blue sky” science, whereas today it’s a consumer commodity. Companies like Nebula Genomics provide 30x high-quality medical-grade sequencing to consumers for $300 or less. With the average cost of health insurance for a family more than $1,000 a month, the cost of sequencing one’s genome is trivial. And whereas buying a car or other consumer item means acquiring a depreciating good, whose value declines over time, your genome sequence becomes more valuable as more research is published on the relationship between genetics and disease.

The more data you have in the pool the more results and findings you can obtain. Thirty years ago detecting a genetic variant that might cause a disease required tracking an inbred pedigree for decades. It was a project only viable for a hospital research group. But science moves forward. Fifteen years ago geneticists began to perform “genome-wide associations” that looked for common variation — those genes commonly causing disease within the population. This is the sort of result a company like 23andMe provides.

But there is more to the genome than things that are known and common. Many illnesses are caused by variations within families and narrow local lineages. If common variants are known unknowns, these are unknown unknowns, and only whole-genome sequences can give us insight. We have the technology, but we lack execution. Every individual’s joint medical and genomic information could be powerful, but only in the context of population-wide analysis of subtle but cumulatively significant patterns. You can only perceive the trees if you can see the forest. The value of one sequence goes up by orders of magnitude when you analyze it in the context of one billion sequences.

As we go into the 21st century, genomics will help us do more than diagnose and evaluate retrospectively. It will be essential to cure, treat, and anticipate the future. An individual’s genome can give doctors a map for tailoring an individualized healthcare plan. That same genome can be used to prescribe lifestyle changes to improve that person’s future well-being and increase their longevity, impacting morbidity and mortality. It can be used to conserve and save endangered species and help them evolve to better adapt to present and future environments. We now can imagine a future that can be edited and revised because of new technology.

In 2012 CRISPR genetic-engineering technology took the biological world by storm (and yielded Jennifer Doudna and colleagues a Nobel Prize), making gene-editing available to the broad masses of researchers. Though recombinant DNA technology has been utilized by scientists since the 1970s, it was a form of genetic engineering that was expensive and difficult to execute. CRISPR democratized genetic engineering, opening up the possibility that gene-editing could be a bespoke process, offering up the possibility of curing millions of people with congenital illnesses. Diseases like cystic fibrosis will likely be cured in the next twenty years through gene-editing technology.

Nevertheless, to get to that stage, we need the right environment in place to allow scientists to extract valuable information, patterns, and insights out of the genomes they receive. Before one can write to the genome, one must read the genome. Before one can develop engineering applications, one must master physics. We are already in the genomic age, as sequencing costs keep crashing and new technologies are on the horizon. But the flood of data threatens to overwhelm our capacity to use it rationally, intelligently, and effectively. As the NIH states, “Our ability to sequence DNA has far outpaced our ability to decipher the information it contains.” We must do better because the well-being of hundreds of millions is on the line. We have the data necessary to usher in a better future for healthcare and precision medicine. Now we need to unlock it.

Introducing GenRAIT to the post-genomic era: The human genetic map became reality in the first two decades of the 21st century. This was the dream of a century of genetics, laboriously tracing pedigrees across families decade after decade. But the combin…

The sequences of 150,119 genomes in the UK Biobank: We defined two other cohorts based on ancestry: African (XAF; n = 9,633; Extended Data Fig. 4) and South Asian (XSA; n = 9,252; Extended Data Fig. 5) (Fig. 3a–c). The 37,598 UKB individuals who do not belong to XBI, XAF or XSA were assigned to the cohort OTH (others). … Continue reading Thank God the British are working on South Asian genomics

A new paper on Southwest Indian genetics highlights the Toda sample from Genomes Asia. People in the comments of this weblog have asserted this small southern tribe may have the most “Indus Valley Civilization” ancestry in the subcontinent. This is perhaps an exaggeration, but, looking at the admixture plots the Toda clearly have hardly any … Continue reading The Toda are different

ArainGang has posted a pretty interesting map of various ancestry components in the subcontinent by population. It’s pretty good, especially for the south and west of the subcontinent. But, there is something weird going on in the northeast: a lot of these populations have “Ancestral Indian” (Andamanese) ancestry but hardly anything else East Asian. This … Continue reading Global 25 is good, but a minor issue

Over at my Substack, Iberia: Ancient Europe’s Edge of the Earth (part 1) – Unpacking prehistoric Spanish and Portuguese genetics elicited a comment from Walter Bodmer questioning the representative of […]

Nick Patterson has responded on his Substack to the NYRB piece Why Biology is not Destiny, which itself is an attack on Kathryn Paige Harden’s book The Genetic Lottery. Patterson […]

In the recent film The Northman the protagonist, Amleth, has a romantic relationship with a woman, “Olga of the Birch Forest.” Amleth was a Viking who raided Kievan Rus, and […]

Today is “DNA Day,” so I checked the Nebula Genomics website to see if there was a deal. I got the 30x whole genome sequencing for $199 plus a $24.99/month subscription. The deal is […]

Ancient genomes reveal origin and rapid trans-Eurasian migration of 7th century Avar elites: The Avars settled the Carpathian Basin in 567/68 CE, establishing an empire lasting over 200 years. Who they […]

I was working on a project and decided to check Gujus. A few things: 1) A few years ago a Bohra emailed me, kind of irritatingly, saying I underestimated the non-South Asian ancestry in Bohras. I double-checked and that seems plausible. Looking at this Bohra Patel sample I have, that seems to be clear. 2) … Continue reading Gujurati genetics

Genetic affinities and adaptation of the South West coast populations of India: Evolutionary event has not only transformed the genetic structure of human populations but also associated with social and cultural transformation. South Asian populations were formed as a result of such evolutionary events of migration and admixture of genetically and culturally distinct groups. Most … Continue reading The southwestern groups in the Indian subcontinent are enriched for “Middle Eastern” ancestry

Razib Khan