Razib Khan One-stop-shopping for all of my content

May 1, 2019

Privacy in a social genomic age

Filed under: Genetics,Genomics,Privacy — Razib Khan @ 10:56 am

I recently had a long conversation with Veritas Genetics’ Rodrigo Martinez for an episode of The Insight, our podcast on genetics and evolution. One of his major arguments is that we are entering into the age of the social genome.

And the numbers don’t lie. There are more than 30 million Americans who have been genotyped in the consumer sector as of this writing, and Rodrigo contends that within two years his company alone will have sequenced more than one million Americans!

Sequencing is different from genotyping. Instead of looking at hundreds of thousands of markers in a genome of billions, you read the entire sequence.

We are fast approaching total information awareness in the genomic space for a large fraction of Americans. This brings me to a new story in Wired, The US Urgently Needs New Genetic Privacy Laws:

The rise of DNA data has legal experts increasingly concerned that the United States is not effectively protecting consumers from the many privacy risks that now loom before them. “What in heaven’s name is the law in genomics? That is not that easy to answer,” Susan M. Wolf told an audience gathered last Thursday at the University of Minnesota, where Wolf is a professor of law and health policy. “We’ve got 50 states. We’ve got multiple federal agencies involved.” The patchwork of laws means that in practice genetic anonymity is almost never guaranteed. But the legal landscape is so fractured that to fix this situation, the first issue is to resolve what rules apply to what data.

The piece discusses the broader scientific, policy, and current-affairs angles of genomics and how it relates to our personal information. Though the current wave of discussion was set off by the forensic revolution that followed the identification of the Golden State Killer with public databases, people working within genomics have warned for years that the exponential growth of the field was going to necessitate a reckoning.

A major problem in understanding genomic privacy and dealing with it through legislation is that the United States of America has a patchwork of federal and state laws, and genomics as a field is changing so fast that it is hard to keep up with the areas that legislation might need to address. The federal Genetic Information Nondiscrimination Act of 2008 was useful in its time, but it clearly did not anticipate the reality of a world where everyone can be identified through matches within relative databases.

In the Wired piece, one issue that crops up is that it is important not to treat DNA as if it is special or distinct from general privacy considerations. DNA is important, but it is not magic.

The ubiquity of genomic information must be accompanied by its demystification, and integration into the panoply of personal information which is informative, but not determinative.

Rather than specific laws tailored to the genome, Americans need to focus on the broader issue of privacy. Your credit score is as important to your life as your genome at the end of the day, after all.

Privacy in a social genomic age was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

February 6, 2019

Dreaming of billions of genomes

Filed under: Genetics,Genomics,Privacy — Razib Khan @ 8:43 pm

In the year 2000 scientists finished the draft of the complete human genome. The “reference” for what came after. Even ten years earlier some researchers were questioning the feasibility of any such project! In the early 1990s, many assumed it would be many decades before the first human genome was mapped. What changed?

Technology invaded science. The first human sequence cost three billion dollars. Today one can be had for $1,000. In other words, a genome was three million times more expensive just 20 years ago.

[Image: An Illumina sequencer]

Instead of the laborious process of tracing inheritance patterns through visible markers, modern genomics utilizes the molecular nature of DNA to enable automation and computation to “read” the full sequence. In less than 20 years we’ve gone from a single human genome sequence to hundreds of thousands of whole genome sequences, and tens of millions of samples which have undergone high-density genotyping using “SNP-array” technology.

Though the human genome is three billion bases, only a small proportion of it codes for genes, and an even smaller proportion holds any variation of interest in a population genetic sense. The millions of genotypes in the databases of private consumer genomic firms may only capture a small number of genetic positions, between 100,000 and one million, but this small number is enough to draw many important conclusions. In particular, it can indicate which common diseases you are at risk for, what part of the world your family is likely from, and who your relatives are.

In other words, probably 90% of the things you would want to know about your genetics can be inferred from 0.03% of your whole genome! Today private companies are sitting atop a pot of potential gold because the genome doesn’t change over your lifetime. It is only an appreciating asset as time progresses, as more research unveils details of mechanism and associations.
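For readers who want to check the back-of-the-envelope figures in this post, here is a minimal sketch in Python. It uses only the round numbers quoted above (a three-billion-base genome, a three-billion-dollar first sequence, a $1,000 genome today, and 100,000 to one million genotyped positions). The function names are my own, chosen purely for illustration.

```python
# Round figures quoted in the post (approximations, not precise measurements).
GENOME_BASES = 3_000_000_000        # approximate length of the human genome
FIRST_GENOME_COST = 3_000_000_000   # dollars, first human sequence
CURRENT_GENOME_COST = 1_000         # dollars, a genome at the time of writing


def cost_ratio(old_cost, new_cost):
    """How many times more expensive the old price was than the new one."""
    return old_cost / new_cost


def percent_of_genome(markers, genome_bases=GENOME_BASES):
    """Share of the genome, in percent, covered by a given number of markers."""
    return 100 * markers / genome_bases


if __name__ == "__main__":
    ratio = cost_ratio(FIRST_GENOME_COST, CURRENT_GENOME_COST)
    print(f"Cost ratio: {ratio:,.0f}x")

    # Low and high ends of the SNP-array range mentioned in the post.
    for markers in (100_000, 1_000_000):
        share = percent_of_genome(markers)
        print(f"{markers:,} markers = {share:.4f}% of the genome")
```

The 0.03% figure corresponds to the high end of the range, one million markers. At 100,000 markers the share is ten times smaller.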

You are being watched!

Within twenty to thirty years it is likely that a billion human genomes will be sequenced. The field will have fully transitioned from basic science to information technology. And as with any information technology, privacy and data sharing will be important things to consider. It is likely that some governments, like that of China, will have total access to their citizens’ data, while others, such as those of the European Union, will limit access.

But even without top-down invasion of privacy, the proliferation of databases and sequences will mean that one’s genetic information will be shared like credit scores across vendors. And just like with credit scores and histories, there will be data breaches. And while credit scores are ephemeral, your sequence is permanent.

Total strangers may have access to your disease risks, your relatives, and your heritage. Things which today are guarded privately may become totally transparent to anyone who wants to look unless precautions are taken.

The decisions we make today will have consequences for future generations. This applies to individuals, corporations, and the government.

Interested in learning where your ancestors came from? Check out Regional Ancestry by Insitome to discover various regional migration stories and more!

Dreaming of billions of genomes was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

The Insight Show Notes — Season 2, Episode 13: Is the FBI Watching Your DNA?

Filed under: Genealogy,Genetics,Privacy — Razib Khan @ 2:30 pm

This week on The Insight (Apple Podcasts, Stitcher and Google Podcasts) we discussed the controversy that has erupted around Family Tree DNA and genetic privacy. We talked to Judy Russell of The Legal Genealogist and genetic genealogist Debbie Kennett, both longtime observers of the industry and science.

BuzzFeed broke the story. Then Bennett Greenspan of Family Tree DNA responded to their customers. Eventually, the story spread to The New York Times.

The CEO of Family Tree DNA, Bennett Greenspan, is a genetic genealogist himself. His company began as a way to pursue his passion after other successes in business. In this video he talks about genealogy and DNA.

This episode is a follow-up in many ways to the Golden State Killer incident.

Because of the international scope of this industry, we extensively discussed the GDPR, the European General Data Protection Regulation.

The Insight Show Notes — Season 2, Episode 13: Is the FBI Watching Your DNA? was originally published in Insitome on Medium, where people are continuing the conversation by highlighting and responding to this story.

May 20, 2018

The end of the century of privacy

Filed under: Privacy,Urbanism,Urbanization — Razib Khan @ 10:40 pm

Reading The Rise and Fall of American Growth: The U.S. Standard of Living since the Civil War has made me think more about the unique nature of the urban civilization of the long 20th century. The expansion of public health, in particular the provision of clean water, meant that for the first time in the history of the world you had a situation where people in cities actually had a higher life expectancy than those in rural areas. Prior to this cities were demographic sinks. We have data from the 19th century which makes it clear that morbidity was higher for city dwellers. This is probably the major reason, in my opinion, the cosmopolitan worlds of antiquity had such a marginal demographic impact: the culturally vibrant city-dwellers who dominated Classical civilization politically and socially didn’t leave many descendants.

Even though cities were dominant politically and central to many earlier societies, only in the last century or so have predominantly urban societies emerged. Before that most humans lived in villages or in hunter-gatherer bands. Everyone was in everyone else’s business. Anonymity was simply not a thing for most humans in most periods of our species’ history.

This changed with the rise of cities. In the early 1990s the anthropologist Robin Dunbar argued that people could maintain ~150 genuine social relationships in their mind. This is Dunbar’s number. In the decades since, there have been lots of arguments about Dunbar’s number. One can stipulate that the value may not be 150. Additionally, it seems likely that some people have a higher Dunbar’s number than others. But the general point that human social competencies have a ceiling value seems to be right.

And that ceiling is smaller than the number of people who live in close proximity to each other in cities. The potential facelessness of your neighbors in a city, along with its diversity and cosmopolitanism, is one reason that it was in cities that written laws displayed in public places emerged as a custom. Societies not bound together by social interaction and kinship needed abstractions which could scale. Laws, kings, and religions are just some of the cultural inventions that were essential to maintain order in a city where strangers interacted daily.

But were these cities really incubators for anonymity? I would argue that the premodern city offered far less anonymity, and therefore privacy, than the modern city. Premodern cities were dense, due to limitations in transportation. They were defined by neighborhoods. Additionally, economic activities in cities were often defined by relationships between people, whether it be between a patron and an artisan, or members of a cooperative guild. In some ways, premodern cities were a collection of villages.

What defined the 20th century was the rise of massive corporations that rationalized economic consumption and production. The supermarket is cheaper than your local green-grocer, but there is also less of a personal relationship between you and the supermarket staff. Similarly, they may not know who you are. Rather than having economic relationships directly to other people, you have an economic relationship with an institution, which acts as an intermediary.

By the second half of the 20th century, individuals in cities could be totally self-sufficient and isolated from other human beings if they so chose when it came to personal relationships. The rationalization of modern life made deep human interaction a choice, and to some extent, privacy was the default state.

The rationalization of economic relations continues. But over the last 20 years, and especially the last ten or so, the default state of privacy has disappeared. If you know someone’s name you can usually find their age, where they have lived their adult life, who they lived with, and who their relatives are. Websites like Zillow can tell you their home-value or when/if they bought their home and for how much. Facebook, Twitter, and other social media make it so you can find out many things about a person.

Recently a friend of mine who became newly single after ten years in a relationship decided to try out online dating (for the first time). One thing he found is that you have to assume that your matches may have Googled you beforehand (presumably this depends on whether the site gives your full name or not). If you are too shy to talk to your neighbors, just look up who lives at the various addresses around you. Once you have their names you can find out everything else.

Obviously, modern information technology doesn’t make it so that we live in a premodern village. But it does mean that the faceless anonymity enabled by rationalized modern economics and socio-political systems is stripped away. In its place, you become a set of values for various parameters (age, income, political orientation, geographical mobility). You don’t know people in a tacit and natural manner, you know them through their data.

Whereas the political and social views of most employees of a corporation were out of view in the 20th century, today many companies are snooping around in Facebook feeds and doing simple background checks. You may not have a personal relationship with a large company, but it has a relationship with the data that it defines you by.

The 20th century was the century of privacy because the machinery of information distribution appropriate to hunter-gatherers and villages did not scale to cities. And 20th-century technology never caught up to the scale of the cities and economies of that period in terms of distributing information. As the 21st century proceeds, it seems that information technology is finally now in place.

November 30, 2010

The naked years: the end of privacy

Filed under: Privacy,Technology,Transparent Society — Razib Khan @ 2:59 am

I do talk periodically on this weblog about the coming ‘transparent society.’ The main reason I bring up the issue is that I think it is probably inevitable, and, I think we’re sliding toward it without even reflecting on it too much. Many people are very surprised at how little time it takes to find information on them in Spokeo and Pipl. Curious about where someone you lost touch with from high school has lived? Go to Intelius.

Rereading David Brin’s original 1996 essay introducing the idea in Wired, I’m struck by the fixation on old-fashioned cameras. To me, what people do is almost less interesting than what they’ve done. How much did they buy their house for? Where did they go to university? Did they graduate? Who did they marry? Interestingly, much of this information is offered up freely by the individuals themselves.

And yet what about our genetic code? With the recent 23andMe sale (which continues on, with provisions) I noticed people on Facebook worrying about privacy. Interestingly WikiLeaks has revealed that American diplomats were encouraged to obtain the DNA of foreign notables. Why would they do this? My first thought was that perhaps it would be an easy way to blackmail powerful cuckolds! Though this didn’t seem to cramp Adnan Khashoggi’s style. I assume that powerful individuals don’t have to worry about divulging their disease risks, since they’ll be taken care of. But the reality is that the science is simply not there for a great deal of return when it comes to risk variants. Below is a screenshot of my risks for various diseases from 23andMe as judged from a few single nucleotide variants:


First, these are risks assuming a European genetic background. Which I don’t have. So there’s a problem right there, but 23andMe helpfully notes this boldly if you click through. But setting that aside, I know my risks for Type 2 Diabetes are much greater than average. Why? I have a family history of the disease! That’s why I’m obsessed with visceral fat.

The point is that right now family history is a much better predictor of your risks of a given disease than anything else. Not only does this capture missing heritability, but there is a natural correlation between families and environmental risk factors (or lack thereof). Using the breast cancer risk assessment tool it seems that if you have one first-degree relative who has had the disease you double your own odds of coming down with it over a five-year period (though the risks over any given five-year period are still low). There has been a lot of warranted attention paid to the BRCA genes, but what about the ability of insurers to digitally analyze the obituaries of your relatives and predict your own probability of death and disease?

I’m not saying that one shouldn’t be worried about divulging one’s genetic data. But it’s only a small piece of the puzzle of what we’re losing.

June 24, 2010

You have no privacy, deal with it

Filed under: Culture,Facebook,Privacy,Technology — Razib Khan @ 11:36 am

The Washington Post’s blogger-journalist Dave Weigel has a post up where he preemptively apologizes for stuff he posted on an “off-the-record” e-list. Extracts are going to be published by a gossip site. Journalists are the tip of the iceberg; privacy is fast becoming a total fiction, remember that. We’re slowly drifting toward David Brin’s model of a “transparent society”, but it’s happening so fluidly that people aren’t even noticing. And yet as I have noted before, people are resisting the push to merge all their personas into one. Interesting times.
