May 23, 2017

The co-location of the creative class

I have never read Richard Florida’s The Rise of the Creative Class. There are a few reasons for this. First, his thesis was so ubiquitous in the 2000s that a distillation was easy to get for free. People would write about Florida’s ideas, and he’d talk about them constantly in interviews.

Second, what was true about his model struck me as obvious, and what was not made it less significant. Single professionals in the knowledge economy are not by and large young Mormons who marry young and want to move into spacious tract homes in the suburbs as soon as possible. Rather, they will spend a significant part of their 20s and 30s expending and consuming in and around large cities, and want to be in circumstances where they can meet other like-minded people in similar situations.

But you can’t just create these cities by putting up bike paths. There’s no easy way to mimic what occurred in Silicon Valley. The Valley has a combination of structural (Stanford is there) and contingent factors (California has no non-compete clauses) that help it. It may be that in the modern world there are actually greater returns to locating in an ideapolis which is more expensive.

I recently had a conversation with a friend about academic research. What proportion in a given field of novel and innovative findings are produced by the top 10 institutions? The percentage varies by field and person to person, but I’ve gotten numbers ranging from 40 to 90% from people in different fields. In other words, research productivity is described by a power law as a function of institution.

Similarly, there will be one Silicon Valley, and everyone else will have the scraps. Information technology has not made the landscape flat. The visceral and concrete aspect of “being there” is even more of an advantage in a world where everyone is accessible via email and social media.

This article in Wired, Can the American Heartland Remake Itself in the Image of Silicon Valley? One Startup Finds Out, makes it pretty clear that it doesn’t matter how cool Denver is, it’s going to be hard if you aren’t in Silicon Valley.

The importance of geography and co-location is why I propose that a project for the 21st century should be the construction of a massive arcology between Long Beach and San Diego. Perfect weather on the coast and mountains inland. Aim to house the entire United States population there to start out with.

September 15, 2012

Who tolerates anti-American preaching from Muslims?

Obviously the news over the past week has been filled with the events in the Middle East, and the broader Muslim world, in reaction to an anti-Muslim film. I think the most eloquent commentary is from The Onion (NSFW!!!), No One Murdered Because Of This Image. That being said, there are some serious broader issues here. A friend of mine who lives in India (he is Indian American, though raised for several years in India, so not totally unfamiliar with the culture) has expressed to me his frustration with having to defend American liberalism in a society where American liberalism is an abstraction, rather than concrete. The frustration has to do with the fundamental divergence in basic values. For example, his interlocutors have argued to him (he is a practicing Christian of libertarian political orientation) that if someone committed an act of blasphemy against his faith of course he would react in anger and violence. And yet of course the clause "and" is false, though he is greeted with skepticism when he asserts he wouldn't react violently. As a matter of fact I can attest to the reality that he wouldn't react angrily necessarily, because in interactions where I've

September 8, 2012

The Europes

Planet Money recently did a report on the difficulty of maintaining high economic productivity in southern Italy. I won’t rehash the specifics of the story, but, I think it is important to get a visual sense of just how large the contrast between the south and north of Italy is. Too often we speak of nation-states. Nation-states are real, and they are important, but they are often not comparable. Just like comparing the USA to Sweden is only marginally informative, so comparing a small nation like Ireland to a more substantial one like Italy is deceptive. Here is a 2008 regional GDP map with sub-national breakdowns. Though some of the values are certainly lower now (basically, everything outside of Germany and Sweden), the relationships still hold.

There has been a gap between the north and south of Spain and England, as well as the west and east of Germany, but none of these are of the same magnitude as what you see in Italy (for one, southern Italy is much more populous than eastern Germany). Sicily and the southern provinces are the poorest regions of western Europe. In contrast, the area

August 3, 2011

The fall of empires as an exponential distribution

I was alerted to Samuel Arbesman's new paper, The Life-Spans of Empires, by the fact that he pointed to his research on his weblog. Interestingly I'm not the only one who was interested, as after I pointed to it on my link round up a few people asked if they could get a copy of the paper (yes, I almost always send papers if I have access). Luckily it's a nicely elegant piece of work, basically quantifying what we've already probably known qualitatively. There isn't that great of a value-add to quantification as such, but with a mathematical understanding of a topic one can engage in an algebra of mental manipulations so as to construct models with which one can project other facts. Quantitative information is often an excellent way to generate "free information" from theoretical models. The figure above is the primary result of the paper. Basically Arbesman took a data set which was lying around which measured the lengths of various empires (N = 41), and showed that the rise and fall of these political entities tends to follow an exponential distribution: $e^{-\lambda t}$.
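To make the exponential claim concrete, here is a minimal sketch of what fitting such a model involves. It is not Arbesman's code or data: the function names and the durations are placeholders invented for illustration. For an exponential model the maximum-likelihood rate λ is simply the reciprocal of the mean duration, and the chance of lasting longer than t is exp(−λt).

```python
import math

def fit_rate(durations):
    """Maximum-likelihood rate for an exponential model: 1 / mean duration."""
    if not durations or any(d <= 0 for d in durations):
        raise ValueError("durations must be a non-empty list of positive numbers")
    return len(durations) / sum(durations)

def survival(t, rate):
    """Probability that an empire lasts longer than t: exp(-rate * t)."""
    return math.exp(-rate * t)

# Placeholder durations in centuries, invented for illustration only.
# These are NOT the N = 41 empires from the paper.
example_durations = [1.0, 2.0, 3.0, 2.0, 2.0]

rate = fit_rate(example_durations)      # 5 empires / 10 centuries
half_life = math.log(2) / rate          # time by which half are expected to have fallen
```

One consequence of this model is that it is memoryless: the chance of surviving another two centuries is the same whether the empire is one century old or five.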

June 27, 2011

First Farmers Facing the Ocean

The image above is adapted from the 2010 paper A Predominantly Neolithic Origin for European Paternal Lineages, and it shows the frequencies of Y chromosomal haplogroup R1b1b2 across Europe. As you can see as you approach the Atlantic the frequency converges upon ~100%. Interestingly the fraction of R1b1b2 is highest among populations such as the Basque and the Welsh. This was taken by some researchers in the late 1990s and early 2000s as evidence that the Welsh adopted a Celtic language, prior to which they spoke a dialect distantly related to Basque. Additionally, the assumption was that the Basques were the ur-Europeans, descendants of the Paleolithic populations of the continent both biologically and culturally, so that the peculiar aspects of the Basque language were attributed by some to its ancient Stone Age origins.

As indicated by the title the above paper overturned such assumptions, and rather implied that the origin of R1b1b2 haplogroup was in the Near East, and associated with the expansion of Middle Eastern farmers from the eastern Mediterranean toward western Europe ~10,000 years ago. Instead of the high frequency of R1b1b2 being a confident peg for the

May 24, 2011

Iran, Iraq, Syria, Turkey

Since most international migration is apparently between “developing nations”, I thought the Iran-Iraq-Turkey-Syria border would be interesting to look at in terms of differences in economic and social indices.

April 19, 2011

Europeans as Middle Eastern farmers

The Pith: Over the past 10,000 years a small coterie of farming populations expanded rapidly and replaced hunter-gatherer groups which were once dominant across the landscape. So, the vast majority of the ancestry of modern Europeans can be traced back to farming cultures of the eastern Mediterranean which swept over the west of Eurasia between 10 and 5 thousand years before the present.

Dienekes Pontikos points me to a new paper in PNAS which uses a coalescent model of 400+ mitochondrial DNA lineages to infer the pattern of expansions of populations over the past ~40,000 years. Remember that mtDNA is passed just through the maternal lineage. That means it is not subject to the confounding dynamic of recombination, allowing for easier modeling as a phylogenetic tree. Unlike the autosomal genome there's no reticulation. Additionally, mtDNA tends to be highly mutable, and many regions have been presumed to be selectively neutral. So they are the perfect molecular clock. The straightforward drawback is that the history of one's foremothers may not be a good representative of the history of one's

February 24, 2011

A mental map of the world

One of the major issues in our world today is that we’re a people of specialties. This means that we don’t have basic interpretative frameworks in which to place novel facts. Because of the abstruse and formal nature of the discipline, this is probably starkest in the domain of science, but it is not restricted to only science. Consider geography. In many ways this is “low hanging” cognitive fruit in the shallow part of the learning curve which mostly consists of assembly of facts, but because of the shifts in emphases in American education geography has tended to get short shrift. This means that whenever there’s a foreign policy crisis middle-brow journals of record such as The New York Times have to commission pieces about nations such as Libya which read like a “first book” for six year olds on that nation (and on political weblogs commenters proudly brandish their “first book” level of knowledge).

But a bigger general issue seems to be in relation to climate. "Climate Change" is in the news constantly, but the average person on the street seems to have zero historical perspective on events such as the Medieval Warm Period, the Little Ice Age,

February 10, 2011

Swedes not so homogeneous?

Credit: David Shankbone

The more and more I see fine-scale genomic analyses of population structure across the world the more and more I believe that the "stylized" models which were in vogue in the early 2000s which explained how the world was re-populated after the last Ice Age (and before) were wrong in deep ways. I'm talking about the grand narratives outlined in works such as Bryan Sykes' The Seven Daughters of Eve, the subtitle of which was "The Science That Reveals Our Genetic Ancestry." If I had less faith in science to always ultimately right its course I'd probably become a post-modernist type who asserts that all these stories are fictions. Sykes' model in particular seems to be very likely incorrect because of the utilization of ancient DNA to elucidate population movements past in Europe. From what we can gather it looks like coarse attempts to infer past distributions from current distributions (of specific lineages and their diversity) resulted in a great deal of false clarity. We're not talking differences on the margins, but fundamental confusions. For example, Basques were always assumed to be a viable

February 8, 2011

Dodecad open for submissions

Since I know plenty of friends are getting, or just got, their V3 results, I thought I’d pass this on, Open-ended submission opportunity for 23andMe data (#2):

Who is eligible

Everyone who is of European, Asian, or North African ancestry and all four of his/her grandparents are from the same European, Asian, or North African ethnic group or the same European, Asian, or North African country.

Also, Zack has more than 30 individuals in HAP. The “cow belt” is still way underrepresented. The only Bengalis in the data set are my parents.

January 27, 2011

American history in broad strokes

A comment below inquired about “good books” on American history. Unfortunately I don’t know as much about American history as I do about Roman or Chinese history. But over the years there have been several books which I find to have been very value-add in terms of understanding where we are now. In other words, these are works which operate with a broader theoretical framework, and aren’t just a telescope putting a spotlight on a sequence of facts.

Albion’s Seed. I read this in 2004, and it was a page turner.

The Cousins’ Wars. I had thought of Kevin Phillips as a political writer, but this was a very engaging and deep cultural history. My prejudice resulted in my not reading this until 2009.

What Hath God Wrought. This book focuses on the resistance of the Whigs and Greater New England to the cultural ascendancy of the Democrats and their “big-tent” coalition which included most of the South, the Mid-Atlantic, and much of the “Lower North” (e.g., the “butternut” regions of the Midwest settled from the Border South).

The Rise of American Democracy. This is a good complement to the previous book, in that it takes the “other side,” that of the Democrats. In many ways this is the heir to Arthur Schlesinger’s Age of Jackson.

Throes of Democracy. A somewhat “chattier” book than the previous ones, it is still an informative read. It covers a period of history with the Civil War as its hinge, and so gives one the tail end of the Age of Sectionalism.

Freedom Just Around the Corner. By the same author, but covering a period of history overlapping more with Albion’s Seed.

The Age of Lincoln. This is not a “Civil War book.” It is of broader scope, though since the war is right in the middle of the period which the book covers it gets some treatment. I’d judge this the “easiest” read so far of the list.

Replenishing the Earth. This is about the Anglo world more generally, but it is nice to plug in America into a more general framework. North America is not sui generis.

The English Civil War. This is obviously not focused on America, but it is a nice complement to Albion’s Seed, as it shows the very deep roots of the division between two of America’s folkways. The Cousins’ Wars serves as a bridge between the two, shifting as it does between both shores of the Atlantic.

I’m game for recommendations! I had a relatively traditional education in American history, and did very well in my advanced courses, but I knew very little before I read books like this.

December 31, 2010

Mapping the “Green Sahara”

Guelta d’Archei, Chad. Credit: Dario Menasce.

Everyone who is literate knows that the Sahara desert is the largest of its kind in the world. The chasm in cultural, biological, and physical geography is very noticeable. Northern Africa is part of the Palearctic zone, while the peoples north of the Sahara have long been part of the circum-Mediterranean population continuum. The primary continuous habitable corridor is that of the Nile valley. And yet scholars have long known that there has been variation in the climatic regime of the Sahara. The pharaohs of ancient Egypt seem to have hunted a wider range of fauna than is to be found in the deserts surrounding the current Nile valley, perhaps relics from a more humid period. Rock art in some regions of the desert depicts aquatic life, and species more characteristic of the savanna. And yet we should not think of the Sahara as a recent phenomenon; it does seem to be geologically ancient, despite periodic humid interregnums.

A new paper in PNAS attempts to map the hydrography of the Sahara over the Holocene, as well as back to the Pleistocene. The ultimate aim seems to be to better frame the geographic constraints on the expansion of humanity from its African homeland, and refute a simple projection from the present to the past. In this case, it is the existence of the Nile as a verdant and habitable watercourse which connects the north and south, and bisects the continuous desert. Ancient watercourses and biogeography of the Sahara explain the peopling of the desert:

Evidence increasingly suggests that sub-Saharan Africa is at the center of human evolution and understanding routes of dispersal “out of Africa” is thus becoming increasingly important. The Sahara Desert is considered by many to be an obstacle to these dispersals and a Nile corridor route has been proposed to cross it. Here we provide evidence that the Sahara was not an effective barrier and indicate how both animals and humans populated it during past humid phases. Analysis of the zoogeography of the Sahara shows that more animals crossed via this route than used the Nile corridor. Furthermore, many of these species are aquatic. This dispersal was possible because during the Holocene humid period the region contained a series of linked lakes, rivers, and inland deltas comprising a large interlinked waterway, channeling water and animals into and across the Sahara, thus facilitating these dispersals. This system was last active in the early Holocene when many species appear to have occupied the entire Sahara. However, species that require deep water did not reach northern regions because of weak hydrological connections. Human dispersals were influenced by this distribution; Nilo-Saharan speakers hunting aquatic fauna with barbed bone points occupied the southern Sahara, while people hunting Savannah fauna with the bow and arrow spread southward. The dating of lacustrine sediments show that the “green Sahara” also existed during the last interglacial (∼125 ka) and provided green corridors that could have formed dispersal routes at a likely time for the migration of modern humans out of Africa.

This paper was written before the Denisovan admixture results shifted the necessity to genuflect so explicitly to Out of Africa. But its results are interesting nonetheless, since they don’t depend too deeply on a paleoanthropological model. Rather, by surveying biogeography and geologic data they produce a sense of how the Sahara exhibited climatic flux over the past 100,000 years as a function of time and space. The latter is important because the Sahara is not an amorphous sandy waste. Rather, it exhibits a great deal of topographical variation:

Credit: T L Miles

In the Tibesti mountains the highest peaks are ~11,000 feet above sea level (3,400 meters). Because of the aridity of the Sahara in general even these elevations do not induce sufficient precipitation to produce a “green mountain” effect, common in other arid parts of northern Africa and Arabia. But in a regime of only slightly higher precipitation and milder temperatures (remove 3 degrees Fahrenheit per 1,000 feet against latitude controlled sea level temperature) one can imagine the Tibesti having been much more biologically productive in the past. Consider this from the Tassili n’Ajjer region of southern Algeria:

Because of the altitude and the water-holding properties of the sandstone, the vegetation is somewhat richer than the surrounding desert; it includes a very scattered woodland of the endangered endemic species Saharan Cypress and Saharan Myrtle in the higher eastern half of the range.

The range is also noted for its prehistoric rock paintings and other ancient archaeological sites, dating from neolithic times when the local climate was much moister, with savannah rather than desert. The art depicts herds of cattle, large wild animals including crocodiles, and human activities such as hunting and dancing….
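The rule of thumb mentioned for the Tibesti, roughly 3 degrees Fahrenheit of cooling per 1,000 feet of elevation, is easy to turn into arithmetic. The sketch below is only that back-of-the-envelope rule; the function names and the 100 °F sea-level figure are my own placeholders, not values from the paper.

```python
LAPSE_F_PER_1000_FT = 3.0  # rule of thumb from the text above

def cooling_at(elevation_ft, lapse=LAPSE_F_PER_1000_FT):
    """Degrees Fahrenheit lost relative to sea level at a given elevation."""
    return lapse * elevation_ft / 1000.0

def temperature_at(sea_level_temp_f, elevation_ft, lapse=LAPSE_F_PER_1000_FT):
    """Estimated temperature at elevation, given the sea-level temperature."""
    return sea_level_temp_f - cooling_at(elevation_ft, lapse)

# The highest Tibesti peaks are ~11,000 feet.
tibesti_cooling = cooling_at(11_000)

# Hypothetical sea-level temperature of 100 F, chosen only for illustration.
tibesti_summit_temp = temperature_at(100.0, 11_000)
```

At ~11,000 feet the rule implies summits on the order of 30 degrees Fahrenheit cooler than the surrounding lowlands, which is why a modest rise in precipitation could matter so much there.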

The main thrust of the paper seems to be to refute the common assumption that an eternal Nile served as the north-south corridor for African fauna, including humans. Here is the reason:

Reanalysis of the Saharan zoogeography…suggests that many animals, including water-dependent creatures such as fish and amphibians, dispersed across the Sahara recently. For example, 25 North African animal species have a spatial distribution with population centers both north and south of the Sahara and small relict populations in central regions. This distribution suggests a trans-Saharan dispersal in the past, with subsequent local isolation of central Saharan populations during the more recent arid phase. If a diverse range of species (including fish) can cross the Sahara, it is impossible to envisage the Sahara functioning as barrier to hominin dispersal. The zoogeography of the Nile suggests that it was a much less effective corridor…Only nine animal species that occupy the Nile corridor today are also found both north and south of the Sahara….

There are also isolated pieces of evidence which refute a Nile-only model: Saharan oases which have endemic species of crocodiles. The existence of “desert crocodile” populations is a signal of a more well-watered past, with a subsequent retreat into isolated oases (some of these populations did go extinct in the 20th century though). In some ways this is a problem. Simple models make simple predictions, and are easier to test. But if simple models are false, that is an even greater problem.

Here are the figures which outline the primary results from geology and biogeography:

There are two primary inferences made in regards to humans:

1) The Holocene inference seems to be that Nilo-Saharan populations have their origins in the societies which expanded north and south along the liminal zone of the Sahara. The authors argue that Nilo-Saharan populations on isolated oases in the northern Sahara are relics from the past expansion in the early Holocene. This sounds plausible, but it would be nice to explore this in more depth via linguistic and genetic analysis. With the rise of the camel and Islam a trans-Saharan trade in humans may have resulted in a great deal of trans-location of whole populations from one area to another. Concurrent with the Nilo-Saharans who pushed north the authors also suggest that savanna hunters moved south. I am not clear who these people are from the paper, and the mapping between archaeology and linguistics here seems more tentative.

2) A deep history inference also seems to be that trans-Sahara population movements were feasible in a period around 120,000-100,000 years BP, but not 50,000-60,000 years BP. The distinction here matters because the latter is a relatively young age for the Out of Africa migration, while the former is an older one. If the latter view is correct then the only plausible route of migration is probably the coastal fringe of the Horn of Africa. If the former view is correct then a whole host of possibilities confront us, because the hydrography of the Sahara may have been constrained, but there were several avenues of migration.

In regards to #2, a clement phase, and then resealing of the genetic barrier, may align well with recent models which posit a non-trivial period of separation between Africans and non-Africans after the Out of Africa migration. In other words early modern humans may have followed the pattern of many species, with  an expansion into, and beyond, the Sahara, and then a subsequent separation of two populations by a resurgent desert. The difference is that the daughter population isolated on the far side of the desert eventually “broke out” from the margins of the African homeland to the rest of the world.

Citation: Drake NA, Blench RM, Armitage SJ, Bristow CS, & White KH (2010). Ancient watercourses and biogeography of the Sahara explain the peopling of the desert. Proceedings of the National Academy of Sciences of the United States of America PMID: 21187416

December 14, 2010

Re-visualizing European ancestry

I decided to take the Dodecad ADMIXTURE results at K = 10, and redo some of the bar plots, as well as some scatter plots relating the different ancestral components by population. Don’t try to pick out fine-grained details, see what jumps out in a gestalt fashion. I removed most of the non-European populations to focus on Western Europeans, with a few outgroups for reference.

Here’s a table of the correlations (I bolded the ones I thought were interesting):

               W Asian NW African S European   NE Asian   SW Asian    E Asian N European  W African  E African    S Asian
W Asian              *      -0.01      -0.18       0.04       0.81       0.59      -0.64       0.39        0.2       0.04
NW African           *          *       0.19      -0.16       0.23      -0.09      -0.19       0.26       0.67      -0.11
S European           *          *          *      -0.38      -0.03      -0.27      -0.42      -0.11      -0.02      -0.36
NE Asian             *          *          *          *      -0.06        0.5       0.26      -0.04       -0.1      -0.07
SW Asian             *          *          *          *          *       0.21      -0.62       0.74       0.59      -0.13
E Asian              *          *          *          *          *          *      -0.27       0.08          0       0.14
N European           *          *          *          *          *          *          *      -0.34      -0.28      -0.31
W African            *          *          *          *          *          *          *          *       0.86      -0.04
E African            *          *          *          *          *          *          *          *          *      -0.07


October 25, 2010

Body odor, Asians, and earwax

When I was in college I would sometimes have late night conversations with the guys in my dorm, and the discussion would random-walk in very strange directions. During one of these quasi-salons a friend whose parents were from Korea expressed some surprise and disgust at the idea of wet earwax. It turns out he had not been aware of the fact that the majority of the people in the world have wet, sticky, earwax. I’d stumbled onto that datum in the course of my reading, and had to explain to most of the discussants that East Asians generally have dry earwax, while convincing my Korean American friend that wet earwax was not something that was totally abnormal. Earwax isn’t something we explore in polite conversation, so it makes sense that most people would be ignorant of the fact that there was inter-population variation on this phenotype.

But it doesn’t end there. Over the past five years the genetics of earwax has come back into the spotlight, because of its variation and what it can tell us about the history and evolution of humans since the Out of Africa event. Not only that, it seems the variation in earwax has some other phenotypic correlates. The SNPs in and around ABCC11 are a set where East Asians in particular show signs of being different from other world populations. The variants which are nearly fixed in East Asia around this locus are nearly disjoint in frequency with those in Africa. Here are the frequencies of the alleles of rs17822931 on ABCC11 from ALFRED:

The expression of the dry earwax phenotype is contingent on an AA genotype; it has recessive expression. So in a population where the allele frequency of A is ~0.50, the dry earwax phenotype would have a ~0.25 frequency. In a population where the A allele has a ~0.20 frequency, the dry earwax phenotype would be at ~0.04 frequency. Among people of European descent the dry earwax phenotype is present at proportions of less than ~5%. Because of recessive expression a larger minority of Japanese and Chinese should manifest wet earwax, though interestingly the ALFRED database indicates that Koreans are fixed for the A allele. In Africa conversely the G allele seems to be fixed.
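The arithmetic in the paragraph above is just the Hardy-Weinberg expectation for a recessive trait: if the A allele has frequency q and mating is random, the AA genotype (and so the dry phenotype) has frequency q². A minimal sketch, with function names of my own choosing and random mating as the stated assumption:

```python
import math

def recessive_phenotype_freq(q):
    """Expected frequency of the recessive (AA) phenotype under random mating: q squared."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return q * q

def allele_freq_from_phenotype(phenotype_freq):
    """Invert the relationship: allele frequency implied by a recessive phenotype frequency."""
    if not 0.0 <= phenotype_freq <= 1.0:
        raise ValueError("phenotype frequency must be between 0 and 1")
    return math.sqrt(phenotype_freq)

dry_at_half = recessive_phenotype_freq(0.50)    # the ~0.25 case from the text
dry_at_fifth = recessive_phenotype_freq(0.20)   # the ~0.04 case from the text

# A dry-earwax phenotype under ~5% implies an A allele frequency under ~0.22.
implied_a_freq = allele_freq_from_phenotype(0.05)
```

The inversion shows why a rare recessive phenotype can hide a fairly common allele: most copies of A sit in heterozygotes, where they are not expressed.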

So the question is: why? A new paper in Molecular Biology and Evolution argues that the allele frequency differences are a function of positive directional selection since humans left Africa ~100,000 years ago. The impact of natural selection on an ABCC11 SNP determining earwax type:

A nonsynonymous single nucleotide polymorphism (SNP), rs17822931-G/A (538G>A; Gly180Arg), in the ABCC11 gene determines human earwax type (i.e., wet or dry) and is one of most differentiated nonsynonymous SNPs between East Asian and African populations. A recent genome-wide scan for positive selection revealed that a genomic region spanning ABCC11, LONP2, and SIAH1 genes has been subjected to a selective sweep in East Asians. Considering the potential functional significance as well as the population differentiation of SNPs located in that region, rs17822931 is the most plausible candidate polymorphism to have undergone geographically restricted positive selection. In this study, we estimated the selection intensity or selection coefficient of rs17822931-A in East Asians by analyzing two microsatellite loci flanking rs17822931 in the African (HapMap-YRI) and East Asian (HapMap-JPT and HapMap-CHB) populations. Assuming a recessive selection model, a coalescent-based simulation approach suggested that the selection coefficient of rs17822931-A had been approximately 0.01 in the East Asian population, and a simulation experiment using a pseudo-sampling variable revealed that the mutation of rs17822931-A occurred 2006 generations (95% credible interval, 1023 to 3901 generations) ago. In addition, we show that absolute latitude is significantly associated with the allele frequency of rs17822931-A in Asian, Native American, and European populations, implying that the selective advantage of rs17822931-A is related to an adaptation to a cold climate. Our results provide a striking example of how local adaptation has played a significant role in the diversification of human traits.

The region around ABCC11 has come under scrutiny with the emergence of tests of natural selection predicated on inspecting patterns of linkage disequilibrium (LD). LD is basically measuring the association of genetic variants within the genome shifted away from expectation. A selective sweep tends to generate a lot of LD around the target of natural selection because as the allele in question rises in frequency its neighbors also hitchhike along. The hitchhiking process means that within a population you may see regions of the genome which exhibit long sequences of correlated single-nucleotide polymorphisms (SNPs), that is, haplotypes. An initial selective event will presumably generate a very long homogenized block, which over time will break apart through recombination and mutation, as variation is injected back into the genome. The extent and decay of LD then can help us gauge the time and strength of selection events.

But LD can emerge via other processes besides natural selection. Imagine for example that a population of Africans and Europeans mix in a given generation. Europeans and Africans have different genetic makeups, on average, so the initial generations will have more LD than expectation because recombination will only slowly break apart the physical connection between genomic regions from European and African ancestors. The decay of LD then can give one a sense of the time since admixture as well as selection. Not only that, stochastic demographic events and processes are also important and may drive the emergence of LD. Consider a bottleneck where the frequency of a particular haplotype is driven up by random genetic drift alone. The details of these alternative scenarios are explored in the 2009 paper The role of geography in human adaptation.
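The idea that decaying LD works as a clock can be sketched with the standard textbook recursion for two loci: each generation, recombination at rate r erodes the disequilibrium D by a factor of (1 − r), so D after t generations is D₀(1 − r)^t. The code below is that idealized model only (random mating, no selection, no drift), and the function names are hypothetical.

```python
import math

def ld_after(d0, r, generations):
    """Disequilibrium remaining after some generations: D0 * (1 - r) ** t."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination rate r must be between 0 and 0.5")
    return d0 * (1.0 - r) ** generations

def generations_to_decay(fraction, r):
    """Generations needed for D to fall to a given fraction of its starting value."""
    if not 0.0 < fraction <= 1.0 or not 0.0 < r <= 0.5:
        raise ValueError("need 0 < fraction <= 1 and 0 < r <= 0.5")
    return math.log(fraction) / math.log(1.0 - r)

# Unlinked loci (r = 0.5) lose half their LD every generation.
unlinked_one_gen = ld_after(0.25, 0.5, 1)

# Tightly linked loci (r = 0.01) take roughly 69 generations to lose half.
linked_half_life = generations_to_decay(0.5, 0.01)
```

Read in reverse, this is how an observed amount of LD between markers at a known genetic distance can hint at how many generations have passed since admixture or a sweep.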

All this is preamble to the fact that there’s a lot of LD around ABCC11. Here’s a visualization from the HapMap populations:


From left to right you have Chinese & Japanese, Utah whites, and the Yoruba from Nigeria. An absolute value of D’ ~0 means that there’s linkage equilibrium, the default or null state where there are no atypical excessive correlations of alleles across the genome. The axes here are pairwise combinations of SNPs around ABCC11, with a focus around rs17822931, a nonsynonymous SNP which seems to be the likely functional source of the variance in earwax and other phenotypes. In terms of LD rank order the results are not surprising; across the genome East Asians tend to exhibit more LD than Europeans, and Europeans exhibit more LD than the Yoruba. Part of this is probably a function of population history: a serial bottleneck model Out of Africa would posit that drift and other stochastic forces would have a stronger impact on the genomes of East Asians than Europeans. But this seems like it can’t be the whole picture here; note the variance in allele frequency in the New World as well as in Oceania. Some of the Amerindian populations seem to have a higher frequency of the ancestral G allele on rs17822931. The figure above is easier to understand; the Y-axis is showing you the extent of heterozygosity at a given location. GA is heterozygous, GG is homozygous. Africans again tend to exhibit more heterozygosity than non-Africans, but note the sharply diminished heterozygosity for the East Asian sample around rs17822931 in ABCC11. Remember that heterozygosity tends not to go above 0.50 in a random mating population in a diallelic model (though in selective breeding it may go above 0.50 for F1 generations).
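Two of the quantities in the paragraph above have simple textbook definitions, sketched here with hypothetical function names. Expected heterozygosity at a two-allele locus under random mating is 2p(1 − p), which peaks at 0.50 when both alleles are equally common. D’ rescales the raw disequilibrium D = P(AB) − P(A)P(B) by the largest value it could take given the allele frequencies, so 0 means equilibrium and 1 means the strongest possible association.

```python
def expected_heterozygosity(p):
    """Expected heterozygote frequency at a diallelic locus under random mating: 2p(1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * p * (1.0 - p)

def d_prime(p_ab, p_a, p_b):
    """Absolute D': |D| divided by its maximum possible magnitude given p_a and p_b."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    if d_max == 0.0:
        return 0.0  # a locus is monomorphic, so LD is undefined; report 0
    return abs(d) / d_max

max_het = expected_heterozygosity(0.5)          # the 0.50 ceiling mentioned in the text
near_fixed_het = expected_heterozygosity(0.95)  # a nearly fixed allele leaves little heterozygosity

equilibrium = d_prime(0.25, 0.5, 0.5)   # haplotype AB at its expected frequency
complete_ld = d_prime(0.5, 0.5, 0.5)    # A and B always travel together
```

The near-fixed case is the East Asian pattern at rs17822931: once one allele approaches fixation, heterozygosity has to collapse.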

The major findings of this paper beyond what was known before seem to be a) an explicit model of how East Asians could have arrived at a high frequency of the AA genotype at rs17822931, and, b) the correlation between climate and the frequency of A. I’ll get to the second point in a bit, but what about the first? Using the nature of variation in two microsatellites flanking the SNP of interest in East Asians, and assuming a recessive selection model, the authors posit that the A allele began to rise in frequency ~50,000 years ago, and, that the selection coefficient was ~1% per generation. This is a significant value for the selection parameter, and the timing is possible in light of the separation of non-Africans into a western and eastern group around that period.

But honestly I’m pretty skeptical of this. The confidence intervals don’t inspire confidence, and from what little I know selection for recessive traits should exhibit less linkage disequilibrium. At low frequencies there is very little effect of natural selection on the allele because it is mostly “masked” in heterozygotes, and therefore there will be a long period before its proportion begins to rise more rapidly. During this time recombination will have time to chop up the haplotypes around the SNP, reducing the length of the statistically associated haplotype block. Also, the authors themselves don’t seem to believe that the phenotype of earwax itself was the target of selection, so its recessive expression pattern should be less important from where I stand.
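The “masking” argument can be illustrated with the standard deterministic single-locus recursion. This is not the authors’ microsatellite-based model, just a textbook sketch: it assumes an infinite, randomly mating population with no drift, and compares a favored allele that is fully recessive with one that is fully dominant at the same selection coefficient. The starting frequency of 1% and target of 50% are arbitrary choices for illustration.

```python
def next_freq(p, s, recessive=True):
    """One generation of selection on allele A at frequency p.

    recessive=True: only AA has fitness 1 + s.
    recessive=False: AA and AG both have fitness 1 + s (A is dominant).
    """
    q = 1.0 - p
    w_aa = 1.0 + s
    w_ag = 1.0 if recessive else 1.0 + s
    w_gg = 1.0
    mean_w = p * p * w_aa + 2 * p * q * w_ag + q * q * w_gg
    return (p * p * w_aa + p * q * w_ag) / mean_w


def generations_to_reach(p0, target, s, recessive=True, max_generations=1_000_000):
    """Count generations until the allele frequency first reaches `target`."""
    p, t = p0, 0
    while p < target and t < max_generations:
        p = next_freq(p, s, recessive)
        t += 1
    return t


if __name__ == "__main__":
    s = 0.01  # the ~1% selection coefficient discussed above
    print("recessive:", generations_to_reach(0.01, 0.5, s, recessive=True))
    print("dominant: ", generations_to_reach(0.01, 0.5, s, recessive=False))
```

Under these assumptions the recessive allele spends far longer at low frequency than the dominant one, and that is the window in which recombination can erode the surrounding haplotype.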

The idea that the genes around ABCC11 might have something to do with adaptation to cold is suggestive, but almost every East Asian trait of distinction has been hypothesized to have something to do with cold at some point by physical anthropologists. You’d figure that the Cantonese lived in igloos going by all the myriad adaptations to frigid conditions which they exhibit. The reality is that much of China, Korea and Japan is subtropical today. In any case the last figure shows the correlation across several lineages. Earlier they found, by comparing variation around this region in humans with other primates, that Africans seem to be subject to purifying selection. This means that there’s constraint so that neutral forces don’t change the frequencies of functionally significant regions. It is well known that on average Africans are more diverse than non-Africans, probably because the latter are a sampling of the former, but, on a small minority of genes the reverse is true. This is likely due to the relaxation of functional constraint as humans left the ancestral African environment. And this is clearly true for rs17822931; most non-African populations exhibit some heterozygosity. East Asians here are an exception, not the rule, at having derived allele frequencies nearly fixed. The regression lines in this last figure are all statistically significant. It is of interest that there are particularly strong correlations between latitude and frequency of the derived A allele among Europeans and Native Americans. In contrast the relationship within Asian populations is weaker. Only 17% of the allele frequency variance can be explained by latitude variance among the Asian ALFRED sample.

But we shouldn’t allow the hypothesis to rise and fall just on this evidence. After all there have likely been substantial movements of populations within the last 10,000 years. Perhaps especially in East Asia, where the expansion of the Han south may have triggered the movement of both the Thai and Vietnamese people out of South China and into mainland Southeast Asia. The best evidence of adaptation would be among admixed populations; presumably those at higher latitudes would have higher frequencies of the AA genotype than those at lower latitudes. Instead of categorizing the populations into three coarse classes probably a more sophisticated treatment using ancestral quanta derived from STRUCTURE or ADMIXTURE as independent variables would be informative. Remember, adaptation should show evidence of decoupling ancestry from phenotype.

Finally, I have to point to this section of the discussion:

What is the cause of the selective advantage of rs17822931-A? Although the physiological function of earwax is poorly understood (Matsunaga 1962), dry earwax itself is unlikely to have provided a substantial advantage. The rs17822931-GG and GA genotypes (wet earwax) are also strongly associated with axillary osmidrosis, suggesting that the ABCC11 protein has an excretory function in the axillary apocrine gland (Nakano et al. 2009)…,

I really didn’t know what this meant. So I looked it up. Here’s what I found, A strong association of axillary osmidrosis with the wet earwax type determined by genotyping of the ABCC11 gene:

Apocrine and/or eccrine glands in the human body cause odor, especially from the axillary and pubic apocrine glands. As in other mammals, the odor may have a pheromone-like effect on the opposite sex. Although the odor does not affect health, axillary osmidrosis (AO) is a condition in which an individual feels uncomfortable with their axillary odor, regardless of its strength, and may visit a hospital. Surgery to remove the axillary gland may be performed on demand. AO is likely an oligogenic trait with rs17822931 accounting for most of the phenotypic variation and other unidentified functional variants accounting for the remainder. However, no definite diagnostic criteria or objective measuring methods have been developed to characterize the odor, and whether an individual suffers from AO depends mainly on their assessment and/or on examiner’s judgment. Human body odor may result from the breakdown of precursors into a pungent odorant by skin bacteria….

Perhaps the paper should have been titled “why barbarians smell bad”? In any case, an idea for a book title on Korean genetics: “the least smelly race.”*

Citation: Ohashi J, Naka I, & Tsuchiya N (2010). The impact of natural selection on an ABCC11 SNP determining earwax type. Molecular biology and evolution PMID: 20937735

* I’m referencing The Cleanest Race.

October 21, 2010

Borders we forget: Saudi Arabia & Yemen

There’s a lot of stuff you stumble upon via Google Public Data Explorer which you kind of knew, but is made all the more stark through quantitative display. For example, consider Saudi Arabia and Yemen. In gross national income per capita the difference between these two nations is one order of magnitude (PPP and nominal). Depending on the measure you use (PPP or nominal) the difference between the USA and Mexico is in the range of a factor of 3.5 to 5. Until recently most Americans did not know much about Yemen. It was famous for being the homeland of Osama bin Laden’s father and the Queen of Sheba.

Let’s do some comparisons.

Good luck Saudi Arabia! :-) Couldn’t happen to a nicer nation.

August 16, 2010

Empires of the Word & anti-Babel

To the left you see a map of the distribution of languages and language families in Europe. Language is arguably the most salient cultural feature of our species, as well as one of the most obviously biologically embedded. The trait of language is a human universal, to the point where even those without hearing can create their own gestural languages de novo. But the specific nature of language as it is instantiated from region to region varies greatly. Language in the generality is a straightforward utility with which you communicate with your fellow man. But language also separates you from your fellow man.

European nationalism in the 19th and 20th centuries was in large part rooted in the idea that language defined the boundaries of a nation. During the Reformation era some German-speaking Roman Catholic priests declaimed the value of the bond of language against that of religion, praising those non-Germans who adhered to the Catholic cause against German speaking heretics (in the specific case the priest was defending Spanish tercios brought in by the Holy Roman Emperor to put down the rebellion of Protestant German princes). In the long centuries between the Reformation and the Enlightenment the idea of a Western Christian Commonwealth slowly melted in the face of the rise of vernacular, but even after the shattering of Western Christianity with the explosion of Reformations the accumulated capital of a unified Christian European elite persisted. Hungarian Protestant students at Oxford could make do with Latin even if they were totally innocent of English (see The Reformation). Newer lingua francas, French and later English, lack the deep unifying power of Latin in part because they are also living vernaculars. They may resemble Latin in some particulars of function, but eliding the differences removes far too much from the equation to be of any use. Linguistic diversity is a fact of our universe, but how it plays out matters a great deal, and has mattered a great deal, over the arc of history.

This is the subject of Empires of the Word: A Language History of the World. Nicholas Ostler, the author, tackles an enormous subject here. He acknowledges the Herculean nature of his task in the introduction. And yet he does avoid some of the more intractable controversies within historical linguistics by constraining his subject matter to the period of history. That is, where we have some written records. This means that Ostler does not address the origins of the Indo-European language family, or the more recent expansion of the Bantus. Despite being separated by thousands of years these are both in the domain of pre-history, because we have no written records of proto-Bantu or proto-Indo-European. This does not mean that the book is not ambitious all the same. On the contrary, Empires of the Word takes on the “thicker” and messier tangle which is the association between language and fine-grained historical processes, social, cultural, economic and political. How history has shaped the nature and distribution of languages which we see extant in our world today is a labyrinth with many doors. Ostler doesn’t come close to opening the majority of those doors, but those he selects in Empires of the Word yield a rich number of surprises and insights, though he does not in the end seem to be able to generate a Grand Unified Theory of linguistic diversity and change from the welter of details.

There are two parallel threads throughout Ostler’s narrative: description and prediction. The latter is not prediction as a physicist would predict; rather, it is as a historical scientist might: taking the data and producing models which can plausibly explain the phenomena we describe. Let’s take a look at the top 20 languages in the world. It seems that there are two primary ways that the speakers of a language can become numerous: rice & empire. Such a generalization is a bit glib, as many Mandarin speakers do not live by the “rice bowl,” but the big picture is that some languages gained adherents through “brute force,” pushing inexorably against the Malthusian possibilities of primary production and reproduction and assimilating smaller groups on the wave of advance of the speakers. The Asian languages on this list fall into that category. In contrast, you have the languages which spread with empire, exploration, and colonialism. English and Spanish are the exemplars of this class. Of the hundreds of millions of English and Spanish speakers a majority cannot be accounted for simply by demographic expansion of the home countries. Rather, these languages colonized new lands, and acquired new speakers, rather rapidly over the past 500 years. Turkish is almost certainly in this category, though the transition from Greek, Armenian and Kurdish speech in Anatolia is less clearly understood because of thinner textual records of the process.

Of course the distinction between the two is somewhat artificial. The expansion of Mandarin, let alone the Chinese dialects, was almost certainly a synthesis of demographic expansion & migration, and linguistic assimilation of “barbarians.” Han Chinese are genetically far less homogeneous than the Koreans or Japanese, in large part because the expansion of Han identity occurred over a diverse group of populations which were resident within China proper 2,000 years ago. Similarly, it seems implausible that the Vietnamese ethnically cleansed all the Malay and Khmer speaking populations along the Annamese coast as they pushed toward the Mekong delta. The genetic data in fact hint at a large scale assimilation of Malay Chams by the Vietnamese. Inversely, the rise of English was partially accompanied by the demographic explosion of British peoples, while Spaniards contributed a great deal genetically to the mestizo populations of the New World. So it is not rice or empire, but rice and empire. Albeit with different weights on a case-by-case basis.

“Rice” really refers to social, cultural and economic forces which bubble up from below and swallow up the numerous islands of linguistic diversity. “Empire” connotes the political and military structure which allows for the trickle down from above of imperial values and mores. But the two are also intimately connected. The Chinese state under the Ching Dynasty saw a rapid rise in population, and that rise was enabled in large part due to political stability. That stability fostered long term projects which increased the land under cultivation as well as public works infrastructure which could distribute grain so as to dampen the effect of local shocks. The Greek historian Polybius attributed the resiliency and strength of the Roman state to its assimilative capacity, turning barbarians into citizens. The military and political resiliency of the Roman Empire through the Crisis of the Third Century was probably conditioned on the expansion of Romanitas from the Atlantic to the Black Sea (the military core of the revival drew from the Latin speaking regions south of the Danube in the Balkans).

Just as the Roman Catholic Church is sometimes referred to as “the ghost of the deceased Roman Empire,” so the distributions of modern languages are tells of political, social and economic events of the past. Social and economic forces almost certainly loom large in language family explosions which Ostler did not cover, that of the Bantus, the Polynesians and Indo-Europeans. In the first case it seems that the Bantu peoples brought with them a new mode of production to east and south Africa. This was then a rice expansion, along with some genetic assimilation. The case of the Polynesians is more difficult, but the existence of a similar group in Madagascar attests to the power of long distance seafaring techniques in scattering obscure peoples. Without the existence of Malagasy, both their genetic and linguistic uniqueness, the written record would not clue us in to the existence of an organized community of long distance seafaring Southeast Asians across the Indian ocean basin. Finally, the Indo-European expansion is more mysterious because it is so much further back in time, but it is also the most significant as nearly half the world’s population speaks an Indo-European language. David Anthony in The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World makes the case that a shift toward nomadic pastoralism enabled by the horse is the critical catalyst for the sweep of this language group from the Atlantic to the Bay of Bengal.

Though the Indo-European case is likely an ancient one Empires of the Word actually begins its story earlier. Ostler’s in-depth knowledge of ancient Near Eastern linguistic history is frankly mind-blowing, and is arguably the most insightful and novel spin on the topic I’ve ever encountered. The extent of his detailed and subtle grasp of the facts is awe inspiring. I did not know, for example, that the Elamites of southwest Iran once had their own writing system, which they eventually abandoned for Akkadian cuneiform. Ostler recounts the life-after-death which Sumerian experienced for over 1,000 years because of the nature of cuneiform itself, which was fitted to the Sumerian language, a linguistic isolate with no known relatives. For the last thousand years of cuneiform it was written in Akkadian, the first great Semitic language in the world, later to be succeeded by Aramaic, Punic, Hebrew, and Arabic. Parallel to the waxing and waning of these antique Semitic languages was the ebb and flow of ancient Egyptian, with its own peculiar form of writing.

One aspect of these ancient societies and their languages is the almost cold-blooded torpidity with which change occurred. Sumerian persisted as a liturgical language in what became Babylonia down to the Roman and Parthian period, 3,000 years of written history. The social-political entity which we term ancient Egypt arguably spanned 2,500 years, up until the final Persian conquest. Egyptian culture in a sense that the Pharaohs would recognize persisted for another 1,000 years, until the closure of the Temple of Philae under the orders of the Christian Emperor Justinian in the 6th century. This cut the last link with the literature and religion of ancient Egypt. Consider that the time between our own era and that of Jesus Christ is equivalent to that between the rise of the Egyptian polity and its decline in the late Bronze Age. Though there are certainly similarities between Paul of Tarsus and a modern Western man, a great many disruptions have broken chains of cultural continuity.

There may be one exception to this, and that is another language which arose just as Egypt went into decline, and that is Chinese. Classical Chinese in its written form remained relatively static between the ancient period of the first dynasties, and the early 20th century. This continuity is telling insofar as Western scholars never had to “discover” the history of the Chinese, they had always remembered it. The continuity of language, cultural values, and political and ethnic identity, dovetailed together so that despite the reality that the architecture of China is ephemeral, its stories are not. In contrast, much of the literary corpus of the ancient Western world comes down to us only because of three intense periods of copying: the Carolingian Renaissance, 10th century translations in the Byzantine Empire, and the Abbasid translation project in the 9th century. The history of the societies before Greece was perceived only obliquely through the Bible and the classical authors. Modern archaeology and linguistics eventually unlocked the secrets of both hieroglyphics and cuneiform, but the reality that we did not know of the significance of the Hittites in the ancient world attests to the poverty of knowledge which lack of cultural continuity imposes (the great disruption between the Indus civilization and pre-Maurya India means that the script of the former remains lost to us).

The distribution and continuity of dead languages also is a signpost for that other aspect of human culture which is very powerful and ubiquitous: religion. Today most of the Latin spoken is “Church Latin,” and that is because of the language’s sacred role within the Roman Catholic Church. Though Hebrew is the spoken language of the secular state of Israel thanks to a modern revival, for nearly 2,500 years it was a language of religion only, as the Jews adopted the languages of the people amongst whom they lived, Aramaic, Greek, Persian, Arabic, Latin, German, etc. The ancient languages of the Near East, Coptic from ancient Egyptian, and Syriac from Aramaic, persist as liturgical languages. It seems that so long as the gods do not die in the minds of believers the tongues of the ancients persist down the ages. So next to the language of rice and empire, you have languages of the gods.

As I indicated above Empires of the Word is rather thin on robust generalizations. But one point which the author mentions repeatedly is that the rise and fall of languages of great expanse and utility is the norm, not the exception. In particular, Nicholas Ostler takes time out to emphasize that languages which spread via trade often do not have long term staying power. Portuguese, Aramaic, Punic and Sogdian would fall into this category (the later success of Portuguese was a matter of rice and empire in Brazil). It seems that mercantile communities are too ephemeral, that successive historical shocks inevitably result in their decline when there isn’t a peasant demographic reservoir or imperial power which imposes it by fiat. Even those languages which eventually spread beyond traders and gain cultural and political cachet may fall from grace. Greek is the best case of this. It was the dominant language of the Roman East, and spoken as far as modern Pakistan, and studied in Dark Age Ireland. By the early modern period it was a strange and foreign language in the West, and with the rise of Islam in the east it lost its cultural glamor, and even those Christians in Arab lands who were Melkite, Greek Orthodox who adhered to the theological position of Constantinople, became Arab in speech and identity (in greater Syria the Greek Orthodox have been instrumental in the formulation of Arab nationalism).

And yet to some extent one must be cautious about over-reading the recession of Greek in the face of Arabic after the rise of Islam. Ostler repeats the conventional wisdom that the predominant vernacular in the Roman East was never Greek, but rather Semitic dialects descended from Aramaic. This is manifest in the fact that the Oriental Orthodox churches do not use Greek in their liturgy, but forms of Syriac. Their root is in an alternative intellectual tradition from that of the Greek Church. The transition to Arabic was then predominantly from a closely related Semitic language, not from Greek. One of the theses to explain the spread of Arabic across North Africa, but not into Persia, is that Arabic found it easier to replace other members of the Afro-Asiatic language family. I can accept that people can intuitively perceive differences of language family without a deep knowledge of said languages. In Sons of the Conquerors: The Rise of the Turkic World it is recounted that an ambassador to the court of the Hapsburg Emperor in Vienna communicated to the Sultan that apparently the locals spoke a dialect of Persian! Persian and German are of course both Indo-European languages, and set next to Turkish they may sound vaguely similar.

This thesis is plausible to me, and I have long held to it in regards to Arabic’s replacement of Aramaic. I have been told by a friend who is familiar with both languages (in addition to Hebrew) that they are rather close, and if not mutually intelligible, close enough to make language acquisition much easier. But Ostler extends the argument much further, suggesting that genetic affinity also explains the replacement of Egyptian and Berber dialects in North Africa. These are Afro-Asiatic languages, but they are not Semitic. I assume linguists do perceive similarities of character which can connect these languages, but what features span the Afro-Asiatic languages which would make language acquisition easier even at this remove of relationship? The Afro-Asiatic theory for the spread of Arabic is somewhat convenient in that it does explain the data well: Arabic has spread widely only in regions of other Afro-Asiatic languages, the exception being in Spain. And in Spain the Mozarab dialect had a stabilized existence with the Romance language of the rural areas, which eventually came back in the form of Castilian, Portuguese, etc. What Nicholas Ostler seems to be proposing is that the world of language acquisition is not flat. This is clearly true for closely related languages, but I think the thesis needs to be explored for distantly related languages from the same family. Does a native speaker of Marathi have a leg up on a Hungarian when it comes to learning Gaelic? I remain skeptical of the affirmative in that case.

So Empires of the Word outlines some broad generalizations of how languages grow, which seem borne out by the record of history, and offers some more speculative theories about the importance of the cultural terrain upon which languages can flow and spread. But the narrative also lingers long on the future of the current lingua franca of our age, English. Nicholas Ostler does nothing to dismiss the omnipresence of English at the commanding heights of international culture. He reports for example that in 1994 50% of international telephone calls were between English speakers. 45% were between English speakers and those who were not English speakers! That means only 5% of international calls in 1994 were cases where neither party spoke English as their native language. I suspect that the numbers have changed a bit since then, but if that study is correct then it points to the awesome international spread of the English language. But Nicholas Ostler does not think that it will last, and his rationale seems to be the record of history, where such universal languages always fall. His next book, The Last Lingua Franca: English Until the Return of Babel outlines his thesis in detail.

And yet contra Ostler I have to suggest that perhaps this time it’s different. I do not believe that English in a unified form will dominate all. Already there has been considerable dialect drift. But the past 200 years are qualitatively different from what has come before, and there is already a revolution in communication technology. It may be that in the future languages do not crystallize as a function of geography, but perhaps more as a function of class and occupation. It does seem historically that trade lingua francas have been ephemeral in impact, and English, the language of McWorld, is the language of capital. But the modern world is much more dependent on flows of capital and commerce than the pre-modern world; the Sogdians and Portuguese were primarily vectors for high value luxury goods. Pre-modern capitalism had the air of a parlor game between the high and mighty, and was quite often in bad odor among rentier elites themselves. It is with reason that I observed above that the pace of cultural change in the past was less than what it is today. Positive feedback loops may be much more powerful than they once were, so that a “Globish” derived from English may quickly sweep away all comers, before it diversifies again.

But really I should wait for Ostler’s new book. The arguments I make here may be addressed, or I may have misunderstood what I gleaned from Empires of the Word. It is as I said a story with rich and vibrant detail, much of which I glossed over, or did not address. For that Ostler’s tale is worth the time it takes to complete it. But there is I must say a lack of theoretical punch and heft. Perhaps this is just a function of the subject domain, which has too much complexity to distill down to any model of elegance or tractability. But I suspect a more rigorous analytical framework could squeeze some juice out of the enormous pile of detail which Nicholas Ostler has at his disposal. Perhaps he should read Replicated Typo.

Image Credit: Wikimedia, Ethnologue

November 16, 2009

The Isles in America

It’s easy to find maps of American ancestries, but I wanted to play around with the data, and in particular the visualization, myself. So I went to the Census and got the county level numbers. The first thing I wanted to do was look at non-Hispanic white ethnicities as a proportion of non-Hispanic whites. That would for example increase the Anglo-Saxon character of the lowland South because it would remove African Americans from the equation.

All the data was from the 2000 Census, and I simply divided the % of each European ancestry group by the non-Hispanic white percentage to reweight appropriately. Here are some correlations I found:

English X Scots-Irish = 0.34

English X Irish = 0.30

English X American = -0.20

Scots-Irish X Irish = 0.37

Scots-Irish X American = -0.25

Irish X American = -0.45
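For anyone who wants to reproduce this kind of calculation, here is a rough sketch of the two steps: reweighting each ancestry percentage by the county’s non-Hispanic white percentage, then taking a Pearson correlation across counties. The county rows below are made up for illustration and are not the Census figures; the function names are my own.

```python
import math


def reweight(ancestry_pct, nh_white_pct):
    """Ancestry group as a fraction of the county's non-Hispanic whites."""
    if nh_white_pct <= 0:
        raise ValueError("county has no non-Hispanic white population")
    return ancestry_pct / nh_white_pct


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical counties: (non-Hispanic white %, English %, Irish %).
    counties = [(90.0, 12.0, 11.0), (55.0, 9.0, 5.0),
                (70.0, 6.0, 9.0), (85.0, 15.0, 13.0)]
    english = [reweight(e, w) for w, e, _ in counties]
    irish = [reweight(i, w) for w, _, i in counties]
    print(round(pearson(english, irish), 2))
```

Note that the correlation is unweighted across counties, so a county of a thousand people counts as much as one of a million.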

I left the Scottish and Welsh out of this because their numbers were relatively small. One of the main issues with looking at the “Irish” and “American” categories is that both of these are probably heavily loaded with Scots-Irish. Below the fold are some maps I generated.

Blue = above the median for the frequency of that group nationally (the median being calculated again with non-Hispanic whites only included).

Red = below the median.

The distributions of frequencies by county tend to be positively skewed, so the shading is covering a larger spectrum of frequencies in the blue than the red.
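The shading rule and the effect of skew can be sketched in a few lines. The frequencies below are invented for illustration; treating a county exactly at the median as red is my own assumption, since the rule above only covers above and below.

```python
from statistics import mean, median


def shade(value, med):
    """Blue if above the median, red otherwise (ties go to red by assumption)."""
    return "blue" if value > med else "red"


def spans(values):
    """Width of the range covered by the red and the blue shading."""
    med = median(values)
    return med - min(values), max(values) - med


if __name__ == "__main__":
    # Hypothetical county frequencies (%) with a long right tail.
    freqs = [2, 8, 10, 11, 12, 14, 48]
    med = median(freqs)
    print([shade(f, med) for f in freqs])
    print(spans(freqs))
    # A mean above the median is a quick hint of positive skew.
    print(mean(freqs) > med)
```

With a long right tail the blue half of the counties stretches over a much wider band of frequencies than the red half, even though each color covers the same number of counties.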

Min = 1.6%
25% = 8.5%
Median = 11%
75% = 14%
Max = 48%

Min = 0%
25% = 1%
Median = 2%
75% = 3%
Max = 10%

Min = 2%
25% = 10%
Median = 12%
75% = 14%
Max = 37%

Min = 0%
25% = 7%
Median = 14%
75% = 22%
Max = 70%

“Isles” includes Scottish & Welsh, as well as “American.”

Min = 9%
25% = 39%
Median = 44%
75% = 51%
Max = 85%

Finally, here’s a map where those of “Isles” origin are 50% or more of the non-Hispanic white population.

The shading for the “Isles” doesn’t look right. But here’s the histogram:

The median is 0.45. So that’s probably why the blue is relatively homogeneous, the distribution is negatively skewed.

