December 31, 2011

Razib Khan’s predictions for 2012

People often make “year end predictions.” I haven’t done that because I just haven’t bothered. But, it’s probably a nice way to see how full of crap you are. You can look back at how many mistakes you made, suggesting to you that you’re really a lot more ignorant of the shape of reality than you fancy yourself. So I’m going to put some predictions down right now. The title is self-centered, but I want it to be Googleable. There are two classes of predictions. The first class are those which I think have more than 50 percent chance of coming to fruition. I don’t want to pick “sure things,” because what’s the point of that? The second category is different, in that I think the chance of the outcome may be less than 50 percent, and the conventional wisdom is going to be opposite of the prediction, but I suspect the odds are better than people think. I’ll give myself “bonus points” if those come true.


>50 percent probability in my estimation

- Mitt Romney will win the Iowa caucuses.

- Ötzi will have his genome published.

- A big paper will come out confirming that there has been massive (on the order of 50 percent or more) genetic turnover across Europe over the past 10,000 years.

- There will be more evidence published of “archaic admixture” events in the genomes of modern humans.

- No state will leave the Euro in 2012.

- The “great stagnation” will continue in the USA. GDP growth will not top 2.5 percent in any quarter. Unemployment will not drop below 7 percent by the end of the year.

- Housing will not bottom out in 2012 (Case-Shiller index in December 2012 will remain the same or below December 2011).

- Sprint will continue to lose ground to Verizon and AT&T in relative market share in mobile phones.

- Chrome will continue to gain share, but more at the expense of IE than Firefox. Firefox will remain within 5% of absolute current market share in December 2012 in relation to December 2011.

- There will be at least 150 references to “quantitative genomics” in Google Scholar in 2012 (vs. 70+ in 2011).

- We will see a $3,000 genome (human sequence) for consumers by the end of the year.

- Time/Newsweek will write a long feature “How Facebook is over” in the last 1/3 of 2012 due to stagnation in active customer base.

- Google+ will be transformed from being a “Facebook-killer” to part of Google’s attempt to create a broader online identity (i.e., it will “fail” as a social network).

- Economic pessimism about India will become more prominent in the American media.

- The public offerings of web 2.0 companies will disappoint.

- There will be less talk about “e-books” after a peak over the summer vacations of 2012 because they will be so “normal.”

<50 percent probability, but greater probability than people think

- We will have the $1,000 genome by the end of 2012.

- Barack H. Obama will be reelected president.

- The Democrats will keep their Senate majority (almost perfectly correlated with the previous prediction).

- Greece will leave the Eurozone.

- China’s economic growth will be slower than expected, and will hit 5 percent in one quarter.

- 23andMe will shift away from “retail personal genomics.”

- There will be a major Islamic terrorist event in England or the United States (death toll >10 = “major”).

- A major revision of our understanding of the archaeogenetics of the New World will be published, using ancient DNA.

- The genetic architecture of hair curliness will be elucidated.

- Scientists will discover that 50% or more of the ancestry of most Africans is due to an ancient “back migration” event from Eurasia, on the order of 200 thousand years B.P. (which distinguishes Pygmies and Khoisan, who do not bear as much of this stamp).

- A major paper will be published in a high impact journal outlining the genes for major bio-behavioral differences between human populations.

- Siri will get good enough by December 2012 that people will no longer be able to play jokes on it.

- We will have a public discussion about the near future of widespread prenatal screening as part of national healthcare policy in the USA.

That’s the question a commenter poses, albeit with skepticism. First, the background here. New England was a peculiar society for various demographic reasons. In the early 17th century there was a mass migration of Puritan Protestants from England to the colonies which later became New England because of their religious dissent from the manner in which the Stuart kings were changing the nature of the British Protestant church.* Famously, these colonies were themselves not aiming to allow for the flourishing of religious pluralism, with the exception of Rhode Island. New England maintained established state churches longer than other regions of the nation, down into the early decades of the 19th century.

Between 1630 and 1640 about 20,000 English arrived on the northeastern fringe of British settlement in North America. With the rise of co-religionists to power in the mid-17th century a minority of these emigres engaged in reverse-migration. After the mid-17th century migration by and large ceased. Unlike the Southern colonies these settlements did not have the same opportunities for frontiersmen across a broad and ecologically diverse hinterland, and their cultural mores were decidedly more constrained than those of the cosmopolitan Middle Atlantic. The growth in population in New England from the low tens of thousands to close to 1 million in the late 18th century was one of endogenous natural increase from the founding stock.

This high fertility regime persisted down into the middle of the 19th century, as the core New England region hit its Malthusian limit, and flooded over into upstate New York, to the irritation of the older Dutch population in that region. Eventually even New York was not enough, and New England swept out across much of the Old Northwest. The last became the “Yankee Empire,” founded by Yankees, but later demographically supplemented and superseded in its western reaches by immigrants from northwest Europe who shared many of the same biases toward order and moral probity which were the hallmarks of Yankees in the early Republic.

While the Yankees were waxing in numbers, and arguably cultural influence, the first decades of the American Republic also saw the waning of New England power and influence in relation to the South in the domain of politics. This led even to the aborted movement to secede from the union by the New England states in the first decade of the century. By the time of Andrew Jackson an ascendant Democratic configuration which aligned Southern uplanders and lowlanders with elements of the Middle Atlantic resistant to Yankee cultural pretension and demographic expansion would coalesce and dominate American politics down to the Civil War. It is illustrative that one of the prominent Northern figures in this alliance, President Martin Van Buren, was of Dutch New York background.

But this is a case where demographics was ultimate destiny. Not only were the Yankees fecund, but immigrants such as the German liberals fleeing the failures of the tumult of 1848 (e.g., Carl Schurz) were aligned with their anti-slavery enthusiasms (though they often took umbrage at the anti-alcohol stance of the Puritan moralists of the age, familiarizing the nation with beer in the 1840s). The Southern political ascendancy was simply not tenable in the face of Northern demographic robustness, fueled by both fertility and immigration. Because of overreach on the part of the Southern elite the segments of the Northern coalition which were opposed to the Yankees eventually fractured (Martin Van Buren allowed himself to be the candidate for the anti-slavery Free Soil party at one point). Though there remained Northern Democrats down to the Civil War, often drawn from the “butternuts” whose ultimate origins were in the Border South, that period saw the shift in national politics from Democratic to Republican dominance (at least up to the New Deal). Curiously, the coalition was an inversion of the earlier coalition, with Yankees now being integral constituents in a broader Northern and Midwestern movement, and Southerners being marginalized as the odd-men-out.

 I review all this ethno-history because I think that to a great extent it is part of the “Dark Matter” of American political and social dynamics. Americans are known as “Yankees” to the rest of the world, and yet the reality is that the Yankee was one specific and very distinctive folkway on the American scene. But, that folkway has been very influential, often in a cryptic fashion.

Both Barack H. Obama and George W. Bush are not culturally identified as Yankees in a narrow sense. Obama is a self-identified black American who has adopted Chicago’s South Side as his community. The South Side is home to black culture which descends from those who arrived at the terminus of their own Great Migration from the American South. George W. Bush fancies himself a West Texan and a cowboy. He was governor of Texas, and makes his residence in Dallas, while much of his young adulthood was spent in Midland. But the reality is that both of these men have Yankee antecedents. This is clear in Bush’s case. His father is a quintessential Connecticut Yankee. Bush is the product of Andover Academy, Yale, and Harvard (by and large thanks to family connections). Barack H. Obama is a different case entirely. His racial identity as a black American is salient, but he grew up in one of the far flung outposts of the Yankee Empire, Hawaii. But perhaps more curiously, many of his mother’s ancestors were clearly Yankees. Obama has a great-grandfather named Ralph Waldo Emerson Dunham.

Within and outside of the United States there is often a stereotype that white Americans are an amorphous whole, a uniform herrenvolk who oppressed the black minority. This ideology was actually to some extent at the heart of the dominance of the early Democratic party before the rise of the Republicans fractured the coalition along sectional lines. In many Northern states one saw populist Democrats replacing property qualifications for the vote, which were race-blind, with universal white male suffrage. But white Americans, and Anglo-Americans of British stock at that, were not one. That was clear by the 1850s at the latest. And they exhibit a substantial amount of cultural variation which remains relevant today.

New England in particular stands out over the long historical scale. In many ways of all the colonies of Great Britain it was the most peculiar in its relationship to the metropole. Unlike Australia or Canada it was not an open frontier, rich with natural resources which could absorb the demographic surplus of Britain. Unlike India it was not a possible source of rents from teeming culturally alien subjects. Unlike the South in the mid-19th century there was no complementary trade relationship. In economic terms New England was a potential and incipient rival to Old England. In cultural and social terms it may have aped Old England, but its “low church” Protestant orientation made it a throwback, and out of step with a metropole which was becoming more comfortable with the English Magisterial Reformation (which eventually led to the emergence of Anglo-Catholicism in the 19th century). Like modern day Japan, and England of its day, New England had to generate wealth from its human capital, its own ingenuity. This resulted in an inevitable conflict with the mother country, whose niche it was attempting to occupy (albeit, with exceptions, such as the early 19th century, before the rise of robust indigenous industry, and the reliance on trade). Today the American republic has pushed England aside as the center of the Anglosphere. And despite the romantic allure of the frontier and the surfeit of natural resources, it is ultimately defined by the spirit of Yankee ingenuity (rivaled by the cowboy, whose violent individualist ethos seems straight out of the Scots-Irish folklore of the South, transposed to the West).

What does this have to do with genetics? Let’s go back to the initial colonial period. As I’ve noted before: the Yankee colonies of New England engaged in selective immigration policies. Not only did they draw Puritan dissenters, but they were biased toward nuclear family units of middling background. By “middling,” that probably refers at least toward the upper quarter of English society of the period. They were literate, with at least some value-added skills. This is in contrast with the Irish Catholic migration of the 19th century, which emptied out Ireland of its tenant peasants (attempts to turn these Irish into yeoman farmers in the Midwest failed, with fiascoes such as the consumption of their seed corn and cattle over harsh Minnesota winters).

So the question is this: could “middle class” values be heritable? Yes, to some extent they are. Almost all behavioral tendencies are heritable to some extent. Adoption studies are clear on that. But, is one generation of selection sufficient to result in a long term shift? First, let’s dismiss the possibility of random genetic drift and therefore a bottleneck. The one generation shift in allele frequencies due to drift is inversely proportional to effective population. If you assume that effective population is ~5,000, then the inverse of that is 0.0002. So you’d expect the allele frequency at any given locus to shift by only a tiny fraction. So we have to look to selection.
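To put a rough number on the drift argument, here is a minimal sketch that swaps in the textbook Wright-Fisher result rather than the 1/Ne shorthand used above: for a diploid population the one-generation variance in allele frequency is p(1 − p)/(2Ne). The function name and the example starting frequencies are my own assumptions for illustration; only the Ne of ~5,000 comes from the post.

```python
import math

def drift_sd(p: float, ne: float) -> float:
    """Standard deviation of the one-generation change in allele frequency
    under Wright-Fisher drift in a diploid population: sqrt(p(1-p) / (2*Ne))."""
    return math.sqrt(p * (1.0 - p) / (2.0 * ne))

NE = 5_000  # effective population size assumed in the post

# Illustrative starting frequencies (assumed, not from the post).
for p in (0.05, 0.25, 0.50):
    variance = p * (1.0 - p) / (2.0 * NE)
    print(f"p={p:.2f}  1/Ne={1 / NE:.4f}  variance={variance:.2e}  sd={drift_sd(p, NE):.4f}")
```

Even at p = 0.5, where drift is strongest, the typical one-generation shift is about half a percentage point, so the conclusion that drift alone is negligible here still holds under this formula.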

Let’s do some quick “back of the envelope” calculations. We’ll use IQ as a proxy for a whole host of traits because the numbers will at least be concrete, though the underlying logic of a quantitative continuous trait remains the same. First, the assumptions:

- Truncation selection on the trait which lops off the bottom 75 percent of the class distribution

- A correlation between the trait and genetic variation, so that you lop off the bottom 50 percent of the IQ distribution

- A heritability of IQ of 0.50

The top 50 percent of the IQ distribution has a median/mean IQ of ~110. Assuming 0.50 heritability implies half way regression back to the mean. Therefore, this model predicts that one generation of selection would entail a median IQ of 105 in the second generation, about 1/3 of a standard deviation above the norm in England.
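The arithmetic above is an instance of the standard breeder’s equation, R = h² × S, where S is the gap between the selected parents and the population mean. The sketch below assumes a normal trait with mean 100 and standard deviation 15 (the usual IQ convention, not stated in the post); the function names are hypothetical. Note that ~110 is the median of the top half, while the mean of the top half is closer to 112, so the exact version lands nearer 106 than 105.

```python
from statistics import NormalDist

def selected_mean(mu: float, sd: float, keep_top: float) -> float:
    """Mean of the top `keep_top` fraction of a normal(mu, sd) trait
    (truncation selection)."""
    std = NormalDist()
    z = std.inv_cdf(1.0 - keep_top)      # truncation point in SD units
    intensity = std.pdf(z) / keep_top    # selection intensity i
    return mu + sd * intensity

def next_generation_mean(mu: float, sd: float, keep_top: float, h2: float) -> float:
    """Breeder's equation: offspring mean = mu + h2 * S."""
    s = selected_mean(mu, sd, keep_top) - mu   # selection differential S
    return mu + h2 * s

MU, SD, H2 = 100.0, 15.0, 0.50   # SD of 15 is an assumption; h2 is from the post

parents = selected_mean(MU, SD, keep_top=0.50)
offspring = next_generation_mean(MU, SD, keep_top=0.50, h2=H2)
rounded = MU + H2 * (110.0 - MU)   # the post's round-number version

print(f"selected parents: {parents:.1f}")
print(f"offspring (exact mean): {offspring:.1f}")
print(f"offspring (post's ~110 figure): {rounded:.1f}")
```

Either way the predicted shift is about a third of a standard deviation, and it scales down directly if the class-IQ correlation or the heritability was lower in the 17th century.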

Is this plausible, and could it result in the differences we see across American white ethnic groups? It is possible, but there are reasons to be skeptical. I think my guess of the top 25 percent of the class distribution is defensible from all I’ve read. But the correlation of this with IQ is probably going to be lower in the pre-modern era than today, where you have meritocratic institutions which channel people of different aptitudes. Second, the heritability of IQ was probably lower back then than now, because of wide environmental variance. Please note, I don’t dismiss the genetic explanation out of hand. Rather, this is a case where there are so many uncertainties that I’m not inclined to say much more than that it is possible, and that we may have an answer in the coming decades with widespread genomic sequencing.

But there’s another option, which on the face of it is easier to take in because so many of the parameters are well known and have been thoroughly examined. And that’s cultural selection. While we have to guess at the IQ distributions of the early Puritans, we know about the distribution of their cultural tendencies. They were almost all Calvinists, disproportionately literate. Because of its flexible nature culture can generate enormous inter-group differences in phenotypic variation. The genetic difference between New England and Virginia may have been small, but the cultural difference was wide (e.g., Yankee thrift vs. Cavalier generosity). Yankees who relocated to the South would assimilate Southern values, and the reverse (there is some suggestion that South Carolinian John C. Calhoun’s Unitarianism may have been influenced by his time at Yale, though overall it was obviously acceptable to the Deist inclined Southern elite of the period).

Before New England, human societies had an expectation that there would be a literate segment, and an illiterate one. By and large the substantial majority would be illiterate. In the Bronze Age world the scribal castes had almost a magic power by virtue of their mastery of the abstruse cuneiform and hieroglyph scripts. The rise of the alphabet (outside of East Asia) made literacy more accessible, but it seems likely that the majority of ancient populations, even in literary capitals such as Athens, were functionally illiterate. A small minority was sufficient for the production, dissemination, and propagation of literary works. Many ancient books were written with the ultimate understanding that their wider “reading” was going to occur in public forums where crowds gathered to listen to a reader. The printing press changed this with the possibility for at least nominal ownership of books by those with marginal surplus, the middle class. By limiting migration to these elements with the means to buy books, as well as an emphasis on reading the Bible common to scriptural Protestants, you had a society where the majority could be readers in the public forum.

What were the positive cultural feedback loops generated? And what sort of cultural dampeners may have allowed for the new stable cultural equilibrium to persist down the centuries? These are open questions, but they need to be explored. I’ll leave you with a map of public school expenditures in 2003. In the 1840s and 1850s one of the more notable aspects of the opening of the Western frontier was the huge difference between states settled by Yankees, such as Michigan, and those settled by Southerners, such as Arkansas. Both states were settled contemporaneously, but while Michigan had numerous grammar schools, Arkansas had hardly any….

* British Protestantism has shifted several times from a more “Catholic” to “Radical Protestant” direction. Its peak in officially sanctioned Radical Protestantism was probably during the reign of Edward VI, decades before the Stuart kings (the exception being the republic).

Filed under: India,The New York Times

As I’ve joked before, The New York Times always seems to be pushing free market private sector solutions in South Asia. Many of India’s Poor Turn to Private Schools:

For more than two decades, M. A. Hakeem has arguably done the job of the Indian government. His private Holy Town High School has educated thousands of poor students, squeezing them into cramped classrooms where, when the electricity goes out, the children simply learn in the dark.

Parents in Holy Town’s low-income, predominantly Muslim neighborhood do not mind the bare-bones conditions. They like the modest tuition (as low as $2 per month), the English-language curriculum and the success rate on standardized tests. Indeed, low-cost schools like Holy Town are part of an ad hoc network that now dominates education in this south Indian city, where an estimated two-thirds of all students attend private institutions.

“The responsibility that the government should shoulder,” Mr. Hakeem said with both pride and contempt, “we are shouldering it.”

The issue seems to be that in terms of what it provides the masses India’s public sector is an unmitigated joke verging on disaster. What India needs is greater federalism, as it seems that coordination from the center is just not possible with all the special interests tugging at it.

December 30, 2011

Charitable donations for the long term

Filed under: Blog,Charity,Wikipedia

My friend Holden Karnofsky always pings me at this time of the year. Holden is co-founder of GiveWell. If you’re curious, you can look up more on the outfit yourself, I’ve talked about it enough over the years for you to get why I’m interested and a supporter. Holden is a numbers and data driven guy, and it turns out that 25% of the money given through their website last year was on December 31st. Here are their top charities.

In purely selfish news (yes, I’m a heavy user) Wikipedia is also in need of cash. And yes, I give! (though that doesn’t stop the constant stream of begging headshots)

Vocab by ethnicity, region, and education

Filed under: data,Data Analysis,GSS,I.Q.,Regionalism

A questioner below was curious if vocabulary test differences by ethnicity and region persist across income. There’s a problem with this. First, the INCOME variable isn’t very fine-grained (there is a catchall $30,000 or greater category). Second, it doesn’t seem to control for inflation. But, there is a variable, DEGREE, which asks the highest level of education attained. I used this to create a “college” and “non-college” category (i.e., do you have a bachelor’s degree or not). Because of sample size considerations I removed some of the ethnic groups, but replicated the earlier analysis.

Below are two tables. One shows the mean vocab score for region and ethnicity (for whites) for those without college educations, and another shows those with college educations. I decided to generate a correlation over the two rows, even though it sure isn’t useful as a quantitative statistical measure because of the small number of data points. Rather, I just wanted a summary of the qualitative result. The short answer is that the average vocabulary difference seems to persist across educational levels (the exception here is the “German” ethnicity).

Mean WORDSUM Score by Ethnicity and Region

No college education
German          6.05  5.81  5.79  6.11
Eastern Europe  6.17  6.16  6.18  6.29
Scandinavian    6.35  5.97  6.23  6.35
British         6.6   6.21  6.02  6.57
Irish           6.66  5.83  5.69  6.58
Italian         6     5.85  5.8   6.18

College educated
German          8.03  7.48  7.63  7.33
Eastern Europe  7.7   7.37  7.5   8.09
Scandinavian    8.5   7.82  7.86  7.92
British         8.44  8.06  7.76  7.95
Irish           8.03  7.79  7.39  7.59
Italian         7.45  7.75  7.6   7.87

Correlation of college and non-college
German          0.08
Eastern Europe  0.92
Scandinavian    0.57
British         0.70
Irish           0.57
Italian         0.40
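For anyone who wants to check the last table, here is a minimal sketch of the row-wise calculation, assuming the figure reported is a plain Pearson correlation across the four region columns (the post does not name the statistic). The `pearson` helper and the dictionary names are my own; the numbers are copied from the two tables above, with the four values per row left in table order.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Mean WORDSUM by region (four columns, in table order), from the post.
NO_COLLEGE = {
    "German":         [6.05, 5.81, 5.79, 6.11],
    "Eastern Europe": [6.17, 6.16, 6.18, 6.29],
    "Scandinavian":   [6.35, 5.97, 6.23, 6.35],
    "British":        [6.60, 6.21, 6.02, 6.57],
    "Irish":          [6.66, 5.83, 5.69, 6.58],
    "Italian":        [6.00, 5.85, 5.80, 6.18],
}
COLLEGE = {
    "German":         [8.03, 7.48, 7.63, 7.33],
    "Eastern Europe": [7.70, 7.37, 7.50, 8.09],
    "Scandinavian":   [8.50, 7.82, 7.86, 7.92],
    "British":        [8.44, 8.06, 7.76, 7.95],
    "Irish":          [8.03, 7.79, 7.39, 7.59],
    "Italian":        [7.45, 7.75, 7.60, 7.87],
}

correlations = {g: pearson(NO_COLLEGE[g], COLLEGE[g]) for g in NO_COLLEGE}
for group, r in correlations.items():
    print(f"{group:15s} {r:.2f}")
```

With only four points per row these values are a qualitative summary at best, which is the caveat the post already makes.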

The bush & the bramble of the human family

Filed under: Anthropology,Human Evolution

I wonder if in future years we’re going to look at “species debates” in the context of human evolution like we look at counting angels on the head of a pin. Over at BBC News Clive Finlayson has a rambling opinion piece up, Has ‘one species’ idea been put to bed? Finlayson, the author of The Humans Who Went Extinct: Why Neanderthals Died Out and We Survived, doesn’t seem to have a tightly focused point at the end of it all (I think warranted, considering how unsettled this area is). But he does conclude:

And a major conference is planned for September next year when experts from all over the world will meet in Gibraltar to revise our ideas about “the human niche”. After decades of bad press we are finally getting round to humanizing the enigmatic Neanderthals.

In my post below I argue that it’s most useful to reconceptualize “human” as an ecological niche, rather than a descent group. All the confusion as to whether Neandertals, or any other group of divergent hominins, were, or weren’t, “humans like us,” exists in the context of the idea that “humans like us” are a very specific and sui generis clade with special traits. I think “we” need to get a little off our high horse here.

A few years ago Bruce Lahn got a lot of scorn for positing the idea that different modern human lineages might have been in the process of speciating, at least before the Columbian Exchange and Globalization. Whether the concept is correct or not, I suspect part of the issue is that speciation implies that some human lineages are de-humanized, because there can be only one human lineage. I think this is wrong. I obviously think there’s been a lot of abuse of postmodernism, especially when it comes to natural science, but this is an area where human concerns rather than objective reality have historically been drivers of many debates. We can see that clearly from the present looking back to the past, but decades from now I suspect that we’ll be subject to the same hindsight wisdom.

Eggs: quantity and quality

Filed under: Fertility,Medicine

In my post below on selection for the “better” zygote Michelle observes that “This would be relatively easy for the father, not so much for the mother.” I took her to mean either of two things,

1) Extraction of eggs is a major surgical affair. Extraction of sperm is not.

2) Males generally have many more sperm to contribute than females.

The latter issue made me go look for data on human females, by age. The paper A systematic review of tests predicting ovarian reserve and IVF outcome had what I was looking for. First, let’s review the cumulative distribution of fertility curves for women:

The way I read the figure, 50% of women are sterile at 41. 50% begin their fertility drop at 31. Note that a small, but significant, minority of women are already sterile by age 35. People talk about fertility curves, but less weight is given to the fact that the curve varies in terms of its chronology!

Second, let’s look at the number and quality of ovarian follicles over time (they correspond to number of incipient eggs):

This figure is not easy to read. But you can see that at age 20 there are ~100,000 follicles. That number seems to drop by a little less than half by 30, and is at 20,000 by 40. But by this point 25 percent are of “poor quality.”

How do relatives correlate in traits?

Filed under: Correlation,Height,Quantitative Genetics

The Pith: Even traits where most of the variation you see around you is controlled by genes still exhibit a lot of variation within families. That’s why there are siblings of very different heights or intellectual aptitudes.

In a post below I played fast and loose with the term correlation and caused some confusion. Correlation is obviously a set of precise statistical terms, but it also has a colloquial connotation. Additionally, I regularly talk about heritability. Heritability is in short the proportion of phenotypic variance which can be explained by genetic variance. In other words, if heritability is ~1 almost all the variation in the trait is due to variation in genes, while if heritability is ~0 almost none of it is. Correlation and heritability of traits across generations are obviously related, but they’re not the same.

This post is to clarify a few of these confusions, and sharpen some intuitions. Or perhaps more accurately, banish them.


The plot above shows the relationship between heights of fathers and heights of sons in standard deviation units (yes, I removed some of the values!). You see that the slope is ~0.45, and that’s the correlation. At this point you probably know that heritability of height is on the order of 0.8-0.9. So why is the correlation so low? A simple biological reason is that you don’t know the value of the mothers. If the parents are not strongly correlated (assortative mating) obviously the values of the sons are going to diverge from that of the father. That being said, you probably notice that the correlation here is about 1/2 that of the heritability you know has been confirmed in the literature. That’s no coincidence. One way to estimate heritability is to take the slope of the plot of offspring vs. a single parent, and multiply that by 2. Therefore, the correlation (which equals the slope, since both variables are in standard deviation units) is 1/2 × h², where h² represents heritability.

Correlation (parent to offspring) = 1/2 × h²

1/2 turns out to be the coefficient of relatedness of a parent to offspring. I’ll spare you the algebra, but suffice it to say that this is not a coincidence. Where r = coefficient of relatedness, the correlation between sets of relatives on a trait value is predicted to be:

Correlation (relative to relative) = r × h²

Where r is simply the coefficient of relatedness across the pair of relatives. Here are some values:

r relationship
0.5 (½) parent-offspring
0.25 (¼) grandparent-grandchild
1 identical twins; clones
0.5 (½) full siblings
0.25 (¼) half siblings
0.125 (⅛) first cousins
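As a quick worked example of the r × heritability rule applied to the table above, the sketch below plugs in a heritability of 0.90, the height figure used in this post. It assumes purely additive inheritance with no shared-environment or dominance effects, which is a simplification (full siblings in particular can deviate from it); the names are hypothetical.

```python
from fractions import Fraction

# Coefficients of relatedness from the table above.
RELATEDNESS = {
    "parent-offspring":        Fraction(1, 2),
    "grandparent-grandchild":  Fraction(1, 4),
    "identical twins; clones": Fraction(1, 1),
    "full siblings":           Fraction(1, 2),
    "half siblings":           Fraction(1, 4),
    "first cousins":           Fraction(1, 8),
}

def expected_correlation(r: Fraction, h2: float) -> float:
    """Predicted trait correlation between relatives: r * h2 (additive model)."""
    return float(r) * h2

H2_HEIGHT = 0.90  # heritability of height quoted in the post

expected = {rel: expected_correlation(r, H2_HEIGHT) for rel, r in RELATEDNESS.items()}
for rel, corr in expected.items():
    print(f"{rel:25s} r={str(RELATEDNESS[rel]):4s} expected correlation={corr:.4f}")
```

The parent-offspring line reproduces the ~0.45 slope discussed above.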

Here’s the kicker: the correlation coefficient of the midparent value and the offspring value does not equal the slope of the line of best fit. This is why I had second thoughts about using the term “correlation” so freely, and then switching to heritability. The formula is:

Correlation (midparent to offspring) = 1/√2 × h²

So the correlation of midparent to offspring is 0.71 × heritability.
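For readers who do want the algebra, here is a short derivation sketch of why the midparent slope and the midparent correlation differ. It assumes the standard additive model with random mating and no shared environment; the symbols V_A (additive genetic variance), V_P (phenotypic variance), P_f and P_m (father and mother), and O (offspring) are introduced here and do not appear in the post.

```latex
\begin{align*}
\bar{P} &= \tfrac{1}{2}(P_f + P_m), \qquad h^2 = \frac{V_A}{V_P} \\[4pt]
\operatorname{Cov}(\bar{P}, O)
  &= \tfrac{1}{2}\bigl[\operatorname{Cov}(P_f, O) + \operatorname{Cov}(P_m, O)\bigr]
   = \tfrac{1}{2}\bigl[\tfrac{1}{2}V_A + \tfrac{1}{2}V_A\bigr]
   = \tfrac{1}{2}V_A \\[4pt]
\operatorname{Var}(\bar{P})
  &= \tfrac{1}{4}\bigl[V_P + V_P\bigr] = \tfrac{1}{2}V_P \\[4pt]
\text{slope}
  &= \frac{\operatorname{Cov}(\bar{P}, O)}{\operatorname{Var}(\bar{P})}
   = \frac{\tfrac{1}{2}V_A}{\tfrac{1}{2}V_P} = h^2 \\[4pt]
\text{correlation}
  &= \frac{\operatorname{Cov}(\bar{P}, O)}{\sqrt{\operatorname{Var}(\bar{P})\,\operatorname{Var}(O)}}
   = \frac{\tfrac{1}{2}V_A}{\sqrt{\tfrac{1}{2}V_P \cdot V_P}}
   = \frac{1}{\sqrt{2}}\,h^2 \approx 0.71\,h^2
\end{align*}
```

Averaging the two parents halves the variance of the predictor, which is why the slope rises to the full heritability while the correlation only rises by a factor of √2 over the single-parent case.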

Why is this something you might want to know? I think people are sometimes confused about how an extremely heritable trait, like height, where you’re given heritability values of 0.90, still yields families with such a wide range of heights. Well, recall that the coefficient of relatedness among siblings is 1/2. So their correlation is going to be the same as with parents. Therefore, the magnitude will be half that of the heritability. A correlation of 0.45 is not small, but neither is it extremely tight. The histogram below illustrates this with the above data set. The values are simply the real difference between fathers and sons:
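To make the “not extremely tight” point concrete, here is a minimal sketch of how far apart two relatives are expected to be for a given correlation. It assumes both values are standardized and jointly normal, in which case their difference has standard deviation √(2(1 − ρ)); the function names are my own.

```python
import math

def diff_sd(rho: float) -> float:
    """SD of the difference between two standardized, jointly normal values
    with correlation rho: sqrt(2 * (1 - rho))."""
    return math.sqrt(2.0 * (1.0 - rho))

def mean_abs_diff(rho: float) -> float:
    """Expected absolute difference, in SD units, under the same assumption."""
    return diff_sd(rho) * math.sqrt(2.0 / math.pi)

RHO_RELATIVES = 0.45   # father-son (or sibling) correlation from the post
RHO_STRANGERS = 0.0    # two unrelated people

for label, rho in (("relatives", RHO_RELATIVES), ("strangers", RHO_STRANGERS)):
    print(f"{label:10s} rho={rho:.2f}  sd of difference={diff_sd(rho):.2f}  "
          f"mean |difference|={mean_abs_diff(rho):.2f} SD")

ratio = diff_sd(RHO_RELATIVES) / diff_sd(RHO_STRANGERS)
print(f"relatives differ about {ratio:.0%} as much as strangers")
```

Under these assumptions a father and son differ on average by a bit over 0.8 standard deviations, roughly three quarters of the gap between two random people, which is why very heritable traits still vary a lot within families.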

December 29, 2011

Vocabulary score by race, ethnicity, and region

Filed under: Data Analysis,Demographics,GSS,WORDSUM

Mike the Mad Biologist has a post up, A Modest Proposal: Alabama Whites Are Genetically Inferior to Massachusetts Whites (FOR REALZ!). The post is obviously tongue-in-cheek, but it’s actually an interesting question: what’s the difference between whites in various regions of the United States? I’ve looked at this before, but I thought I’d revisit it for new readers.

First, I use the General Social Survey. Second, I use the WORDSUM variable, a 10 question vocabulary test which has a correlation of 0.70 with general intelligence. My curiosity is about differences across white ethnic groups by region. To do this I use the ETHNIC variable, which asks respondents where their ancestors came from by nation. I omitted some nations because of small sample size, and amalgamated others.

Here are my amalgamations:

German = Austria, Germany, Switzerland

French = French Canada, France

Eastern Europe = Lithuania, Poland, Hungary, Yugoslavia, Russia, Czechoslovakia (many were asked before 1992), Romania

Scandinavian = Denmark, Norway, Sweden, Finland (yes, I know that Finland is not part of Scandinavia, Jaakkeli!)

British = England, Wales, Scotland

Next we need to break it down by region. The REGION variable uses the Census divisions. You can see them to the left. I combined a few of these to create the following classes:

Northeast = New England, Middle Atlantic

Midwest = E North Central, W North Central

South = W S Central, E S Central, South Atlantic

West = Pacific, Mountain

The key method I used is to look for mean vocabulary test scores by ethnicity and region. I also later broke down some of these ethnic groups by religion. Finally, all bar plots have 95 percent confidence intervals. This should give you a sense of the sample sizes for each combination.
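The method described here amounts to a grouped mean with an interval around it. The sketch below shows one way to compute that, assuming a normal-approximation 95 percent interval (mean ± 1.96 standard errors); the post does not say how its intervals were calculated. The rows are invented purely for illustration and are not GSS data, and the function name is hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def mean_ci95(scores):
    """Return (mean, lower, upper) using a normal-approximation 95% interval."""
    m = mean(scores)
    se = stdev(scores) / math.sqrt(len(scores))   # standard error of the mean
    return m, m - 1.96 * se, m + 1.96 * se

# Hypothetical respondents: (ethnic group, region, WORDSUM score 0-10).
# These rows are made up for illustration and are NOT taken from the GSS.
ROWS = [
    ("British", "Northeast", 7), ("British", "Northeast", 6), ("British", "Northeast", 8),
    ("British", "South", 5), ("British", "South", 6), ("British", "South", 7),
]

# Collect scores for each (ethnicity, region) combination.
groups = defaultdict(list)
for ethnic, region, score in ROWS:
    groups[(ethnic, region)].append(score)

summary = {key: mean_ci95(vals) for key, vals in groups.items()}
for (ethnic, region), (m, lo, hi) in sorted(summary.items()):
    print(f"{ethnic:8s} {region:10s} mean={m:.2f}  95% CI=({lo:.2f}, {hi:.2f})")
```

Because the interval width shrinks with the square root of the group size, wide bars in the plots flag combinations with few respondents.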

First let’s break it down by race/ethnicity and compare it by region to get a reference:

Next, the main course:

Finally, let’s separate by religion for Germans and Eastern Europeans:

I include the last plot because these reports of nationality have to be taken with a consideration for the structure they may mask. People in the United States whose ancestors came from Poland fall into two large categories: people of Jewish heritage whose identity as ethnic Poles was contested (recall that Jews often spoke Yiddish as their first language, a Germanic language), and Roman Catholic Slavs. I suspect many of those in the “None” category are also Jews by culture, if not religion.

Second: there is a tendency of people of all ethnic groups to have lower vocabulary scores if they are from the South or Midwest. This tendency is in many cases outside of the 95 percent confidence interval. It’s especially striking in the three groups with huge sample sizes in all regions: Germans, Irish, and British. Irish here includes both Scots-Irish and those of Irish Catholic background. Not only are the sample sizes for these groups large, but the roots of these groups in some of these regions go rather far back. In particular, the division between the people of British ancestry goes back centuries in the North vs. South divide.

How to understand this? There are a lot of complicating factors. But as outlined in Albion’s Seed and The Cousins’ Wars the divisions between the Anglo-Celtic folkways run deep and long. If a time traveler from the 18th century arrived in the United States today and were asked which region was the heart of intellectual ferment they would correctly guess New England. Early Puritan New England was the first universal-literacy society in the world. This was to some extent a matter of conscious planning. The leaders of the New England colonies enforced limitations upon who could emigrate to their dominion. Religious exclusions and persecutions in this region are well known, but there was also a policy of rejecting the settlement of those who were perceived to be possible burdens upon the community. New England then selected for a middle class migration out of East Anglia and the port towns of southwest England. But the fathers of the early colony also rejected the transfer of the privileges of the blood nobility from the motherland, thereby throwing up a barrier to the migration of the aristocracy.

In contrast the lowland South received a more representative selection of the British class strata. The younger sons of the British nobility and self-styled gentlemen arrived to make their mark, as did those who became indentured servants and even slaves. A class society on the model of southwestern England recapitulated itself in this region. As for the uplands, what became Appalachia, an influx of Scots-Irish came to dominate the scene by the middle of the 18th century, disembarking in Philadelphia, and pushing along the spine of the high country down to the Deep South.

Conflicts between these “Anglo” groups framed the terms of debate over the 18th and 19th centuries. They were to some extent at the root of the Age of Sectionalism. Today because of the salience of race, and the prominence of the later wave of migration in the late 19th and early 20th century which remained vibrant in living memory, these early divisions have moved out of sight. But they still remain. The difference between Germans in Texas and the Anglos of Southern extraction remains to this day, but note that Germans exhibit the same regional differences in vocabulary score as Anglos. Why? This may be a case where the original cultural substratum has an outsized impact (the dialect of eastern New England, made famous by the Catholic Irish of Boston, is descended from East Anglian English!).

Of course there might be a genetic difference. Intelligence is a quantitative trait, so it would be trivial to generate two populations which are genetically similar, but very different in trait value, simply through selection. In the 1630s ~20 thousand Puritans settled New England. For various reasons there was very little migration over the next century and a half. By 1780 New England’s population was 700,000, almost all through natural increase (not only was New England the world’s first universal literacy society, but its fertility was the highest in the late 17th century).
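To get a feel for what "almost all through natural increase" implies, here is a rough back-of-the-envelope sketch. It uses only the two figures quoted above; the start year of 1635 is my own assumption (a midpoint for the 1630s migration), and it treats growth as smooth and exponential, which real populations are not.

```python
import math

# Figures quoted in the text above.
founders = 20_000           # ~20 thousand Puritan settlers
population_1780 = 700_000   # New England population by 1780

# Assumption for illustration only: take 1635 as the start of the clock.
start_year = 1635
end_year = 1780

years = end_year - start_year
growth_factor = population_1780 / founders

# Continuous exponential growth: N(t) = N0 * exp(r * t)
annual_rate = math.log(growth_factor) / years
doubling_time = math.log(2) / annual_rate

print(f"growth factor: {growth_factor:.0f}x over {years} years")
print(f"implied annual growth rate: {annual_rate:.2%}")
print(f"implied doubling time: {doubling_time:.1f} years")
```

Under those assumptions the population multiplies about 35-fold, which works out to roughly 2.5 percent growth per year, or a doubling about once a generation.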

Finally, there’s the issue of disease and pathogen load. Endemic hookworm infection does seem likely to have made Southerners, of both races, relatively indolent and lethargic in comparison to Northerners. Who knows what pathogens simply fall below our radar?

Overall I think that a more fine-grained and detailed exploration of these topics is warranted. Our public discussion is too coarse, and data-thin.

How a “designer baby” might just work

Filed under: Bioethics,Quantitative Genetics,Quantitative Genomics

In earlier discussions I’ve been skeptical of the idea of “designer babies” for many traits which we may find of interest in terms of selection. For example, intelligence and height. Why? Because variation on these traits seems highly polygenic and widely distributed across the genome. Unlike cystic fibrosis (Mendelian recessive) or blue eye color (quasi-Mendelian recessive) you can’t just focus on one genomic region and then make a prediction about phenotype with a high degree of certainty. Rather, you need to know thousands and thousands of genetic variants, and we just don’t know them.

But I just realized one way that genomics might make it a little easier even without this specific information.

The method relies on the phenotypic correlation between relatives. Even before genomics, and genetics, biometricians could generate rough & ready predictions about phenotypic values based on parental values. The extent of the predictive power depends upon the heritability of the trait. A trait like height is ~80-90% heritable. That means that ~80-90% of the variation of the trait in the population is due to variation in genes. The expected value of your height is strongly conditional upon the heights of your parents.
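The biometrician's rough & ready prediction can be sketched in a few lines. This is the textbook midparent regression, not anything specific to this post: the expected offspring value regresses toward the population mean by a factor of one minus the heritability. The function name and the example numbers are mine, and the sketch ignores sex differences in height and assortative mating.

```python
def expected_offspring_value(mother, father, population_mean, heritability):
    """Midparent regression: offspring are expected to deviate from the
    population mean by heritability times the midparent deviation."""
    midparent = (mother + father) / 2
    return population_mean + heritability * (midparent - population_mean)


# Hypothetical example values, in centimeters.
prediction = expected_offspring_value(
    mother=176.0, father=180.0, population_mean=170.0, heritability=0.8
)
print(f"expected offspring height: {prediction:.1f} cm")
```

With a midparent value 8 cm above the mean and a heritability of 0.8, the expected offspring is 6.4 cm above the mean. The lower the heritability, the more the prediction collapses back to the population average.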

That’s all common sense. What does this have to do with genomics? Simple. You are 50% identical by descent with each parent. That means half your gene copies come from your mother and half from your father. You can’t change that unless you’re a clone. But, because of the law of segregation and recombination you are not necessarily 25% identical by descent from each grandparent! The expectation is that your coefficient of relatedness is 25%, but there is variation around this. For each chromosome, a given parent contributes either their own paternal or their own maternal homolog. There’s a 50% chance that you’re going to inherit one or the other, independently for each chromosome. You have 22 autosomal chromosome pairs (non-sex chromosomes), so there’s a strong chance that you won’t be equally balanced between the grandfather and grandmother on a given side (e.g., you have more genes identical by descent from your paternal grandfather than paternal grandmother).* Second, recombination is also going to generate new combinations. In the generation we’re concerned about this will work against the dynamic we’re relying on, by swapping segments across homologous chromosomes from the parents’ mother or father.

The ultimate logic here is to select for zygotes or gametes which are biased toward the grandparents with phenotypic values which you are interested in. To give a concrete example, if you have a parent who is moderately tall, whose own father was very tall, while the mother was somewhat short, and you want the tallest possible child, you’ll want to select zygotes with the most gene content identical by descent with the tall grandparent. The point isn’t to pick specific genetic variants, you don’t need to know that. All you know is that the tall grandfather probably had genes which resulted in a predisposition toward being tall. So just make sure that the grandchild has as much of that grandparent “in them.”

I still don’t know if this is going to be cost effective in the near term. But I began to think of it because in the near future I’ll be checking the genotype of a child who has a full pedigree of 1,000,000 SNPs of their parents and grandparents.

* Modeling it as a binomial, about 1 in 6 cases will have the expected 11 chromosomes from a focal grandparent. The standard deviation is more than 2 chromosomes. You need to have about 100 zygotes to expect to get any individuals who are 5 chromosomal units away from the expected value (i.e., the individual is 10-15% instead of 25% one grandparent, or 35-40%). Obviously you need more to be assured of getting zygotes of that value. And I neglected recombination, which would work against this, by swapping genomic regions….
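For anyone who wants to check the binomial model, here is a minimal sketch. It assumes, as the footnote does, that each of the 22 autosomes a parent transmits is an independent coin flip between the two grandparents and that there is no recombination. The function names are my own.

```python
from math import comb, sqrt

N_CHROMOSOMES = 22   # autosomes transmitted by one parent
P = 0.5              # chance a given chromosome traces to the focal grandparent


def pmf(k, n=N_CHROMOSOMES, p=P):
    """Probability that exactly k of n chromosomes come from the focal grandparent."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least_away(d, n=N_CHROMOSOMES, p=P):
    """Probability of being d or more chromosomes from the expectation, either direction."""
    mean = n * p
    return sum(pmf(k, n, p) for k in range(n + 1) if abs(k - mean) >= d)


def prob_at_least_one(p_single, n_zygotes):
    """Chance that at least one of n_zygotes independent zygotes hits an event of probability p_single."""
    return 1 - (1 - p_single) ** n_zygotes


p_exact = pmf(11)
sd = sqrt(N_CHROMOSOMES * P * (1 - P))
p_tail = prob_at_least_away(5)

print(f"P(exactly 11 of 22): {p_exact:.3f}")
print(f"standard deviation: {sd:.2f} chromosomes")
print(f"P(5 or more away, either direction): {p_tail:.3f}")
print(f"P(at least one such zygote among 100): {prob_at_least_one(p_tail, 100):.3f}")
```

Under these assumptions the chance of landing exactly on 11 is about 0.17, the standard deviation is about 2.3 chromosomes, and roughly 5 percent of zygotes fall 5 or more chromosomes from the expectation in one direction or the other. Recombination would shrink that spread.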

Filed under: Cosmic Variance,Top Posts — Sean Carroll @ 11:33 am

‘Tis the season when bloggers, playing out the string between Xmas and New Year’s, fill the void with greatest-hits lists from the year just passed. But a question inevitably arises: how does one decide which posts to include? There are many different criteria, and preferring one to another might lead to very different lists. This is what’s known as the measure problem in blogospheric cosmology.

This year I’ve decided to confront the problem pluralistically. Thus: here we have five different Top Five lists, chosen according to completely different criteria. Let us know if your favorite Cosmic Variance post of the year somehow managed to not be on any of the lists.

First, the most crude and common measure, the posts with the most page views this year.

Next up, an equally quantitative and misleading measure of popularity: the top five posts by number of comments.

Since I know they won’t do it themselves, here are my five favorite posts by the CV co-bloggers:

And here are my top five favorite guest posts, in a very strong year:

Finally, here are my top five favorite posts by me, excluding the ones that made the first two lists. Be thankful I was able to restrain myself to only choosing five.

A successful year overall — I think Sept/Oct/Nov of 2011 were our highest-traffic months of all time. Here’s to seeing you all in 2012!

Noninvasive tests for Down Syndrome

Filed under: Bioethics,Down Syndrome

I’ve mentioned this before, but I thought I’d pass on the latest report on MaterniT21, the prenatal noninvasive Down Syndrome test. Currently it has a $235 copay for women with insurance. As of now only a few percent of the ~5 million pregnancies in the USA are subject to amnio or c.v.s. This procedure may result in the screened proportion going from ~1 percent to ~50 or more percent (though the firm that is providing this can only process ~100,000 tests per year as of now). I stumbled upon this after doing a follow up on my post, Would you have your fetus genetically tested? Interestingly the proportions who would get tested don’t differ that much between demographics.

And the outcomes can sometimes surprise. A piece in the Columbus Dispatch relates the story of a couple who kept their daughter, who tested positive for Down Syndrome. They had originally decided that if the tests came back positive they would terminate. In contrast, the nurses relate that one couple who were strongly anti-abortion at the beginning of the process seems to have terminated. Right now 1 in 700 pregnancies results in Down Syndrome.

December 28, 2011

Basque genetic distinctiveness (again)

Filed under: Admixture,Basques,Personal genomics

With all the talk about Basques I decided to do my own analysis with Admixture. Dienekes gave me a copy of his IBS file, which has all the 1000 Genomes Spanish samples, including Basques. I merged it with the HGDP sample, which has French Basques (just “Basques” in the plots below) and French non-Basques. I pruned most of the populations, but kept the Mozabites, which are a Berber group from Algeria. The number of markers was ~350,000, and I ran it up to K = 8, or 8 component populations. I stopped there because the components were starting to break up in a very choppy manner.

In general I do think that the idea that non-Basque Spaniards have Moorish genetic input seems supported. It isn’t definitive though. And you have to be careful, there are lower parameter values where Sardinians seem to have an affinity with Mozabites to a great extent, even more than Spaniards. But that disappears as you move up the number of K’s. But who is to say which K is the correct K? The consistent Sub-Saharan African component among non-Basque Spaniards (also evident in the Behar et al. data set) probably convinces me that there was a Moorish impact, since these are likely to have come with the Islamic conquest, and not Phoenicians.

All the files from the Admixture run (and csv files with tabular results) are here.

The poverty of multiculturalist discourse

As I’ve noted in this space before many of my “web friends” and readers are confused why I call myself “conservative.” This is actually an issue in “real life” as well, though I’m not going to get into that because I’m a believer in semi-separation of the worlds. I’ll be giving a full account of my political beliefs at the Moving Secularism Forward conference. A quick answer is that I’m very open to voting for Republicans, and have done so in the recent past. And, my lean toward Mitt Romney* in the current cycle is probably obvious to “close readers.” But I’m not a very “political person” in the final accounting when it comes to any given election. I didn’t have a very strong reaction to the “wave” elections of 2006, 2008, and 2010, except that I was hopeful but skeptical that Democrats would actually follow through on their anti-war rhetoric (I’m an isolationist on foreign policy).

Rather, my conservatism, or perhaps more accurately anti-Left-liberal stance, plays out on a broader philosophical and historical canvas. I reject the very terms of much of Left-liberal discourse in the United States. I use the term “discourse” because for some reason the academic term has replaced the more informal “discussion” in non-scholarly forums. And that’s part of the problem. I am thinking of this because of a post by Nandalal Rasiah at Brown Pundits commenting on a piece over at Slate, Responding to Egregious Attack on Female Protester, Egyptian Women Fight Back. Whether conventional or counter-intuitive Slate is a good gauge of “smart” Left-liberal non-academic public thought. Nandalal highlights this section:


While it’s always dangerous to analyze the psychology of a different culture, I think it is safe to say that in this case, a kind of social contract has been irreparably broken. Based on the statements reported in the Times and in other media accounts, the women of all ages and political/religious orientations who took to the streets yesterday felt that the violation against this poor woman was a violation against them all. A repressive, virulently patriarchical society like the one the Egyptian military apparently wishes to foment in its country can only function with the tacit (whether coerced or freely given) consent of the women it oppresses. But when those same men who demand chastity, modesty, and all the rest prove themselves to be hypocrites by violently demeaning women in the streets, the silence is bound to be broken.

There are lots of implicit assumptions lurking in this one paragraph. Before, excuse the word, deconstructing it, I highly recommend D. Jason Slone’s Theological Incorrectness: Why Religious People Believe What They Shouldn’t to get where I’m coming from. It has one of the most concise and well written critiques of the “Post Modern”** obfuscation which has crept into many disciplines purporting to describe, analyze, and comment upon the human condition. Slone’s short academic book is obviously about religion, from a cognitivist perspective, but his prefatory section is a survey of the diseases which ail cultural anthropology today (for a longer take see Dan Sperber’s Explaining Culture: A Naturalistic Approach).

First, the very idea that the Egyptian military is fomenting patriarchy seems descriptively false. I thought perhaps I didn’t understand what foment connoted, so I looked it up. The reality is that Egyptian society was, and is, virulently patriarchal. I’ve talked about this in detail before. 54 percent of Egyptians support the enforcement of gender segregation in the workplace by law (there is no sex difference on this by the way). The Egyptian military may be an authoritarian force in the country which does foment religious conflict and patriarchy, but the key is to observe that this leverages the pre-existent tendencies of the society. Over its history the Egyptian military, and the political and economic elite, have been forces for Westernization, on the whole. This is obvious when you observe that in a democratic election Egyptians are giving 2/3 of their vote to Islamist parties, and 25 percent of the vote to Salafist parties who wish to impose a theocratic regime immediately!

Second, we need to reconsider whether it was, and is, the repeated sexual assaults upon women which are the necessary root of the anger. Sexual harassment of women on the street has long been common in Egypt. 98 percent of foreign women and 83 percent of Egyptian women report it; it seems unlikely that this is a phenomenon of a small minority of men who are violating a social contract (on this specific issue anger at the military combined with the power of media are probably the necessary causes of the outrage at this action). Mona Eltahawy has spoken at length about her assault at the hands of the authorities, but in interviews she also occasionally mentions that prior to the central incident there were instances of sexual harassment which she experienced from fellow protesters! One reason that many women in the Muslim world give for supporting Islamist parties is that these parties promise to enforce protections of women against the predatory behavior of men in societies where female honor is simply a consumption good when that female is not a relative.

So the inferences made from the contemporary events in Egypt in this case are faulty. But they’re interesting because the problem is so common. Why? You can’t make sense of this unless you examine the broader theoretical framework that people are operating within to generate inferences. A nod is given to this when the author states that it is “always dangerous to analyze the psychology of a different culture.” I think this has a positive descriptive dimension, and a normative one. The positive descriptive dimension is that in scholarship one has to be careful to not allow one’s own subjective perspective to cloud objective judgments. Else, one may generate a false model of the world. This means setting aside one’s own values framework for the purpose of further analysis. Such a stance has not been the norm throughout human history. The didactic tone of Tacitus is much more typical than the cooler detachment of Thucydides. The use and abuse of scholarship for social and political ends is well known.

The problem occurs when these common sense guidelines in academics transform themselves into ever expanding relativistic bounds of discourse, incoherently in contrast with the strong normative orientations of the expositors of these same theoretical frameworks. In turning away from the bias of the past, there is now a bias which has inverted itself. There is a tendency to be careful about analyzing or criticizing other cultures, because that is “dangerous.” Why? Well, would you want to be an “Orientalist”? But you are also careful to demarcate other cultures in a way suitable to your preferences for the purposes of rooting out “injustice.” Would the author of the Slate piece be wary of critiquing the Fundamentalist Church of Jesus Christ of Latter-Day Saints? This endogamous sect is certainly apart from the rest of American culture. In fact, with its extreme patriarchy and polygamy it resembles the ideals of some non-Western societies. How about the culture of the American South? There’s no denying this is a distinctive region in folkways. Would one think it is dangerous to analyze or critique the distinctive attitudes toward relations between the races in this region, whose divergence from the North dates back to colonial times?

Some of this is clearly just a matter of race. Though people speak of “culture,” what they often act out is the idea that non-white races have different cultures by nature in an essential sense, and so must be critiqued with a softer touch, or greater sensitivity, than whites with a distinctive culture. Conservative white Southerners and Fundamentalist Mormons are clearly distinctive in culture from the typical Northern Left-liberal, but that does not shield them from a critique derived from a difference in perspective. The implicit idea lurking beneath the surface is that the white race is subject to a particular standard of cultural expectation, and criticism meted out serves to elevate dissenters to that higher standard, which diminishes “oppression” and “injustice” (quotes in this case because I feel that the terms are used by many to further very narrow political projects, to the point where they’re heavily debased and almost without content as ends as opposed to means). In contrast, the situation is different with non-whites, who must be left to find their own direction, or more obliquely critiqued.

To a great extent this is a caricature, but the underlying dynamic is real. For example, a few years back a Harvard Muslim chaplain was caught contextualizing, and defending, laws enforcing the death penalty for apostasy from Islam. Upon further inspection from an intellectual perspective I can see where he was coming from. In scholarly or academic settings I think one can have a real discussion about this issue, even if one disagrees with the presuppositions. I say this as someone who is technically a Muslim apostate (my father is Muslim, by which definition some Muslims would define me as such). Here is the section which I found amusing though:

I would finally note that there is great wisdom (hikma) associated with the established and preserved position (capital punishment) and so, even if it makes some uncomfortable in the face of the hegemonic modern human rights discourse, one should not dismiss it out of hand. The formal consideration of excuses for the accused and the absence of Muslim governmental authority in our case here in the North/West is for dealing with the issue practically.

This individual is a Harvard graduate, so of course he would understand what “hegemonic modern human rights discourse” is alluding to, and the use of the term “discourse” suggests his familiarity with the academic style dominant today, despite his defense of capital punishment of apostates from Islam under Islamic governments. Despite the trotting out of appropriate terminology, obviously the individual in question believes in a hegemonic discourse. He accepts that Islam is the way, the truth, and that under an Islamic regime those who are Muslim who turn from the truth may be put to death by the authorities. If a conservative Protestant chaplain at Harvard was caught privately defending the death penalty for apostasy (which was enforced by Protestants in Scotland as late as 1700) there wouldn’t be a discussion or contextualization; they’d be universally condemned and fired (in large part because killing apostates from religion is no longer part of the wider Christian set of norms, as opposed to the world of Islam where the concept is widely accepted).

The problem with the bleeding over of academic “discourse” into the public forum is that it obfuscates real discussion, and often has had a chilling effect upon attempts at moral or ethical clarity. Unlike the individual above I am skeptical of moral or ethical truth in a deep ontological sense. But I have opinions on the proper order of things on a more human scale of existence. You don’t have to reject the wrongness of a thing if you reject the idea that that thing is wrong in some deep Platonic sense. I can, and in some cases will, make the argument for why some form of the Western liberal democratic order is superior to most other forms of arranging human affairs, despite being a skeptic of what I perceive to be its egalitarian excesses. I can, and in some cases will, make the argument for why legal sexual equality is also the preferred state of human affairs. But to have this discussion I have to be forthright about my norms and presuppositions, and not apologize for them. They are what they are, and the views of those who disagree are what they are.

An academic discourse tends to totally muddy a clear and crisp discussion. The reality is that most Egyptians have barbaric attitudes on a whole host of questions (e.g., ~80 percent of Egyptians favor the death penalty for apostasy from Islam). It was not surprising at all that the majority of the Egyptian electorate supported parties with reactionary cultural political planks; because the classification of these views as “reactionary” only makes sense if you use as your point of reference the Westernized social and economic elite. The majority of Egyptians have never been part of this world, and for them upward mobility has been accompanied by a greater self-consciousness of their Islamic identity.

This reality is not comforting to many, and so there has been an evasion of this. If we believe, for example, in the hegemonic superiority of sexual equality, should we not impose the right arrangement upon those who oppress women? This is a serious question, but the fear of engaging in “dangerous” analysis in the “discourse” allows us to sidestep this question. Rather, by minimizing the concrete realities of cultural difference and the depths of their origin, Egyptians are easily transformed into Czechs in 1989 with browner skins and a Muslim affiliation. This is a totally false equivalence. As Eastern Europeans go the Czech population is atypical in its secularism and historical commitment to liberal democracy (one could argue the weakness of the Catholic church goes as far back as the Hussite rebellion and the later suppression of Protestantism by the Habsburgs). While other post-World War I polities switched toward authoritarianism in the inter-war period, the Czechs retained a liberal democratic orientation until the Nazi German invasion. After the collapse of Communism they reverted back to this state. Notably, extreme nationalist parties with anti-democratic tendencies have come to the fore in most post-Communist states, but not so in the Czech Republic.

The irony here is that an academic position which espouses the deep incommensurability of different societies and cultures in terms of their values, rendering inter-cultural analysis or critique suspect, has resulted, in the domain of practical discussion, in a tendency to recast inter-cultural differences of deep import into deviations or artificialities imposed from the outside. In this particular case that artificiality is the Egyptian military, but in most cases it is Western colonialism, which has an almost demonic power to reshape and disfigure postcolonial societies, which lack all internal agency or direction. This is simply not the true state of affairs. The paradoxical fact is that there is commensurability across very different cultures. You can understand, analyze, and critique other societies, if imperfectly. For example, I can understand, and even agree with, some of the criticisms of Western society by Salafist radicals for its materialism and excessive focus on proximate hedonism. The Salafists are not aliens, but rather one comprehensible expression of human cultural types. But that does not deny that I find their vision of human flourishing abhorrent. I understand it, therefore I reject it.

As I state above my views on foreign policy tend toward isolation. Despite the fact that I find the actions of many governments and values of many societies barbaric, and believe that the way of life expressed by Western liberal democratic societies furthers human flourishing more optimally, I do not believe it is practical or productive to force other societies to align their values with ours in most cases.*** In other words, I accept that the world is currently going to operate with a multicultural order. This does not mean that I accept multiculturalism, where all cultures have “equal value.” That idea is incoherent when it is not trivial. Such a framing is useful and coherent in a scholarly context, where Epoché is essential. A historian of Nazi Germany constantly consumed by their disgust and aversion to the regime which is the subject of their study would be a sub-optimal historian. Such disgust and aversion is right and proper, but for scholarship there must be a sense that one must move that to the side for the purposes of analysis and description.

But most people are not scholars. They are not engaging in discourse, but having a discussion. Scholarly theories of modes of inquiry are often totally inappropriate for proximate political policy discussions. Normative biases and methodological commitments undergo peculiar transformations, and inevitably one has to confront the fact that much of what is meant or intended becomes opaque, embedded in abstruse phraseology and intelligible only to initiates in the esoteric knowledge. The hybrid of the Post Modern inflected scholar and public intellectual is ultimately a gnostic sophist of the highest order, transmuting plain if unpalatable truths about the world into a murky cultic potion.

Addendum: Many people claim that the Roman or Ottoman Empires, to name a few, were multicultural. They were in a plain reading of the term, but not in a way that people who espouse multiculturalism would recognize. In both these polities there was a hegemonic social and political order, and difference was tolerated only on its terms. For example, the Romans destroyed the Druids in Gaul and Britain. Why? One reason given, which we would probably view favorably, was that the Druids were practicing human sacrifice, which the Romans found objectionable. But another more material reason is that the Druids were natural loci for political and cultural resistance against the Roman hegemony. Similarly, the Ottomans had an elaborate system of millets which organized the different religious groups of the polity, but there was never any doubt that all were subordinate to Ottoman Muslims. Those social-religious groups which were classed as outside the pale for various reasons, such as the Druze, were persecuted and not tolerated. Those which were tolerated, such as the Orthodox Christians, needed to be respectful of their subordinate position in the system. These tendencies can be generalized to all multiculturalist polities, which inevitably had a herrenkultur.

* No, I don’t think Ron Paul has a chance even if he wins Iowa. Though I do think he’s affected the whole political landscape, and that’s probably what he was looking for in any case.

** The quotations because the term is more one of aspersion than a real pointer to a specific and discrete movement at this point.

*** I make a distinction between barbarism, which is a different way of being, and savagery, which is an unacceptable way of being. The modern world has accepted that slavery is savage, and not tolerable in any polity. In contrast, the fact that women in Saudi Arabia are effectively rendered property of their male relatives is barbaric, but not objectionable enough that it must be eliminated through force.

December 27, 2011

The decline of Digg, the rise of reddit

Filed under: Digg,Reddit,Slashdot,Technology

This is probably old news to you, and I’ve read about Digg’s problems in the tech media, but I just realized how much reddit has eclipsed Digg in referral traffic. I’ve always gotten way more attention from reddit (some science bloggers have told me that reddit readers are a “smarter set”), but when I did get Digg bumps they were often of greater magnitude. No more. Not only are referrals from Digg much more rare than they used to be, but they aren’t as significant as those from reddit.

So of course I checked out Google Trends:

Even StumbleUpon has now surpassed Digg in search queries. Was Digg the MySpace to reddit’s Facebook? And of course Slashdot keeps going…. (Slashdot is so well established that I doubt many people are “searching” for it, so these trends probably underestimate its reach and influence).

The Genetical Theory of Natural Selection

I flog R. A. Fisher’s The Genetical Theory of Natural Selection a fair amount on this site. You don’t need to understand everything in the book, nor do you have to agree with everything in it, but it is a great point of departure toward understanding evolutionary genetics. I’ve noted that you can get it free in PDF format. But if you want to browse it online in an easier format, here you go:

(original link courtesy of Unz.org)


