Razib Khan One-stop-shopping for all of my content

December 13, 2012

Population projections 50 years into the future are fantasy

Filed under: Demographics — Razib Khan @ 8:17 am

There’s another Census Projection out. Yes, I understand that the character of the children born today is going to have obvious impacts on the nature of the population 50 years from now, but we really need to heed the stupidity of past projections. Here’s a piece from 1930, A Nation of Elders in the Making:

To explain convincingly why we believe that we shall certainly not have more than 185,000,000 people here in 2000 A.D. and why we further believe that our population may cease to grow before that time, it is only necessary to make a rapid survey of our national trend of births and deaths….

For what it’s worth, the population of the USA in 2000 A.D. was ~280 million. The Baby Boom + massive immigration = revisions to projections.

November 11, 2012

Religion determines politics for Asian Americans

Filed under: Asian Americans,Data Analysis,Demographics,Election — Razib Khan @ 2:38 pm

I was at ASHG this week, so I’ve followed reactions to the election passively. But one thing I’ve seen is repeated commentary on the fact that Asian Americans have swung toward the Democrats over the past generation. The thing that pisses me off is that there is a very obvious low-hanging fruit sort of explanation out there, and I’m frankly sick and tired of reading people ramble on without any awareness of this reality. We spent the past few months talking about the power of polls, and quant data vs. qual (bullshit) analysis, with some of my readers going into full on let’s-see-if-Razib-is-moron-enough-to-swallow-this-crap mode.

In short, it’s religion. Barry Kosmin has documented that between 1990 and 2010 Asian Americans have become far less Christian, on average. Meanwhile, the Republican party has become far more Christian in terms of its identity. Do you really require more than two sentences to infer from this what the outcome will be in terms of how Asian Americans will vote?

Below I took the data from Pew’s Religious Identification Survey in terms of how all Americans lean politically based on religion, and compared it to how Asian Americans lean based on religion.


All ...

August 29, 2012

The future of the three “Pakistans”

Filed under: Data Analysis,Demographics,India,Pakistan,Population — Razib Khan @ 9:55 pm

Over at Econlog Bryan Caplan bets that India’s fertility will be sub-replacement within 20 years. My first inclination was to think that this was a totally easy call for Caplan to make. After all, much of southern India, and the northwest, is already sub-replacement. And then I realized that heterogeneity is a major issue. This is a big problem I see with political and social analysis. Large nations are social aggregations that are not always comparable to smaller nations (e.g., “Sweden has such incredible social metrics compared to the United States”; the appropriate analogy is the European Union as a whole).

So, for example, India obviously went ahead with its demographic transition earlier than Pakistan. But what this masks is that the two largest states in terms of population in India, in the far north, actually resemble Pakistan in demographics, not the rest of India. Uttar Pradesh, with a population 20 million larger than Pakistan, has a fertility rate similar to that of India’s western neighbor. Bihar currently has a slightly higher fertility rate than Pakistan when you look at online sources (though the proportion under 25 is a little lower, indicating that its fertility 10-15 years ago was lower than Pakistan’s, ...

August 15, 2012

Who shall inherit the earth?

Filed under: Data Analysis,Demographics,Fertility — Razib Khan @ 9:44 pm

There was a question below in regards to the high fertility of some extreme (“ultra”) religious groups, in particular Haredi Jews. The commenter correctly points out that these Jews utilize the Western welfare system to support large families. This is not limited to just Haredi Jews. The reason Somalis and Arabs have fertility ~3.5 in Helsinki, as opposed to ~1.5 as is the norm, is in part due to the combination of pro-natalist subcultural norms and a generous benefits state. Of course we mustn’t overemphasize economics. Israel’s decline in Arab Muslim fertility but rise in Jewish fertility in the 2000s has been hypothesized to be due to different responses to reductions in child subsidies by Muslims and the Haredi Jews. In short, the former reacted much more strongly to economic disincentives than the latter.

A bigger question is whether exponential growth driven by ideology can continue indefinitely. I doubt it. Demographics is inevitable, but subject to a lot of qualifications. Haredi political power in Israel grants some benefits, but at the end of the day basic economics will serve as a check on the growth of the population of this sector. Similarly, barring ...

July 17, 2012

Women wanted more children in 2000s, but had fewer

Filed under: data,Demographics — Razib Khan @ 10:04 pm

As someone with mild concerns about dysgenic (albeit, with a normative lens that high intelligence and good looks are positive heritable traits) trends, I’m quite heartened that Marissa Mayer is pregnant. Of course she’s batting well below the average of some of her sisters, but you take what you can get in the game of social statistics. Quality over quantity thanks to assortative mating.

This brings me to a follow up of my post from yesterday, People wanted more children in 2000s, but had fewer. A reader was curious about limiting the data set to females. Therefore, I did. The same general pattern seems to apply (the limitations/constraints were the same). The only thing I’ll note is that there were only ~40 women in the data set with graduate degrees in the 1970s who were also asked these particular questions, so take this with a grain of salt.

Realized 1970s 1980s 1990s 2000s
< HS 2.73 3.19 3.02 2.79
HS 2.67 2.91 2.59 2.22
Junior College 3 2.75 2.38 2.06
Bachelor 2.31 2.47 2.11 1.71
Graduate 2.11 2.07 1.89 1.56
< $20 K 2.52 2.89 2.57 2.23
$20-40 K 2.57 2.9 2.46 2.02
$40-80 K 2.91 2.95 2.49 1.99
> $80 K 3.08 2.86 2.35 1.95

Ideal 1970s 1980s 1990s 2000s
< HS 3.08 2.96 2.73 2.85
HS 3.04 2.89 2.61 2.97
Junior College 2.58 2.8 2.95 3.31
Bachelor 3.01 2.95 2.86 3.15
Graduate 2.73 2.52 3.63 3.02
< $20 K 3 2.84 2.79 3.04
$20-40 K 3.04 3.01 2.69 2.96
$40-80 K 3.06 2.83 2.89 3.06
> $80 K 3.13 2.87 2.84 3.06
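If you want to see the gap between ideal and realized family size directly, here is a minimal Python sketch using only the 2000s column for the education rows above. The function and variable names are mine, invented for illustration; the numbers are copied from the tables.

```python
# Mean number of children for women in the 2000s, copied from the tables above.
realized_2000s = {"< HS": 2.79, "HS": 2.22, "Junior College": 2.06,
                  "Bachelor": 1.71, "Graduate": 1.56}
ideal_2000s = {"< HS": 2.85, "HS": 2.97, "Junior College": 3.31,
               "Bachelor": 3.15, "Graduate": 3.02}


def ideal_minus_realized(ideal, realized):
    """Return ideal minus realized children for each education level."""
    return {level: round(ideal[level] - realized[level], 2) for level in ideal}


gap_2000s = ideal_minus_realized(ideal_2000s, realized_2000s)
for level, gap in gap_2000s.items():
    print(f"{level:15s} {gap:+.2f}")
```

On these figures the shortfall is smallest for women without a high school diploma and largest for women with bachelor's or graduate degrees.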


Addendum: ...

Identity by descent & the Völkerwanderung

Filed under: Demographics,Europe,Europe history,History,Völkerwanderung — Razib Khan @ 8:07 pm

Early this year I received an email from Dr. Peter Ralph, inquiring if I might discuss some interesting statistical genetic results from analyses of the POPRES data set which might have historical relevance. I’ve been excitedly waiting for the preprint to be made public so it could trigger some wider discussion. I believe that the methods outlined in the paper perhaps show us a path into the near future, where we might gain a much sharper perspective upon the recent past. So it’s finally out, and you can read it in full. Ralph and Dr. Graham Coop have posted it at arXiv, The geography of recent genetic ancestry across Europe. The paper uses ~500,000 SNPs from the POPRES data set individuals, and looks at patterns of identity by descent as a function of geography. By identity by descent, we’re talking about segments of the genome which are derived from a common ancestor. Because of recombination the length of the segments can give us a sense of the date of the last common ancestor; long segments indicate more recent ancestry because fewer recombination events have ...

July 16, 2012

People wanted more children in 2000s, but had fewer

Filed under: Data Analysis,Demographics — Razib Khan @ 6:19 pm

The readers of this weblog are relatively non-fecund, at least going by reader surveys. But I was curious nonetheless about the attitudes toward number of children, and realized goals of number of children, in the General Social Survey. I decided to look at two variables:



The former asks the respondent how many children they had, the latter how many they’d like to have. I restricted the sample to whites ages 45-65 for every survey year. I then combined all the years of a particular decade, so you have 1970s, 1980s, 1990s, and 2000s. For demographics I looked at highest educational attainment, and household income indexed to 1986 real value dollars (so they are comparable across decades).
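For readers who would rather script this than click through the web interface, the restriction and pooling described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the records are made up, and the field names (`year`, `age`, `race`, `degree`, `children`) are hypothetical stand-ins, not the actual GSS variable names.

```python
from collections import defaultdict

# Made-up respondent records; field names are hypothetical, not GSS variables.
respondents = [
    {"year": 1974, "age": 50, "race": "white", "degree": "HS", "children": 3},
    {"year": 1978, "age": 62, "race": "white", "degree": "HS", "children": 2},
    {"year": 1976, "age": 30, "race": "white", "degree": "HS", "children": 1},
    {"year": 2004, "age": 47, "race": "white", "degree": "Graduate", "children": 1},
    {"year": 2008, "age": 55, "race": "white", "degree": "Graduate", "children": 2},
]


def decade_label(year):
    """Map a survey year to its decade label, e.g. 1974 -> '1970s'."""
    return f"{year // 10 * 10}s"


def mean_by_decade_and_degree(rows, field, min_age=45, max_age=65):
    """Average `field` over whites aged min_age..max_age, pooled by decade and degree."""
    sums = defaultdict(lambda: [0.0, 0])
    for r in rows:
        if r["race"] != "white" or not (min_age <= r["age"] <= max_age):
            continue  # outside the restricted sample
        key = (decade_label(r["year"]), r["degree"])
        sums[key][0] += r[field]
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


means = mean_by_decade_and_degree(respondents, "children")
print(means)
```

The 30-year-old is dropped by the age restriction, so the 1970s high school cell averages only the two older respondents.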

Two major takeaways:

1) Education matters more than income in terms of number of children. Having lots of education tends to reduce family size. No great surprise.

2) Ideal number of children increased in the 2000s, but the decline in average number of children continued.

There is often talk in the literature on the disjunction between ideal family size in Third World nations and the realized family size, with a larger number of children than women may want. What is less discussed is the inverse. It seems that ...

May 20, 2012

Education encourages integration?

Filed under: data,Demographics,race — Razib Khan @ 10:36 am

It is sometimes fashionable to assert that higher socioeconomic status whites are the sort who will impose integration on lower socioeconomic status whites, all the while sequestering themselves away. I assumed this was a rough reflection of reality. But after looking at the General Social Survey I am not sure that this chestnut of cynical wisdom has a basis in fact. Below are the proportions of non-Hispanic whites who have had a black friend or acquaintance over for dinner recently by educational attainment:

35% – Less than high school
36% – High school
47% – Junior College
45% – Bachelor
59% – Graduate

I thought this might have been a fluke, so I played around with the GSS’s multiple regression feature, using a logistic model. To my surprise socioeconomic status was positively associated with having a black person over for dinner, and age negatively associated. These two variables in fact tended to exhibit equal magnitude values in opposition, and always remained statistically significant. Just to be clear, I created a variable Non-South vs. South below (being Southern increases likelihood of having had a black person over for dinner). All the individuals surveyed are non-Hispanic whites for the year 2000 and ...

April 13, 2012

Verbal intelligence by demographic

Filed under: Data Analysis,Demographics,GSS,Intelligence,WORDSUM — Razib Khan @ 7:43 pm

A few years ago I put up a post, WORDSUM & IQ & the correlation, as a “reference” post. Basically if anyone objected to using WORDSUM, a variable in the General Social Survey, then I would point to that post and observe that the correlation between WORDSUM and general intelligence is 0.71. That makes sense, since WORDSUM is a vocabulary test, and verbal fluency is well correlated with intelligence.

But I realized over the years I’ve posted many posts using the GSS and WORDSUM, but never explicitly laid out the distribution of WORDSUM scores, which range from 0 (0 out of 10) to 10 (10 out of 10). I’ve used categories like “stupid, interval 0-4,” but often only mentioned the percentiles in the comments after prompting from a reader. This post is to fix that problem forever, and will serve as a reference for the future.

First, please keep in mind that I limited the sample to the year 2000 and later. The N is ~7,000, but far lower for some of the variables crossed. Therefore, I invite you to replicate my results. After the charts I will list all the variables, so if you care you should be able to ...

March 26, 2012

How income, class, religion, etc. relate to political party

Filed under: data,Data Analysis,Demographics,GSS,Politics — Razib Khan @ 9:11 pm

Update: There was a major coding error. I’ve rerun the analysis. No qualitative change.

As is often the case a 10 minute post using the General Social Survey is getting a lot of attention. Apparently circa 1997 web interfaces are so intimidating to people that extracting a little data goes a long way. Instead of talking and commenting I thought as an exercise I would go further, and also be precise about my methodology so that people could replicate it (hint: this is a chance for readers to follow up and figure something out on their own, instead of tossing out an opinion I don’t care about).


Just like below I limited the sample to non-Hispanic whites after the year 2000. Here’s how I did it: YEAR(2000-*), RACE(1), HISPANIC(1)

Next I want to compare income, with 1986 values as a base, with party identification. To increase sample sizes I combined all Democrats and Republicans into one class; the social science points to the reality that the vast majority of independents who “lean” in one direction are actually usually reliable voters for that party. So I feel no guilt about this. I suppose Americans simply like the conceit of being independent? I know I do. ...

March 25, 2012

The upper class is more Republican

Filed under: data,Data Analysis,Demographics — Razib Khan @ 2:31 pm

A few months ago I listened to Frank Newport of Gallup tell Kai Ryssdal of Marketplace that upper class Americans tend to be Democrats. Ryssdal was skeptical, but Newport reiterated himself, and explained that’s just how the numbers shook out. This is important because Newport shows up every now and then to offer up numbers from Gallup to get a pulse of the American nation.

Frankly, Newport was just full of crap. I understand that Thomas Frank wrote an impressionistic book which is highly influential, What’s the Matter with Kansas, while more recently Charles Murray has come out with the argument in Coming Apart that the elites tend toward social liberalism. I’m of the opinion that Frank is just wrong on the face of it, but that’s OK because he’s an impressionistic journalist, and I don’t expect much from that set beyond what I might expect from a sports columnist for ESPN. Murray presents a somewhat different case, as outlined by Andrew Gelman, in that his “upper class” is modulated in a particular manner so as to fall within the purview of his framework. Neither of these qualifications apply to Frank Newport, who is purportedly presenting straightforward unadorned data.

When the “average person on the street” thinks upper class they think first and foremost money. This is not all they think about, but in the rank order of criteria this is certainly first on the list. We can argue till the cows come home as to whether a wealthy small business owner in Iowa who is a college drop out is more or less elite than a college professor in New York City who is bringing home a modest upper middle class income (very modest adjusting for cost of living). But to a first approximation when we look at aggregates we had better look at the bottom line of money. After that we can talk details. And the first approximation is incredibly easy to ascertain. Below is a table and chart which illustrate the proportion of non-Hispanic whites after 2000 who align with a particular party as a function of family income, with family income being indexed to a 1986 value (so presumably $80,000 here means what $80,000 would buy in 1986, not the aughts).


Family Income Strong Dem Dem Lean Dem Ind Lean Rep Rep Strong Rep
Less than $20,000 12 15 12 24 9 15 12
$20-$40,000 12 15 10 18 11 19 15
$40-$80,000 11 14 10 13 11 24 18
More than $80,000 12 12 10 11 11 23 21

The results are straightforward: the more income a family has, the more likely they are to be Republican. There is a lot of nuance and geographical detail to be fleshed out in these results. But these facts are where we need to start.
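To make the trend easier to see, here is a small Python sketch that collapses the seven party-identification columns into three, folding leaners in with their party. It assumes my reading of the header is right (Strong Dem, Dem, Lean Dem, Ind, Lean Rep, Rep, Strong Rep); the names `table` and `collapse` are mine, and the percentages are copied from the table above.

```python
# Percentages copied from the table above, in this assumed column order:
# Strong Dem, Dem, Lean Dem, Ind, Lean Rep, Rep, Strong Rep
table = {
    "Less than $20,000": [12, 15, 12, 24, 9, 15, 12],
    "$20-$40,000": [12, 15, 10, 18, 11, 19, 15],
    "$40-$80,000": [11, 14, 10, 13, 11, 24, 18],
    "More than $80,000": [12, 12, 10, 11, 11, 23, 21],
}


def collapse(row):
    """Fold strong, weak, and leaning identifiers into one class per party."""
    return {"Dem": sum(row[:3]), "Ind": row[3], "Rep": sum(row[4:])}


collapsed = {income: collapse(row) for income, row in table.items()}
for income, shares in collapsed.items():
    print(f"{income:20s} Dem {shares['Dem']}  Ind {shares['Ind']}  Rep {shares['Rep']}")
```

Collapsed this way, the Republican share rises at every step up the income ladder, while the Democratic share drifts down only slightly.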

Andrew Gelman has much more as usual. For example, this chart:



Why do I keep posting this stuff? Because facts matter. That’s my hope, my faith. Tell people facts, and they will open their eyes. Tell your friends, tell your family. Have whatever opinion you want to have, but start with the facts we know. Look up facts, calculate facts, analyze facts. They are there for us, we just need to go look. Google is your friend, Wikipedia is your friend. The General Social Survey is your friend.

The revival of the American city?

Filed under: Demographics,Urbanism — Razib Khan @ 1:37 pm

I’ve never watched Mad Men, but I really can’t help but hear all about the show. One thing that has struck me about the change from then, ~1960, to now, ~2010, is the alignment of quantitative demographic trends with impressionistic cultural ones. The 1970s were a disaster for the old urban order. Below are the top 10 cities by population in 1960 and 2010.

Rank 1960 2010
1 New York New York
2 Chicago Los Angeles
3 Los Angeles Chicago
4 Philadelphia Houston
5 Detroit Philadelphia
6 Baltimore Phoenix
7 Houston San Antonio
8 Cleveland San Diego
9 Washington Dallas
10 St. Louis San Jose

The rise of the “Sun Belt”, housing bubble notwithstanding, is a real and awesome phenomenon. Below the fold I’ve taken some demographic trend data for the top 10 cities of 1960. The first two panels show raw population data. The second two panels show the decade-to-decade change in population in terms of multiples (i.e., 1.2 for 2010 means that the population in 2010 was 1.2 times that in 2000).
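The multiples are easy to compute yourself. A minimal Python sketch, with the caveat that the populations below are made-up round numbers for illustration, not Census figures, and `decade_multiples` is just a name I chose:

```python
def decade_multiples(pop_by_year):
    """Return each census year's population as a multiple of the previous one."""
    years = sorted(pop_by_year)
    return {later: pop_by_year[later] / pop_by_year[earlier]
            for earlier, later in zip(years, years[1:])}


# Hypothetical city: shrinks in the 1990s, rebounds in the 2000s.
example_city = {1990: 125_000, 2000: 100_000, 2010: 120_000}
multiples = decade_multiples(example_city)
print(multiples)
```

A value below 1 means the city lost population over the decade; a value above 1 means it grew.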


For me the biggest surprise is how much the trajectory of Chicago resembles stereotypical “Rust Belt” cities. Unlike New York City, Chicago lost population in the aughts. In some ways New York City is sui generis. It went through a precipitous near collapse in the 1970s, just like the smaller cities of the Heartland, but over the past few decades it has refashioned itself, exhibiting a demographic vigor to match Los Angeles on the West Coast. A second surprise is Philadelphia’s robustness. Unlike the Midwestern cities it seems to have developed some “stabilizers.”

More starkly, observe the rate of change in the 1970s. We often reflect upon the cultural shifts in the 1960s, the first half of which are arguably part of the long 1950s. But the chaos of the late 1960s bore fruit over the 1970s, and echoed down into the 1980s. Though the worst of the decline was over by the 1980s, a pall of decline still hung over much of the decade (e.g., “Japan Inc.”) due to the experiences of the 1970s, Ronald Reagan’s robust counter-narrative notwithstanding. And the great urban revival of the 1990s clearly leveled off in the last decade.

Source data.

January 7, 2012

How many minorities are there in the USA?

Filed under: Data Analysis,Demographics,Minorities — Razib Khan @ 12:56 pm

Prompted by Andrea Mitchell’s complaint that Iowa is not representative of America in racial terms, the Audacious Epigone probed an American state’s typicality in terms of racial demographics, using the overall American population as a measure. One of the major issues with judging the typicality of a given state is that there is a great deal of residential segregation in even “diverse” regions. This comes up in our personal choices too. In 2008 ~10 percent of non-Hispanic whites married someone who was not a non-Hispanic white. Obviously more than ~10 percent of the population, particularly in the prime marrying demographic, are not non-Hispanic whites, so you’re seeing a fair amount of homogamy. In some ways the homogamy is even more striking for minorities. ~31 percent of Asian Americans in this period married a non-Asian American. But one has to keep in mind that, using the American population as representative, over 90 percent of the potential marriage partners are not Asian American!

The quest for a state that “looks like America” is understandable, but the reality of lived life is more complex. And not just in racial terms (e.g., the division in politics between the white suburbs of Maryland vs. Virginia on either side of D.C.). But keeping race in mind, one consistent finding in social science is that Americans actually tend to overestimate the number of minorities. Iowa is actually more typical than we think, despite the fact that it is not typical. In the year 2000 the General Social Survey asked respondents to estimate the number of various groups in the USA. The finding of a tendency to overestimate minorities, and underestimate non-Hispanic whites, was confirmed. But, I decided to break this down by demographic. The results are below in a table.

The first row gives the real values from the 2000 Census. All the following rows are average estimates of a set of respondents in the year 2000.

Results White Black Hispanic Asian Jews Δ whites Δ minorities
Real Value 2000 Census 69.1 12.3 12.5 3.6 3.0
Total Sample 59.0 31.3 24.6 17.7 17.7 -10.1 15.1
Whites 59.2 29.8 22.8 16.0 16.5 -9.9 13.4
Black 57.4 38.8 27.4 21.7 23.3 -11.7 19.8
Hispanic 58.2 35.3 38.2 27.5 21.8 -10.9 24.2
Liberal 58.6 30.4 24.7 17.7 17.3 -10.5 14.8
Moderate 58.3 32.9 25.1 18.6 18.3 -10.8 16.1
Conservative 60.2 29.3 23.5 16.5 16.8 -8.9 13.6
Under 35 56.8 32.5 25.0 18.3 16.6 -12.3 15.8
35 to 64 59.7 30.5 24.3 17.1 17.8 -9.4 14.5
Over 65 61.9 31.2 24.3 18.3 20.1 -7.2 15.1
Northeast 58.5 31.3 25.0 18.9 20.4 -10.6 15.6
Midwest 58.2 31.4 23.5 17.5 17.4 -10.9 14.7
South 58.4 33.0 23.3 15.6 16.4 -10.7 14.5
West 61.3 28.7 27.1 20.1 17.6 -7.8 15.8
Men 59.2 27.3 19.5 13.3 13.9 -9.9 10.6
Women 58.8 34.7 28.8 21.5 21.0 -10.3 18.9
No College 58.4 33.4 25.9 19.1 19.1 -10.7 16.7
College 61.5 24.3 20.2 13.2 13.4 -7.6 9.8
Protestant 58.0 32.6 23.1 17.5 17.9 -11.1 14.9
Catholic 60.4 30.8 27.4 19.4 19.6 -8.7 16.4
Jewish 62.5 25.2 23.4 16.1 10.1 -6.6 12.1
No Religion 59.0 29.4 24.4 16.1 16.1 -10.1 13.8
Favor ban interracial marriage 59.3 37.7 27.7 21.0 20.7 -9.8 19.3
Against ban interracial marriage 58.6 30.3 24.4 17.8 17.5 -10.5 14.7

As you can see, when you add up the elements on the row margins you get more than 100 percent. Why? Because I’m averaging the responses of individuals, and they aren’t talking to each other and figuring that you can’t get more than 100 percent as a collective whole. Across the demographics there is an average underestimate in absolute values of non-Hispanic whites by 10 percentage points, and an overestimate of minorities (excluding Jews here) of about 15 percentage points. The differences from the real value though were consistent with the relationships of the real values. The correlations were around 0.98, which means that rarely did you come out with a scenario where a demographic estimated 5 percent for blacks, and 25 percent for Hispanics. Rather, there was a consistent overestimate of minorities, and underestimate of whites.
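If you want to check the last two columns, here is a short Python sketch for two of the rows. It assumes that Δ minorities is the mean overestimate across blacks, Hispanics, and Asians (Jews excluded), which is my reading of the table and does reproduce the printed values for these rows; the helper names are invented for illustration.

```python
from math import sqrt

# Column order: White, Black, Hispanic, Asian, Jews (values copied from the table).
REAL_2000 = [69.1, 12.3, 12.5, 3.6, 3.0]
ESTIMATES = {
    "Total Sample": [59.0, 31.3, 24.6, 17.7, 17.7],
    "College": [61.5, 24.3, 20.2, 13.2, 13.4],
}


def deltas(estimate, real=REAL_2000):
    """Return (delta whites, mean delta for Black/Hispanic/Asian), rounded to 0.1."""
    diffs = [e - r for e, r in zip(estimate, real)]
    return round(diffs[0], 1), round(sum(diffs[1:4]) / 3, 1)


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


for label, est in ESTIMATES.items():
    d_white, d_minor = deltas(est)
    r = pearson(REAL_2000, est)
    print(f"{label:13s} d_whites={d_white:+.1f} d_minorities={d_minor:+.1f} r={r:.2f}")
```

The correlation here is taken across the five groups within a single row, so a high value says the rank order and rough spacing of the estimates track reality even though the levels are off.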

Taking into account both over and under estimations it looks like those with at least college educations do the best (the number of Jews in the sample was rather small, so take it with a grain of salt). But even here there is a skew in numbers. Why does this exist? My own initial hunch is that the national media is very unrepresentative of America. Set in New York or Los Angeles they reflect the demographics of those regions. Or do they? I haven’t watched much TV in a while, but I do recall the amusing reality that Seinfeld and Friends were set in New York, but minorities were very much token characters in the majority-minority city where the protagonists were resident. But, to be fair I think that may be a real reflection of lives lived, where different races and ethnic groups simply socialize with their “own” (in Manhattan the Upper East Side is ~85 percent non-Hispanic white, on an island that is ~50 percent non-Hispanic white, in a city that is ~33 percent non-Hispanic white).

December 29, 2011

Vocabulary score by race, ethnicity, and region

Filed under: Data Analysis,Demographics,GSS,WORDSUM — Razib Khan @ 10:22 pm

Mike the Mad Biologist has a post up, A Modest Proposal: Alabama Whites Are Genetically Inferior to Massachusetts Whites (FOR REALZ!). The post is obviously tongue-in-cheek, but it’s actually an interesting question: what’s the difference between whites in various regions of the United States? I’ve looked at this before, but I thought I’d revisit it for new readers.

First, I use the General Social Survey. Second, I use the WORDSUM variable, a 10 question vocabulary test which has a correlation of 0.70 with general intelligence. My curiosity is about differences across white ethnic groups by region. To do this I use the ETHNIC variable, which asks respondents where their ancestors came from by nation. I omitted some nations because of small sample size, and amalgamated others.

Here are my amalgamations:

German = Austria, Germany, Switzerland

French = French Canada, France

Eastern Europe = Lithuania, Poland, Hungary, Yugoslavia, Russia, Czechoslovakia (many were asked before 1992), Romania

Scandinavian = Denmark, Norway, Sweden, Finland (yes, I know that Finland is not part of Scandinavia, Jaakkeli!)

British = England, Wales, Scotland

Next we need to break it down by region. The REGION variable uses the Census divisions. You can see them to the left. I combined a few of these to create the following classes:

Northeast = New England, Middle Atlantic

Midwest = E North Central, W North Central

South = W S Central, E S Central, South Atlantic

West = Pacific, Mountain
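For anyone replicating this outside the GSS web interface, the amalgamations above amount to two lookup tables. Here is a minimal Python sketch; the labels are copied from the lists above as plain strings, which is an assumption on my part, since the actual GSS files use numeric codes for ETHNIC and REGION.

```python
# Amalgamated ethnic groups, as listed above.
ETHNIC_GROUPS = {
    "German": ["Austria", "Germany", "Switzerland"],
    "French": ["French Canada", "France"],
    "Eastern Europe": ["Lithuania", "Poland", "Hungary", "Yugoslavia",
                       "Russia", "Czechoslovakia", "Romania"],
    "Scandinavian": ["Denmark", "Norway", "Sweden", "Finland"],
    "British": ["England", "Wales", "Scotland"],
}

# Census divisions combined into four regions, as listed above.
REGIONS = {
    "Northeast": ["New England", "Middle Atlantic"],
    "Midwest": ["E North Central", "W North Central"],
    "South": ["W S Central", "E S Central", "South Atlantic"],
    "West": ["Pacific", "Mountain"],
}


def invert(groups):
    """Turn {label: [members]} into {member: label} for quick recoding."""
    return {member: label for label, members in groups.items() for member in members}


country_to_group = invert(ETHNIC_GROUPS)
division_to_region = invert(REGIONS)
print(country_to_group["Finland"], division_to_region["Mountain"])
```

Anything not in the lookup (the nations omitted for small sample size, or groups kept as-is) would need its own handling.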

The key method I used is to look for mean vocabulary test scores by ethnicity and region. I also later broke down some of these ethnic groups by religion. Finally, all bar plots have 95 percent confidence intervals. This should give you a sense of the sample sizes for each combination.

First let’s break it down by race/ethnicity and compare it by region to get a reference:

Next, the main course:

Finally, let’s separate by religion for Germans and Eastern Europeans:

I include the last plot because these reports of nationality have to be taken with a consideration for the structure they may mask. People in the United States whose ancestors came from Poland fall into two large categories: people of Jewish heritage whose identity as ethnic Poles was contested (recall that Jews often spoke Yiddish as their first language, a Germanic language), and Roman Catholic Slavs. I suspect many of those in the “None” category are also Jews by culture, if not religion.

Second: there is a tendency of people of all ethnic groups to have lower vocabulary scores if they are from the South or Midwest. This tendency is in many cases outside of the 95 percent confidence interval. It’s especially striking in the three groups with huge sample sizes in all regions: Germans, Irish, and British. Irish here includes both Scots-Irish and those of Irish Catholic background. Not only are the sample sizes for these groups large, but the roots of these groups in some of these regions go rather far back. In particular, the division between the people of British ancestry goes back centuries in the North vs. South divide.

How to understand this? There are a lot of complicating factors. But as outlined in Albion’s Seed and The Cousins’ Wars the divisions between the Anglo-Celtic folkways run deep and long. If a time traveler from the 18th century arrived in the United States today and were asked which region was the heart of intellectual ferment they would correctly guess New England. Early Puritan New England was the first universal-literacy society in the world. This was to some extent a matter of conscious planning. The leaders of the New England colonies enforced limitations upon who could emigrate to their dominion. Religious exclusions and persecutions in this region are well known, but there was also a policy of rejecting the settlement of those who were perceived to be possible burdens upon the community. New England then selected for a middle class migration out of East Anglia and the port towns of southwest England. But the fathers of the early colony also rejected the transfer of the privileges of the blood nobility from the motherland, thereby throwing up a barrier to the migration of the aristocracy.

In contrast the lowland South received a more representative selection of the British class strata. The younger sons of the British nobility and self-styled gentlemen arrived to make their mark, as did those who became indentured servants and even slaves. A class society on the model of southwestern England recapitulated itself in this region. As for the uplands, what became Appalachia, an influx of Scots-Irish came to dominate the scene by the middle of the 18th century, disembarking in Philadelphia, and pushing down the spine of the high country to the Deep South.

Conflicts between these “Anglo” groups framed the terms of debate over the 18th and 19th centuries. They were to some extent at the root of the Age of Sectionalism. Today because of the salience of race, and the prominence of the later wave of migration in the late 19th and early 20th century which remained vibrant in living memory for most, these early divisions have moved out of sight. But they still remain. The difference between Germans in Texas and the Anglos of Southern extraction remains to this day, but note that Germans exhibit the same regional differences in vocabulary score as Anglos. Why? This may be a case where the original cultural substratum has an outsized impact (the dialect of eastern New England, made famous by the Catholic Irish of Boston, is descended from East Anglian English!).

Of course there might be a genetic difference. Intelligence is a quantitative trait, so it would be trivial to generate two populations which are genetically similar, but very different in trait value, simply through selection. In the 1630s ~20 thousand Puritans settled New England. For various reasons there was very little migration over the next century and a half. By 1780 New England’s population was 700,000, almost all through natural increase (not only was New England the world’s first universal-literacy society, but its fertility was the highest in the late 17th century).
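To get a feel for what that natural increase implies, here is a back-of-the-envelope Python sketch. It assumes a span of roughly 150 years (the “century and a half” above), a starting population of 20,000, smooth compounding, and no migration at all; all of those are simplifications of mine, so treat the result as a rough order of magnitude.

```python
import math


def annual_growth_rate(start_pop, end_pop, years):
    """Constant compound annual growth rate taking start_pop to end_pop."""
    return (end_pop / start_pop) ** (1 / years) - 1


# ~20,000 settlers growing to 700,000 over roughly 150 years (assumed span).
rate = annual_growth_rate(20_000, 700_000, 150)
doubling_time = math.log(2) / math.log(1 + rate)

print(f"annual growth: {rate:.1%}, doubling time: {doubling_time:.0f} years")
```

Under these assumptions the population grows a bit over 2 percent a year, doubling roughly every three decades, which is about one doubling per generation.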

Finally, there’s the issue of disease and pathogen load. Endemic hookworm infection does seem likely to have made Southerners, of both races, relatively indolent and lethargic in comparison to Northerners. Who knows what pathogens simply fall below our radar?

Overall I think that a more fine-grained and detailed exploration of these topics is warranted. Our public discussion is too coarse, and data-thin.

December 27, 2011

Would you have your fetus genetically tested?

Filed under: Demographics,Genetic Testing,Genetics,GSS,Personal genomics — Razib Khan @ 12:22 pm

There’s a variable in the GSS, GENESELF, which asks:

Today, tests are being developed that make it possible to detect serious genetic defects before a baby is born. But so far, it is impossible either to treat or to correct most of them. If (you/your partner) were pregnant, would you want (her) to have a test to find out if the baby has any serious genetic defects?

This is relevant today especially. First, the technology is getting better and better. Second, couples are waiting longer to start families. Unfortunately this question was only asked in 1990, 1996, and 2004. But on the positive side the sample sizes were large.

I decided to combine 1990 and 1996 into one class. Also, I combined those who were very liberal with liberals, and did the same for conservatives. For political party I lumped strong and weak identifiers together. For intelligence I used WORDSUM. 0-4 were “dull,” 5-7 “average,” and 8-10 “smart.” For some variables there weren’t results for the 1990s.
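For anyone who wants to reproduce the binning, a minimal sketch of the recoding might look like the following. The function names are hypothetical, and I am assuming WORDSUM arrives as an integer from 0 to 10 and the survey year as a plain integer; this is not the GSS interface itself, just the logic described above.

```python
def wordsum_class(score):
    """Map a WORDSUM vocabulary score (0-10) to the three classes used here."""
    if not 0 <= score <= 10:
        raise ValueError(f"WORDSUM score out of range: {score}")
    if score <= 4:
        return "dull"
    if score <= 7:
        return "average"
    return "smart"

def period_class(year):
    """Pool the 1990 and 1996 surveys into one class; keep 2004 separate."""
    if year in (1990, 1996):
        return "1990s"
    if year == 2004:
        return "2004"
    raise ValueError(f"GENESELF was not asked in {year}")

# Hypothetical respondents as (survey year, WORDSUM score) pairs.
respondents = [(1990, 3), (1996, 6), (2004, 9)]
for year, score in respondents:
    print(period_class(year), wordsum_class(score))
```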

The biggest surprise for me is that there wasn’t much difference between the 1990s and 2004. The second biggest surprise was that the differences between demographics were somewhat smaller than I’d expected, and often nonexistent. Below is a barplot and table with the results.


Yes to fetal genetic tests by demographic (% answering yes; * = no results for the 1990s)

Demographic                     1990s   2004
Male                            69      67
Female                          68      65
White                           67      65
Black                           79      72
Hispanic                        *       71
Less than HS                    72      67
High School                     68      65
Junior College                  64      69
Bachelor                        71      65
Graduate                        69      71
Protestant                      69      65
Catholic                        62      63
Jewish                          95      78
No Religion                     78      69
Dull                            74      71
Average                         65      65
Smart                           71      66
Liberal                         80      77
Slight Liberal                  70      64
Moderate                        70      69
Slight Conservative             68      66
Conservative                    59      52
Democrat                        73      74
Independent                     69      67
Republican                      65      58
Yes to abortion on demand       80      75
No to abortion on demand        61      56
Bible Word of God               63      61
Bible Inspired Word             68      65
Bible Book of Fables            82      75
Evolution definitely true       *       81
Evolution probably true         *       70
Evolution probably not true     *       67
Evolution definitely not true   *       58
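
A quick way to eyeball the change between the two periods is to subtract the columns. The sketch below does that for a handful of rows copied from the table; it is only arithmetic on the published percentages, and since I am not carrying the sample sizes along, a shift of a few points should not be over-read.

```python
# (1990s %, 2004 %) pairs copied from the table above; "*" rows are left out.
yes_pct = {
    "Male": (69, 67),
    "Female": (68, 65),
    "White": (67, 65),
    "Black": (79, 72),
    "Liberal": (80, 77),
    "Conservative": (59, 52),
    "Democrat": (73, 74),
    "Republican": (65, 58),
}

# Change in percentage points from the pooled 1990s sample to 2004.
change = {group: later - earlier for group, (earlier, later) in yes_pct.items()}

def gap(group_a, group_b, period):
    """Difference between two groups in points; period 0 = 1990s, 1 = 2004."""
    return yes_pct[group_a][period] - yes_pct[group_b][period]

for group, delta in sorted(change.items(), key=lambda item: item[1]):
    print(f"{group:<14}{delta:+d}")

print("Democrat minus Republican:",
      gap("Democrat", "Republican", 0), "->", gap("Democrat", "Republican", 1))
```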





December 4, 2011

People respond to incentives

Filed under: Demographics — Razib Khan @ 9:28 am

Fascinating story about the re-identification of people of Eurasian ancestry as white to get into elite universities. Some Asians’ college strategy: Don’t check ‘Asian’:

Lanya Olmstead was born in Florida to a mother who immigrated from Taiwan and an American father of Norwegian ancestry. Ethnically, she considers herself half Taiwanese and half Norwegian. But when applying to Harvard, Olmstead checked only one box for her race: white.

Asian students have higher average SAT scores than any other group, including whites. A study by Princeton sociologist Thomas Espenshade examined applicants to top colleges from 1997, when the maximum SAT score was 1600 (today it’s 2400). Espenshade found that Asian-Americans needed a 1550 SAT to have an equal chance of getting into an elite college as white students with a 1410 or black students with an 1100.

In the article Steve Hsu observes that the Ivy League universities have a suspiciously similar proportion of Asians, about 2/3 of the fraction at a “race blind” admissions college like Caltech. Here’s Alex Tabarrok with the numbers: “At Yale the class of 2013 is 15.5 percent Asian-American, at Dartmouth 16.1 percent, at Harvard 19.1 percent, and at Princeton 17.6 percent.” I assume that the “Asian Quota” will start to change as the current generation of Asian American students becomes established as alumni donors.

I’m not a big fan of the “Asian Quota” personally. But, I do think one can make a case for it based on the fact that children from families with an Asian background have a strong bias toward optimizing measured outcomes. But, this entails making a profile, or “stereotype,” of a population. I’m not someone who actually objects to this on principle, but I find the hypocrisy on this issue rather annoying, because the same administrators who would decry stereotypes feel they have to employ them implicitly for practical (so the alumni don’t see their university overwhelmed by “yellow hordes,” and so reduce giving) and idealistic reasons (to maintain some ethnic balance).

COMMENTS NOTE: Any comment which misrepresents the material in this post will result in banning without warning. So you should probably stick to direct quotes in lieu of reformulations of what you perceive to be my intent in your own words. For example, if you start a sentence with “so what you’re trying to say….”, you’re probably going to get banned. I said what I tried or wanted to say in the post. Period.

November 1, 2011

A game of numbers, a matter of values

Filed under: Demographics,Population Growth — Razib Khan @ 12:04 am

The New York Times has an article out about environmentalists who are now looking at population control again, after shying away from it. This is probably prompted by the hullabaloo over “7 billion.” This comes in the wake of a long piece, The Last Taboo, in the Lefty periodical Mother Jones.

The rationale for why environmentalists have moved away from population control is alluded to only elliptically in The New York Times piece. They make a big deal about abortion, but I don’t think this is the most terrifying issue in principle. Environmentalists tend to be on the pro-choice side of the Culture Wars anyway. To cut to the chase it is the fear of being called racist (and to be fair, racial nationalists from Madison Grant to John Tanton have synthesized ethnic concerns with genuine conservationist impulses). Only environmentalists with rock-solid credentials or a lean toward anti-humanistic Deep Ecology philosophies remained vocal about their opposition to mass immigration over the past few decades. David Brower, the first executive director of the Sierra Club, was one such individual. And it’s no surprise that the founder of the radical Earth First movement is also an immigration restrictionist.

The logic behind environmentalist skepticism of immigration is pretty clear. Citizens of the developed world have a huge environmental impact in comparison to citizens of the developing world. Without immigration since 1965 the population of the United States would have already stabilized a generation ago, while today it will likely approach 400 million in the mid-21st century.

I’m much more optimistic about the medium term future than most environmentalists. Though I’m not a Panglossian, I think that science and technology will probably be able to manage to keep civilization creaking along. And when it comes to population control sometimes I wonder if the ultimate reason why we care about population control isn’t being muddled. Those who espouse a full-throated Deep Ecology ethos which is basically anti-humanistic in orientation are at least honest. People who are militant about not having children, and attempt to convert others to the cause, sometimes strike me as curious. Who exactly are they saving the world for? The people who they claim should not reproduce?

But I’m enough of a biologist to understand that Malthusian conditions aren’t made up out of whole cloth. I understand the fixation upon controlling the numbers of middle class Westerners in the medium term, but observe in the plot above that the crappiest countries in the world have the highest fertilities! One thing that seems true is that a demographic transition results in a positive shift in the dependency ratio, so that economic growth and higher quality of life ensue. Nations which aren’t proceeding through the demographic transition don’t benefit from this dividend. Not to put too fine a point on it, they remain shitholes for their residents, and require the resources and energy of societies which actually function to prop them up. The “carbon footprint” of a Somali really isn’t a big deal. 5 million Somalis vs. 10 million Somalis makes no real difference to the planet. But it makes a huge difference to the probability of a given Somali starving or not! If you want to go Julian Simon on me I have a bet I could make with you about the relationship between Somalia’s population and its stability and per capita prosperity (at least normalized for world levels of prosperity).

The game of understanding, and shaping, human population seems to be forced into two artificial extremes. On the one hand there are absolutists for reproductive freedom who believe that to have children is a right, who also believe that food, healthcare, and housing are rights. What are rights without responsibilities? There is no honor in starving to death because your nation is too dangerous for CNN camera crews to come film large numbers of children and infants with bloated bellies, which might prompt those societies with surplus to divert it so you can live to breathe and breed another day. As for environmentalists who scold others for daring to produce more humans, often in the zeal of their proselytization they can confuse others into wondering if a world extirpated of humanity wouldn’t be their ideal. This is not the truth of the matter; after all, ZPG activists aim for stabilization in large part so that the affluence and security which we take for granted might become sustainable, extending human well-being and flourishing out indefinitely. Instead of one-size-fits-all maxims, what we need are case-by-case solutions for a complex world.

October 20, 2011

Which Hispanics identify as white?

Filed under: Data Analysis,Demographics,Hispanics,Latinos — Razib Khan @ 12:31 am


I wanted to clarify a few issues with the Census’ American Community Survey. These data come from the interval of 2006-2008, and they allowed me to query the proportion of various Latino/Hispanic groups who identified as white. I knew in the aggregate that the majority of America’s Latinos identified as white, but I was curious about two things:

1) The variation in white identification by group (by national origin)

2) The variation in white identification of Mexican Americans by selected states

Results below. There are stories in these data….

Latino/Hispanic groups by national origin (% identifying as each race)

Group               White   Black   Other race   White and black   White & Native American
300: Cuban          87.3    3.5     8.6          0.5               0.1
420: Argentinean    86.3    0.3     13.2         0.1               0.2
450: Spaniard       80.5    1.3     15.5         0.6               2.1
427: Uruguayan      79.3    0.1     20.3         0.3               0
422: Chilean        77.3    0.6     21.5         0.2               0.4
428: Venezuelan     77.1    2       19.8         0.9               0.3
423: Colombian      69.7    1.7     27.7         0.7               0.2
425: Paraguayan     68.4    0.1     31.1         0                 0.4
414: Nicaraguan     68      2       29.3         0.6               0.2
411: Costa Rican    65.4    5.7     28.1         0.5               0.2
421: Bolivian       63.4    0.5     35.3         0.3               0.4
100: Mexican        59.4    0.6     39.3         0.2               0.5
426: Peruvian       58.3    0.6     40.1         0.4               0.6
424: Ecuadorian     55.5    0.8     43.2         0.3               0.2
413: Honduran       53.8    4.2     41.1         0.6               0.2
200: Puerto Rican   53.1    6.1     39.1         1.5               0.2
412: Guatemalan     49.3    1       48.9         0.4               0.5
416: Salvadoran     48.9    0.8     49.4         0.7               0.2
415: Panamanian     41.3    27.6    28.3         2.5               0.4
460: Dominican      29.1    9.2     59.6         2                 0.1

Mexican Americans by state (% identifying as each race)

State               White   Black   Other race   White and black   White & Native American
63: Idaho           74.7    0.2     23.1         0.2               1.8
66: New Mexico      72.2    0.5     25.9         0.2               1.2
67: Utah            71.7    0.8     26.5         0.4               0.6
65: Nevada          71.2    0.6     27.5         0.5               0.3
68: Wyoming         70.2    0.5     25.6         0.3               3.4
49: Texas           68.6    0.4     30.7         0.2               0.2
62: Colorado        68.4    0.6     30           0.2               0.7
61: Arizona         67      0.4     32           0.2               0.4
72: Oregon          64      0.7     33.5         0.5               1.4
64: Montana         59.5    4.8     31.7         0.2               3.9
71: California      53.8    0.4     45.3         0.2               0.4
73: Washington      51      0.6     47.4         0.4               0.6
21: Illinois        45.2    0.5     53.8         0.2               0.3
12: New Jersey      43.1    1.3     54.5         0.6               0.4
13: New York        41.3    2.1     55.5         0.7               0.3

August 31, 2011

Don’t count old stock Anglo-America out

Filed under: Data Analysis,Demographics — Razib Khan @ 2:54 pm
One of the things I really hate are unqualified linear projections. They’re so useless most of the time. A science fiction magazine will give you more insight about the future than the United Nations population projection for the year 2100. This is just as much of an issue when it comes to American Census demographic [...]