Razib Khan One-stop-shopping for all of my content

March 14, 2011

Analyzing ancestry with ADMIXTURE, step by step

Over the past few months I have been hoping that more people would start doing what Zack Ajmal, Dienekes, and David have been doing. There are public data sets and open source software, so anyone with a nerdy inclination can explore their own questions out of curiosity. That way you can see the power and the limitations of genomics on your own desktop. I wonder if one of the biggest reasons that more people haven’t started doing this is formatting. It can be a pain to convert matrix-formatted files into pedigree format, for example. But the data gusher isn’t ending. Look at what’s coming out (and has come out) of the 1000 Genomes project!
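To give a sense of what that formatting chore involves, here is a minimal sketch in Python. It assumes a genotype matrix with one row per SNP and one two-letter genotype call per sample (for example "AG", with "--" for a missing call), and it writes out PLINK-style pedigree (PED) and MAP lines. The function name, the input layout, and the placeholder family ID are all hypothetical; real matrix files vary, so treat this as an illustration of the transposition rather than a converter for any particular data set.

```python
def matrix_to_ped(snp_rows, sample_ids, family_id="FAM"):
    """Convert a SNP-by-sample genotype matrix into PED and MAP lines.

    snp_rows: list of (snp_id, chromosome, bp_position, genotypes), where
              genotypes holds one two-letter call per sample ("AG", "--", ...).
    sample_ids: list of sample names, in the same order as the genotype calls.
    family_id: placeholder family ID (an assumption; real data may differ).
    """
    map_lines = []
    alleles_per_sample = [[] for _ in sample_ids]

    for snp_id, chrom, pos, genotypes in snp_rows:
        if len(genotypes) != len(sample_ids):
            raise ValueError(f"{snp_id}: expected {len(sample_ids)} calls")
        # MAP columns: chromosome, SNP ID, genetic distance (0 = unknown), bp position
        map_lines.append(f"{chrom}\t{snp_id}\t0\t{pos}")
        for i, call in enumerate(genotypes):
            if len(call) == 2 and "-" not in call:
                a1, a2 = call[0], call[1]
            else:
                a1, a2 = "0", "0"  # PED convention for a missing genotype
            alleles_per_sample[i].extend([a1, a2])

    ped_lines = []
    for sample, alleles in zip(sample_ids, alleles_per_sample):
        # PED columns: family, individual, father, mother, sex (0 = unknown),
        # phenotype (-9 = missing), then two alleles per SNP
        ped_lines.append(" ".join([family_id, sample, "0", "0", "0", "-9"] + alleles))

    return ped_lines, map_lines


if __name__ == "__main__":
    rows = [
        ("rs1", "1", 1000, ["AG", "GG"]),
        ("rs2", "2", 2000, ["--", "CT"]),
    ]
    ped, snp_map = matrix_to_ped(rows, ["S1", "S2"])
    print("\n".join(ped))
    print("\n".join(snp_map))
```

The whole job is a transpose: the matrix has SNPs as rows, while the pedigree file wants one line per individual with every SNP's two alleles strung out after six leading columns.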

I’ve been thinking I need to write up a post which is a “soft landing” for people, so that we can reduce the “activation energy” for this sort of thing…once you get hooked, you only go deeper. Luckily an anonymous tipster has sent me a link to a huge data set which has already been merged and pedigree formatted. Here are the populations:

!Kung Buryats Hausa Mada Punjabi Arain Totonac Adygei Cambodian Hazara Makrani Pygmy Tu African Americans Chinese Hema Malayan Romanians Tujia Algeria Chinese Americans Hezhen Mandenka Russian Tunisia Altaians Chukchis Hungarians Maya Sahara Occ Turks Alur Chuvashs Iban Mbuti Sakilli Tuscans Ap Brahmin Cochin Jews Igbo Melanesian Samaritians Tuvinians Ap Madiga Colombian Iranian Jews Mexicans Samoan Urkarah Ap Mala Cypriots Iranians Miao San Utahn Whites Armenians Dai Iraq Jews Mongola San Nb Uygur Armenians B Daur Irula Mongolians Sandawe Uzbekistan Jews Ashkenazy Jews Dogon Italian Moroccans Sardinian Uzbeks Azerbaijan Jews Dolgans Japanese Morocco Jews Saudis Vietnamese Balochi Druze Jordanians Morocco N Selkups Greenlanders Bambaran Greenlanders Kaba Morocco ...
