Sepia Mutiny » Science
http://sepiamutiny.com/blog
All that flavorful brownness in one savory packet
Tue, 08 May 2012 05:38:42 +0000

Yogurt: A Gut Feeling in the Mind
http://sepiamutiny.com/blog/2011/09/19/yogurt-a-gut-feeling-in-the-mind/
Mon, 19 Sep 2011 07:08:36 +0000
Pavani

When I was younger, yogurt repulsed me. This was no small thing because my parents come from southern India, where yogurt seems to serve as a sort of digestif without which meals don’t feel complete. There was always a pot of homemade yogurt in the fridge or on the kitchen table.

Family members would marvel (and sometimes take offense) that I wasn’t finishing up my meal with yogurt, mixing it up with rice or using it to temper the spicy foods or pickles. Imagine a grandma’s Ayurvedic admonitions in place of a Robert Mitchum voiceover and a symphony of joyful slurping instead of Copland’s “Hoe-down” and you’ll have an idea of what the Yogurt, It’s What You Eat After Dinner experience was like. Some of the reasons why I was supposed to eat it:

  1. It tastes good. They felt sorry for me that I was missing out on so much tart-y goodness.
  2. It had calcium and protein, both of which I needed to grow up strong and healthy.
  3. Something to do with eating compatible “hot” and “cold” foods. I know it has less to do with food temps than other characteristics, but don’t know enough.

Perhaps you can think of more reasons to add to the pro-yogurt chorus. A recent Wall Street Journal article suggests another possibility: that probiotic bacteria, the microorganisms typically found in yogurt and dairy products and known to have benefits in the gut, may also have positive effects on the brain, reducing psychological distress and increasing confidence. Tests found that mice given Lactobacillus rhamnosus behaved more confidently, showed less anxiety, and had a more positive mood than those not given the bacteria. Read the articles at the Journal and the Economist for more information about this research.

I can’t say that I like yogurt as much as one company seems to think women would (nor can I figure out what “zen wrapped in karma” might mean), but nowadays I do like yogurt very much, in pretty much all of its forms—plain, homemade, Greek, frozen, non-fat, 2%, full-fat, etc. Not sure how or when this change in attitude happened, but surely it must have involved a taste of perugu vada.

Possibly interesting: Why Indians and Europeans Tolerate Milk  

Image: Flickr photo from http://www.flickr.com/photos/johnnystiletto/

Jatts may indeed be Scythian
http://sepiamutiny.com/blog/2011/07/13/jatts_may_indee/
Wed, 13 Jul 2011 06:59:40 +0000
Razib Khan

In the comments on this weblog over the years I’ve learned a lot of interesting things about South Asian ethnography. One component which has been notable is the sense of ethnic pride of Punjabis, and in particular Jatts. Some of this is rather standard racism against other South Asians, especially South Indians and Bengalis in relation to whom they feel aesthetically superior. But other assertions of distinction are not so charged.

One of the aspects of Jatt identity seems to be the conception that they are descended from “Scythians,” what in a South Asian context would be termed Saka. When some Jatt commenters with whom I had amicable relationships would bring this up I would gently mock them. My personal stance is that South Asians have an unhealthy obsession with presumed foreign origin, as if being South Asian is somehow shameful. This is very evident amongst Muslims for obvious reasons, insofar as Islam came to the subcontinent from West Asia. But I’ve encountered the same stance amongst Hindus, for example Kashmiri Pandits explaining their people’s Persian origins.

But whatever the demerits of the excessive overall fixation on exogenous origin, I now believe that I wrongly dismissed out of hand the idea that Jatts in particular have some Scythian origin. The reason is a series of results coming out of the Harappa Ancestry Project. To be concise, it does seem that Jatts have a small but consistent proportion of northern Eurasian ancestry which sets them apart from other Punjabis. The most parsimonious explanation to my mind is that the Sakas did indeed have a genetic impact. This does not mean that I have a high confidence in this historical model. But I was clearly in the wrong in dismissing the Scythian origin myth out of hand. For that, I apologize. Also, please note that I am not claiming here that the preponderance of Jatt ancestry is Scythian. It is not. Rather, there may have been a Scythian overlay upon a typical Punjabi substrate.

If you are curious to learn more, please see the comments at the Harappa Ancestry Project.

The Pakistani genome
http://sepiamutiny.com/blog/2011/07/01/the_pakistani_g/
Fri, 01 Jul 2011 21:49:34 +0000
Razib Khan

We’re fast approaching the point where the “first genome” of class X is going to lose its novelty. There are more than 100 people who have had their full genome sequenced, and you can’t really track down a comprehensive list anymore that I can see. Remember, a full genome sequence is a mapping of all 3 billion DNA base pairs. In contrast, what genotyping services offer are a subset, often 1 million base pairs. The 1 million are not random, rather, they are variants which are known to…vary. But there are some important issues which can be addressed only in a full genome sequence. For example, you can see which distinct mutations are unique to you, and separate you from your parents.

In any case, here’s a summary in the Dawn:

The details were revealed to the Pakistani media by Prof. Dr. M. Iqbal Choudhary, Director International Centre for Chemical and Biological Sciences (ICCBS), Karachi University and Dr. Kamran Azim of ICCBS at a press conference at PCMD.

Highlighting the importance of the project, Dr. Choudhary said Pakistan had officially entered into the world of genome mapping and the details of the work would be published soon in a research journal. He disclosed that eminent Pakistani chemist and former chairman of the Higher Education Commission (HEC), Dr. Atta-ur-Rehman was the first Muslim and Pakistani whose complete genome was mapped by Dr. Kamran Azim.

“The important work will pave the way for research on heredity diseases, evolution and the over all genetic make up of Pakistanis which now hold a unique genetic pattern as a nation. In the past many people like Dr. Watson and others urged scientists not to reveal their genome publicly but Dr. Rehman has never put any restrictions for his genome draft,” Choudhary added.

As you might guess, I laud that they’re releasing this data to the public. I do find it rather weird that the Pakistani press is reporting that the first Muslim has been sequenced. Do we talk about the first Buddhist, Christian, or Hindu being sequenced? Muslims are not a clear and distinct class in relation to population biology, but I suppose for many believers this is the biggest point at issue when considering one’s self-identity.

Speaking of populations, The Express Tribune has a rather amusing short article which speaks volumes:

“Our nation is a mix of a lot of races,” said Prof. Dr M Iqbal Choudhary, who heads the project. “Pakistanis are like a “melting pot” ie a mix of Mughals, Turks, Pashtuns, Afghans, Arabs, etcetera.”

You can actually look at a lot of Pakistani genetics thanks to the HGDP data set. There are statistically significant contributions from Africans, West Asians, and even East Asians, to Pakistanis. But on average this is a very low load, less than ~5%. Pakistanis are what you’d expect, just part of the normal range of variation of South Asians. In some of the remarks and press there is the admission that Pakistanis aren’t really genetically discontinuous with Indians, but that isn’t overly emphasized (also, aside from the Baloch, it does look like Pakistanis are discontinuous with Iranians).

To make what I’m talking about more concrete, let me show you a plot I generated a few days ago. Below are three populations, Iranians, Pakistani Pathans, and Gujarati Patels. Each bar represents an individual, and the color proportions represent ancestry mixes. I’ve labeled the colors for convenience, though they have only rough correspondence with the names I give them. You can find the full results here. The individual sequenced above is reputedly a Muhajir, so I suspect they would be somewhere between the Pathans and Patels in their proportions. Note that I also added some friends & family whose samples I have at the right edge of the bar plot.

ancestry.gif

The Diaspora and human genetics
http://sepiamutiny.com/blog/2011/07/01/the_diaspora_an/
Fri, 01 Jul 2011 06:50:56 +0000
Razib Khan

Earlier this year I expressed excitement that the 1000 Genomes, “A Deep Catalog of Human Genetic Variation,” finally was going to add some more Indian populations. There was a sample of Gujaratis from Houston, but that’s a rather narrow slice of ~1 billion Indians, and nearly ~1.4 billion South Asians. The populations which were going to be added were Kayasthas from West Bengal, Marathas from Maharashtra, and Ahom from Assam.

Unfortunately, as I commented a few days ago, that looks like it’s not happening. The Indian population collections have been removed from the website, and replaced by Sri Lankan Sinhalese and Tamils from the United Kingdom, and Bangladeshis. The Pakistani collection is already in process, as they’re getting the samples from Lahore.

This is really sad. Apparently objections from the government of India and bureaucratic impasses made it so that the Human Genome Diversity Project had to use Pakistani populations as proxies for South Asians. This is acceptable, but the Pakistani populations are on the margin of the distribution of genetic variation in South Asian populations, just like the Bangladeshi populations. This stands to reason, since they’re marginally located geographically. The Marathas in particular would have been nice, since they’re probably much more typical of South Asia. Typicality matters because South Asians have enough genetic diversity that it probably is something one should consider when controlling for population structure in medical genetics. For example, there is some data out of Britain that Bangladeshis have a higher risk for diabetes than Pakistanis, all other factors controlled. This may be due to cultural differences, or it may be due to genetics. Until you survey genetic variation within a set of populations you’ll never know which.

When I first began blogging about genetics here some commenters expressed frankly paranoid rantings about how the new genomics was going to enable a biological weapons program against India by the I.S.I. This is stupid. Pakistanis and Indians may differ, but they are rather similar, and there’s not much difference between ethnic Punjabis on either side of the border. But if you do have paranoid fantasies, don’t worry. It looks like if you want to get genetic information your best bet is to go to the non-Indian states of South Asia. By the end of the year you’ll be able to download 100 full genome sequences of Pakistani Punjabis! I suppose that’s part of some nefarious plan….

In any case, on a positive note I don’t think that the Indian establishment’s intransigence on this issue matters. There are now millions of South Asians of various ethnicities across the world. The amateur Harappa Ancestry Project has over 100 genotypes all by itself. I suspect that at some point in the near future the government of the United States or the United Kingdom could fund genomics projects which focus on ethnicities that are under-represented in public databases because of politics abroad. Full genome sequences will converge upon ~$1,000 in the next 5 years (they’re currently ~$20,000 or so per person).

Addendum: If you are unconvinced as to my confidence in the very low risk of biological weapons, download my genotype and send it to the I.S.I., explaining that I’m an anti-Muslim apostate with right-wing American political views. That’s true. If you want to goad them on, tell them I’m anti-Pakistani, and that I fantasize about building a Ram Temple in Islamabad. That’s not really true, but who knows what people will believe?

Forgotten memories of being desi
http://sepiamutiny.com/blog/2011/06/18/forgotten_memor/
Sun, 19 Jun 2011 00:26:12 +0000
Razib Khan

Noomi_Rapace.jpg

I just recently heard that The Girl with the Dragon Tattoo was being made into a film. This perplexed me because I thought there was a film adaptation of that novel! Yes, there was, but that was a Swedish production, and the new film is “made in America.” Fair enough.

What does this have to do with this weblog? The actress who plays the protagonist in the Swedish film, Noomi Rapace, had a father who was a Gitano, a Spanish Romani (the term “Roma” is really an ethnonym for the eastern Romani). In case you don’t know, the Romani language is clearly Indo-Aryan. Its closeness to Indo-Aryan dialects of the Indian subcontinent is such that the story goes that Indian sailors who were stationed in Britain overheard, and understood, much of the conversation of local British Gypsies.

The origin of this population in the Indian subcontinent is evident through multiple lines of inquiry, both in terms of culture and genetics. Most of the genetic results focus on paternal and maternal lineages, but some “genome bloggers” have obtained samples from people with Roma background, and they clearly have distinctive South Asian ancestry. Because of intermarriage obviously this is not always visibly salient. How many people are aware that Charlie Chaplin was 1/4 Romanichal?

But this post isn’t about Romani, but another group of brown folk who have forgotten about being brown. I’m talking about the Cape Coloureds of South Africa. It is well known that this population has ancestry from local Africans, whether Khoisan or Bantu, as well as a Northern European heritage shared with Afrikaners (culturally they are somewhat interchangeable with their Afrikaner “cousins” in language and religion). Often there is also an awareness that the Cape Coloureds have some Southeast Asian ancestry, because of the ubiquity of slaves and servants from this region of the world across the Dutch colonial empire (e.g., Suriname), as well as the existence of the Cape Malays.

But what about the Indian ancestors of the Cape Coloureds? This is not so well known, despite the fact that the Dutch brought many Indian servants and slaves to South Africa as well. Simon van der Stel, the first governor of the Cape Colony and for whom the city of Stellenbosch is named, had a maternal grandmother who was an enslaved Indian.

A few years ago a paper came out which quantified the extent of Indian ancestry in a set of 20 Cape Coloureds. It looks to be about ~10 percent. More recently I obtained 3 samples of Cape Coloured origin (unrelated). I “ran” them through the program ADMIXTURE. My results were in line with what the earlier team had found. I used my “Gujarati_B” reference sample, which seems to be Patels, to explore for any South Asian ancestry. I also compared the Cape Coloureds to Chinese, a set of San (Bushmen), Bantu Africans, white Americans, and Yemeni Jews. The Cape Coloureds had contributions from all the groups. The Chinese are a reasonable proxy for Southeast Asians on a continental scale. South Asian ancestry for the Cape Coloureds was clearly outside of the margin of error. The fact that it was approximately the same in all three individuals suggests that this absorption of Indian ancestry occurred early on in the ethnogenesis of the community, as there is not much intra-population variance.

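Cape Coloureds are 8.8% of South Africa’s population. Indians are 2.6%. Assuming that Cape Coloureds are ~10% Indian, the South Asian ancestry they carry amounts to about 0.9% of the national total, roughly a third as much as that of the enumerated Indian population. One can infer that around a quarter of the distinctive South Asian ancestry among South Africans is actually not within the enumerated Indian population.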
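The arithmetic behind that inference can be checked in a few lines. This is only a back-of-the-envelope sketch: the 8.8%, 2.6%, and ~10% figures come from the post, while treating the enumerated Indian population as 100% South Asian in ancestry is an assumption made here for simplicity. It prints the comparison both ways, relative to the enumerated Indian population and as a share of the combined total.

```python
# Population shares of South Africa quoted in the post (percent of total).
cape_coloured_share = 8.8
indian_share = 2.6

# Assumptions for this sketch: Cape Coloureds average ~10% South Asian
# ancestry (the post's estimate); enumerated Indians are treated as 100%.
cape_coloured_sa_fraction = 0.10
indian_sa_fraction = 1.0

# South Asian ancestry held by each group, in percent of the national total.
sa_in_cape_coloureds = cape_coloured_share * cape_coloured_sa_fraction
sa_in_indians = indian_share * indian_sa_fraction

# Two ways of expressing the comparison.
ratio_to_indians = sa_in_cape_coloureds / sa_in_indians
share_of_total = sa_in_cape_coloureds / (sa_in_cape_coloureds + sa_in_indians)

print(f"In Cape Coloureds: {sa_in_cape_coloureds:.2f}% of the national total")
print(f"Relative to enumerated Indians: {ratio_to_indians:.2f}")
print(f"Share of all South Asian ancestry: {share_of_total:.2f}")
```

Under these assumptions the Cape Coloured contribution is about a third the size of the enumerated Indian one, which works out to about a quarter of the combined total.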

This is to some extent all ancient history, though I suspect people will find it moderately interesting. But, it perhaps points us to possibilities in the global future, as identities, self-conceptions, are mixed & matched, and combinations generate novel startling configurations.

Image credit: Wikimedia Commons

South Asian genetic variation in a glance
http://sepiamutiny.com/blog/2011/06/01/a_map_of_south/
Wed, 01 Jun 2011 23:56:55 +0000
Razib Khan

Since I began blogging here in February we’ve come a long way in getting a better sense of South Asian genetic relationships. By “we,” I’m referring mostly to Zack Ajmal of the Harappa Ancestry Project, and to a lesser extent the Dodecad Ancestry Project and the Eurogenes Genetic Ancestry Project. These explicitly amateur enterprises have taken off the shelf population genetic analytic tools, such as ADMIXTURE, and combined them with a “crowd-sourced” sampling strategy. Zack now has over 100 individuals, the vast majority of them South Asian, some from ethnicities and communities which have never been analyzed in the academic literature.

The Times of India has now taken an interest in the Harappa Ancestry Project. I’m rather tickled by this. When I first began corresponding with Zack about the technical details of performing this survey of South Asian genomics neither of us knew where we were going to go. The main issue we both felt needed to be addressed was the scope of sampling. In other words, there were simply too many under-sampled populations in South Asia when it came to academic analyses of the human genetics of the region.

A quick survey of a map of some participants in HAP shows that much of north-central India remains woefully under-sampled even after six months:



ANI+4.jpg

But these are still early days. So what have we found out? I outlined some of the tentative conclusions a month ago. As time passes I am coming more and more to the conclusion that the primary connection that we South Asians have with the peoples of western Eurasia is through an affinity with the broad swath of peoples between Europe and the Middle East. The plot (ANI+4.jpg) shows the reason behind my assertion. It is a two-dimensional representation of genetic distances between putative clusters. I have given concrete examples for populations which are close substitutes of these ideal types (which were constructed by disaggregating the ancestry of real populations). I have not given a population example for “Ancestral North Indians” because no such real population exists as a good proxy.

As some of you know modern South Asians can be viewed to a large, but not exclusive, extent as a two-way combination between a West Eurasian affiliated group, “Ancestral North Indians” (ANI), and another population termed “Ancestral South Indians” (ASI). The ASI are closer to East Eurasians than West Eurasians, but nevertheless the relationship is much more distant to East Asians than that of ANI to other West Eurasian groups. In fact, ANI can be substituted for other West Eurasian groups without too much disruption on a world-wide scale when it comes to representing the relationships between human populations. In contrast, the closest living population to the ASI are the tribes of the Andaman Islands, who are tens of thousands of years distant from the ASI of the mainland. There are no pure ASI present today. The South Indian adivasi is 30-40% ANI, and 60-70% ASI. The Pathan is 80% ANI and 20% ASI. Most people in South Asia span the gamut.

But even this is too simple. More detailed analyses often tend to suggest that there have been multiple intrusions into South Asia since the original ANI-ASI admixture event, whether it be the Southwest Asian affinities of the peoples of northwest and western India, or the clear East Asian connections of Munda tribes in northeast India. I don’t want to get into those details in this post. For more interpretation I invite you to peruse some of Thorfinn’s posts at Brown Pundits (for those of you who care, Thorfinn is a pseudonym. His family is from Gujarat Punjab and Uttar Pradesh).

What I want to do is get back to the title of the post: how do I show you some raw results in a gestalt fashion? By this, I want readers of this weblog to immediately see some general relationships and be able to place their own community into a broader South Asian and international genetic context.

Here is my crack at that task. All the raw results are from Zack’s K = 11 Reference 3 run. In plain English what it means is that Zack took his huge population data set (which runs into the thousands) and told the program to evaluate all the genetic variation and allocate ancestral quanta to individuals based on the 11 informative population clusters which fell out of the data. If you want a concrete example of how this works, if you have a population of Swedes, Nigerians, and African Americans, and set K = 2, then the Swedes would all be at 100% for one cluster, the Nigerians at 100% for another cluster, while the aggregate African American population would shake out to be at 80% in the cluster which is fixed in the Nigerians and 20% in the one fixed in the Swedes. If the African American population is representative then 10% of the individuals would have greater than 50% ancestry from the cluster which is at 100% in the Swedes.
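To make the K = 2 example concrete, here is a toy sketch in Python. It does not run ADMIXTURE; it only shows how per-individual cluster proportions, once a program has produced them, get summarized into population averages. Every number and the helper `population_means` are invented for illustration, chosen so that the African American rows average 20% in the first cluster with one individual in ten above 50%.

```python
from collections import defaultdict

def population_means(rows):
    """Average the per-individual cluster proportions within each population."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for label, (cluster1, cluster2) in rows:
        sums[label][0] += cluster1
        sums[label][1] += cluster2
        counts[label] += 1
    return {
        label: (s[0] / counts[label], s[1] / counts[label])
        for label, s in sums.items()
    }

# Made-up proportions for cluster 1 (the one fixed in Swedes) among
# ten hypothetical African American individuals.
aa_cluster1 = [0.60, 0.10, 0.10, 0.15, 0.15, 0.15, 0.15, 0.20, 0.20, 0.20]

rows = (
    [("Swede", (1.0, 0.0))] * 3
    + [("Nigerian", (0.0, 1.0))] * 3
    + [("African American", (q, 1.0 - q)) for q in aa_cluster1]
)

means = population_means(rows)
majority_cluster1 = sum(q > 0.5 for q in aa_cluster1) / len(aa_cluster1)

for label, (c1, c2) in means.items():
    print(f"{label}: cluster 1 = {c1:.2f}, cluster 2 = {c2:.2f}")
print(f"African Americans above 50% in cluster 1: {majority_cluster1:.0%}")
```

The population average hides the spread between individuals, which is why the bar plots in this post show one bar per person for the project participants.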

Observe that I have not named the clusters. That’s because the clusters are statistical artifacts which fall out of the patterns of variation in the data. They map onto reality, but they are not reality. After the fact it seems reasonable to label one cluster “European” and the other cluster “African,” but always remember that these are labels useful for your interpretation, but the algorithm itself is sorting the variation across the data set. In other words when looking at complex plots at higher K’s focus on the relationship between the populations and individuals, and not absolute values of ancestral quanta and labels.

All this matters because at K = 11 Zack gave the clusters plausible labels, which usually correspond to high proportions in certain populations and regions. To make it more South Asian focused I removed the non-South Asian groups except Iranians in his reference set. The reference set consists of many individuals in each population (e.g., 10 Russians for the Russian population). Additionally, many of the population clusters which are defined for Africans, Oceanians, Amerindians, and more specific East Asian populations are not relevant for South Asians. So I amalgamated the African groups into one cluster, and all the peripheral East Eurasian, Oceanian, and Amerindian ones into another. I left the primary East Asian cluster, which defines China to Southeast Asia, disaggregated, since it is somewhat informative in South Asia (e.g., both my parents are ~10% East Asian).

With all that done the clusters which remain are:

  • S Asian, common among South Asian populations, though found at lower proportions in West Asia and Southeast Asia
  • Onge, an Andaman Island tribe
  • E Asian, the cluster which defines Han Chinese, Japanese, and Southeast Asians
  • SW Asian, a Middle East focused component, but found elsewhere in proportion to distance
  • European, peaks in northeast Europe, on the Baltic
  • African
  • East Eurasian, which is an aggregate of Oceanians, Siberians, Amerindians. A “catchall” which is really noisy in South Asians

Understand that the genetic distance between these components is not the same. The European-SW Asian distance value is rather small. The East Eurasian category throws in some rather distantly related groups, but none are that important in South Asians.
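The amalgamation step described above, folding the African clusters into one and the peripheral East Eurasian, Oceanian, and Amerindian clusters into another, amounts to summing columns. The sketch below assumes hypothetical K = 11 cluster names and made-up proportions for a single individual; neither is taken from Zack’s actual run.

```python
# Hypothetical K = 11 proportions for one individual. The cluster names and
# values are illustrative only, not the labels or results of the real run.
k11 = {
    "S Asian": 0.55, "Onge": 0.20, "E Asian": 0.10, "SW Asian": 0.05,
    "European": 0.06, "W African": 0.005, "E African": 0.005,
    "San/Pygmy": 0.0, "Siberian": 0.01, "Papuan": 0.01, "American": 0.01,
}

# Which fine-grained clusters get folded into which aggregate cluster.
MERGE_MAP = {
    "W African": "African",
    "E African": "African",
    "San/Pygmy": "African",
    "Siberian": "East Eurasian",
    "Papuan": "East Eurasian",
    "American": "East Eurasian",
}

def amalgamate(proportions, merge_map):
    """Sum proportions of clusters that share an aggregate label."""
    merged = {}
    for name, value in proportions.items():
        target = merge_map.get(name, name)  # unmapped clusters keep their name
        merged[target] = merged.get(target, 0.0) + value
    return merged

merged = amalgamate(k11, MERGE_MAP)
for name, value in merged.items():
    print(f"{name}: {value:.3f}")
```

Because the merge only adds proportions together, each individual still sums to 100% afterwards; nothing is re-estimated.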

So I generated a bar plot with these clusters with the reference populations. But Zack also has results for nearly 100 individuals of South Asian origin. I removed those of mixed background, kept Iranians as an outgroup, and combined their identities somewhat. So Bengali Brahmins are clustered with other Bengalis, but I note they are Brahmin. Many of the “Unspecified” people below don’t fall into a clear category, but it isn’t as if they’re totally generic. Myself and my parents are the three Bengalis with a lot of East Asian without specification, but I know I have Bengali Brahmin (paternal grandmother), Kayastha (maternal grandfather), and Middle Eastern ancestry (maternal grandmother) within memory (i.e., these origins are preserved orally or textually). Additionally, my paternal lineage has several individuals who carry the honorific Khan, though that’s not straightforwardly mapped onto any caste term.

All the reference populations, which are averages of many individuals of a given group, are at the top of the bar plot and in all caps. All the rest of the bars represent individuals from HAP, sorted by ethnic/regional labels, and then specific community/caste identity if possible.

If you want the raw spreadsheet, I put the open office file online.


hap2.jpg


Notes: The Gujarati population was separated tentatively because it seems as if there are two clusters. One of them is likely affiliated with the Patel community, but the other is a grab-bag of various groups. Most of these individuals are unrelated, but I am in the list along with my parents, so don’t overweight the East Asian component in Bengalis on account of that (though my parents are unrelated, and have the same quantum of East Asian).

Junk Science for Fun
http://sepiamutiny.com/blog/2011/05/17/junk_science_fo/
Tue, 17 May 2011 12:48:52 +0000
Pavani

Arvind Gupta has won national awards for his many contributions to science education in India. But when he introduces himself, he calls himself a toymaker. He’s not developing the type of toys that you would buy in a store or order online. His toys are the kind people can make using trash and other everyday materials.

Got a straw? Make a flute! A couple of foam cups or a CD? Turn them into a helicopter or a hovercraft! Gupta’s videos, which I saw posted on MeFi, quickly show how to make nifty toys from trash and simple materials. They also demonstrate scientific principles in action–a coin and an old hanger are used to show centrifugal force, for example.

When it comes to making learning fun, Gupta has been going at it for longer than the Mr. Wizard and Science Guy shows. Listen to him talk about how he got started during the revolutionary 1970s, as a young IIT engineering graduate who left his job making Tata trucks to join a village science program, inspired by the slogan of the times: “Go to the people. Live with them. Love them. Start from what they know. Build with them.”

Structure within Houston Gujaratis resolved?
http://sepiamutiny.com/blog/2011/04/29/structure_with/
Fri, 29 Apr 2011 21:24:21 +0000
Razib Khan

guj.jpg

About two and a half months ago I brought your attention to the fact that there is population substructure in the Gujaratis of Houston. That might sound strange, but here’s the back story. Over the past ~10 years or so there has been a project attempting to catalog common human genetic variation, known as the HapMap. The HapMap began with East Asian, West African, and European groups. But over the years it has been expanding. The first South Asian population added to the database consisted of people of Gujarati origin in Houston, Texas. Therefore, you had a situation where in the medical genetic literature there was a lot of talk about “Gujaratis from Houston,” as if that was a group of particular importance.

The ultimate pragmatic rationale for the catalog was to allow researchers to control for ancestry when attempting to fix upon genes implicated in disease. By illustration, if Chinese have disease X at a greater frequency than Europeans, if you had a common pool of Chinese + Europeans then all the genetic variants associated with the Chinese might come up as causal, when actually it’s just a correlation with ancestry.

guj2.jpg

And this brings me to the Houston Gujaratis. One thing that jumps out at you in analyses of genetic variation of this population set is that it has substructure. That is, there are two populations within the data set. More precisely, there is one tight cluster, while the rest of the individuals vary a great deal in their genetic character. The image above is my own plotting of the variation of Chinese and the Houston Gujaratis onto a cubic space. You immediately see that there is a Chinese cluster and a Gujarati cluster, and a range of Gujaratis who fall outside of the main cluster.
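The ancestry confounding described above, where a variant common among Chinese looks causal for disease X in a pooled Chinese + European sample, can be shown with a small worked example. The counts below are invented so that the variant has no effect at all within either population (odds ratio of 1), yet a naive pooled analysis reports a strong association. The function name and table layout are just choices for this sketch.

```python
def odds_ratio(case_carriers, control_carriers, case_noncarriers, control_noncarriers):
    """Odds of disease among variant carriers divided by odds among non-carriers."""
    return (case_carriers * control_noncarriers) / (control_carriers * case_noncarriers)

# Invented counts, ordered as:
# (carriers with disease, carriers without, non-carriers with, non-carriers without)
# "Chinese": 80% carry the variant; 20% have disease X regardless of the variant.
chinese = (160, 640, 40, 160)
# "Europeans": 20% carry the variant; 5% have disease X regardless of the variant.
european = (10, 190, 40, 760)

# Pooling ignores ancestry and simply adds the two tables together.
pooled = tuple(a + b for a, b in zip(chinese, european))

print(f"Chinese only:   OR = {odds_ratio(*chinese):.2f}")
print(f"Europeans only: OR = {odds_ratio(*european):.2f}")
print(f"Pooled:         OR = {odds_ratio(*pooled):.2f}")
```

Within each group the odds ratio is exactly 1, while the pooled table gives a value above 2. The variant is only a marker of ancestry here, which is the situation a reference catalog of population variation is meant to help researchers detect and correct for.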

Knowing what we know about the prevalence of endogamy among South Asians the immediate model which jumped out at me was that the Houston Gujarati cluster was a specific subgroup which migrated to the United States. But who? My immediate hunch was that they might be a group of Patels. Others of you suggested Bohras.

I can now report something substantive thanks to Zack Ajmal. He has some Gujarati Patels in the Harappa Ancestry Project, and they match closely with the Gujarati cluster in question. This does not exclude the possibility that the cluster consists of Bohras, and does not entail that it must be Patels. I don’t know the relationship between these various groups in Gujarat. But I think we’re getting closer to a resolution of this mystery at least.

indiaMDS_htm_31221595.jpg

Of course the Gujarati HapMap cluster is not unknown to scholars. Two years ago in the supplements of the paper Reconstructing Indian Population History the authors observed the peculiar pattern in the principal component plots, which visualize the largest independent dimensions of variation in a data set. Most of the Indian populations fell along a line which has at one pole various groups like South Indian Dalits and at the other pole Europeans. But as the authors note a section of the Gujaratis were outside of the expected pattern. Why? Here is their hypothesis:

…Interestingly, one of the GIH subgroups fall outside the main gradient of Indian groups, suggesting that they harbor substantial ancestry that is not a simple mixture of ASI and ANI. A speculative hypothesis is that some Gujarati groups descend from the founders of the “Gurjara Pratihara” empire, which is thought to have been founded by Central Asian invaders in the 7th century A.D. and to have ruled parts of northwest India from the 7-12th centuries. I. Karve noted that endogamous groups with names like “Gurjar” are now distributed throughout the northwest of the subcontinent, and hypothesized that that they likely trace their names to this invading group.

This is wrong. The reason that a subset of Gujaratis fall outside of the main cluster is that they are a very genetically homogeneous group. This is why you exclude close relatives from these analyses; the relatives will shake out into their own clusters, which is obviously not what you want to clutter up the results. All the Gujaratis who are not in the cluster run the gamut you would expect in terms of ancestry for individuals from Central West India. Those in the distinctive cluster have a particular pattern in common.

To the left is a bar plot I generated from a selection of individuals and populations from Zack’s K = 11 ADMIXTURE run. You can see the raw data in Google Docs. What K = 11 means is that Zack took all the individuals in his data set, which runs into the thousands, and allowed the program to apportion them to 11 populations. These are not real populations necessarily, but abstractions. So you shouldn’t take the labels too seriously. I’ve limited it to the population components of particular relevance for South Asians. The labels in all caps are a number of individuals from public data sets. Those which are not in all caps are individuals from the Harappa Ancestry Project. I’ve constrained the individuals and populations to be somewhat informative of my overall point. What is that point? The “Patel” Gujarati cluster is among the most “pure” of South Asian populations. The Bengali to the left is my mother, and you can see that her South Asian proportion drops mostly because of her elevated East Asian ancestry. Among the Jatts the European and Southwest Asian proportion is higher. The “Onge” component refers to an affinity with a tribe in the Andaman Islands. This, combined with the “S Asian” component, is probably a good shadow of patterns of variation which denote ancestral deep roots within the Indian subcontinent. Combining the two you see that the Gujarati cluster and individuals affiliated with it top out in excess of 90%! I think this is the outcome of the ancient admixture event between “Ancestral North Indians” and “Ancestral South Indians” which defines South Asians as a distinctive genetic unit on a worldwide canvas. All those who came later, whether it be Austro-Asiatics, Aryans, or Scythians, are overlays upon this robust common substrate.

Ironically the geneticists who decided to select the Gujaratis of Houston stumbled onto a group which is archetypically representative of what it means to be South Asian in a biological sense.

http://sepiamutiny.com/blog/2011/04/29/structure_with/feed/ 18
The genetic origin of Indians http://sepiamutiny.com/blog/2011/04/22/the_genetic_ori/ http://sepiamutiny.com/blog/2011/04/22/the_genetic_ori/#comments Fri, 22 Apr 2011 21:38:56 +0000 Razib Khan http://sepiamutiny.com?p=6503

onge2.jpg

The question of national and individual origins has a corporeal and concrete dimension, and a mythic and symbolic one. This is evident in the religious traditions which most of the world’s populations adhere to. Israel is both literally and figuratively a descent group. They issue from the tribes descended from the sons of Jacob. Those who convert into the Jewish religion customarily also convert into the Jewish nation, and so figuratively share the same descent. Similarly, among Muslims there is a particular prestige given over to the descendants of Muhammad, the Sayyids. Within Hinduism the importance of descent groups manifests generally in terms of the endogamy prevalent among South Asians, and also in specific cases, such as with gotras. The fundamental atomic basis of Confucian religious morality is arguably filial piety. Confucius’ descendants still play a prominent role in modern China promoting his ideas.

But descent also has a scientific and concrete aspect. Sometimes the mythic and scientific align. It does seem that the notional male line descendants of Genghis Khan are actually descended from one individual who flourished ~1,000 years ago. In other instances the connection is complex. Jews do seem to share common descent, but it is also evident that they have mixed greatly amongst the nations. And sometimes the inferences generated by science may warrant a reconsideration of treasured myths. Most reasonable people will probably accede to the clear overwhelming descent of South Asian Muslims from the native people of the Indian subcontinent, but the genetics clinches that. True, there is quite often a clear trace of Middle Eastern and African ancestry among the Muslims of South Asia above and beyond what may be found amongst non-Muslims, but often this component is dwarfed by a minor East Asian element which seems to warrant no cultural memory! In this post I will not address specific cases as much as a general framework. I have been talking about genetics, and to a lesser extent South Asian genetics, since 2004 on this weblog. But we know so much more now than we did then. I thought it was time for me to sit down and actually condense the current state of knowledge as best as I can. I will not address the biomedical dimension of human population genetics in this post, only the historical ones.

First, a few notes. I understand that this is a controversial and fraught topic. One major issue I have when I bring up this area of knowledge in a South Asian forum is that people accuse me of promoting models which I barely understand. What I mean is that often I have to go and look things up to figure out what people are actually accusing me of implying. I didn’t grow up in South Asia, so I don’t know the political-cultural battles too well. Please be explicit and clear in your comments, and don’t assume I can connect the dots!  Also, I’m going to apologize to some of you ahead of time for deleting your comments. I am going to track this thread and actually answer questions from interested parties, which means that I will need to shave off the noise. I won’t apologize to the people whose comments I delete because they address my comment moderation policy. Finally, I am going to use the word “Indian” from this point onward where in other cases I’d use “South Asian.” On the historical time scales that I’ll be addressing our ancestors were considered Indians (“Hindus”) by the rest of the world, and this seems a time where this clarity of terminology should trump contemporary geopolitical valences.

Why does any of this history matter? I have a hard time addressing this insofar as I have weak conditional effects based on my ancestry. By this, I mean that the details of my ancestry don’t matter much to me, except as a source of amusement or interest. I hope you don’t view me any differently if you find out that I seem to have a close genetic relationship to South Indian Dalits! (I do, probably far closer than you) You can also download a raw text file of my 23andMe v3 genotype if you want to poke around (I’ve made it public domain). But this sort of information matters for other people a great deal. I am, for example, kind of tired of listening to brown people talk about their non-Indian ancestry, whether it be Syrian Christians who claim Jewish antecedents, Jatts who claim Scythian antecedents, or Muslims who claim Arab, Turk, or Persian origins. From what I can tell reviewing the genetic data there is a grain of truth to many of these claims, but most brown people have ancestry that is overwhelmingly…brown. That’s pretty evident on our faces.

Second, I do know that finding ancestry from various groups can change how people view themselves. To give a personal example I have a friend who is a white American whose maternal grandparents were very racist against black people. After a detailed inspection of his genome it’s pretty clear that he’s ~5% African in ancestry. Some of his paternal relatives have been genotyped. This black ancestry doesn’t show up on that branch of his family tree, so by elimination it seems likely that it was his anti-black side which had black ancestry (my friend told me that as a child he thought his maternal grandfather did look a touch black, an observation triggered by their vocal racism). A story is here, which he is only beginning to explore. There is something similar in my own family. My maternal grandmother comes from a family with some distant Middle Eastern ancestry. This is obviously a point of pride. But a closer look at my mother’s genome makes two things clear: first, she does have a very small proportion of Middle Eastern ancestry. This could be noise, but it seems associated with a smaller African component, which is not uncommon among people of Muslim origin in the Indian subcontinent from what I have seen. Second, a much larger fraction of my mother’s genome exhibits clear derivation from Southeast Asia, perhaps from an Austro-Asiatic or Tibeto-Burman group. But there is no mention of this in my family’s oral history.

But enough! Brass tacks, who are we as brown folk? The map at the top of this post gets at a big part of the answer. It was generated by the blogger behind The Jatt Gene using results from the Harappa Ancestry Project. It shows the rough distribution of a genetic element associated with the peoples of the Andaman Islands, and found from Pakistan to Vietnam to Indonesia. What does it mean? The Harappa Ancestry Project has thousands of individuals from hundreds of populations, and hundreds of thousands of genetic markers per person. This data set was then run through the program ADMIXTURE, which breaks apart the ancestry of individuals contingent upon the variation you throw into the program and the number of ancestral populations you want it to generate, the latter defined by the parameter “K.” This is just software, a dumb algorithm, so it needs to be used with care. But to give a concrete example, consider that you have three populations in your data set:

  • White Americans
  • Black Americans
  • Nigerians

You tell ADMIXTURE to break apart the genomes of the individuals in your data set into at most two components. Two clusters if you will. The result in this case is going to be straightforward:

  • The White Americans will be in one cluster
  • The Nigerians in the other
  • The black Americans will be a mix, with an average admixture fraction of 80% and 20%

The program is easy to interpret in this case, as we have a history, as well as other lines of evidence, to interpret these results. One component is clearly African ancestry, and the other is European. African Americans are on average 80% African and 20% European. So ADMIXTURE nicely popped out with that result.
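
For readers who want the arithmetic behind that 80/20 split, here is a rough sketch of the kind of model a program like ADMIXTURE fits. The symbols q and f and the marker frequencies 0.9 and 0.1 are illustrative assumptions chosen for this example, not values from any real data set.

```latex
% q_{ik}: fraction of individual i's ancestry assigned to cluster k
% f_{kj}: frequency of a given allele at marker j within cluster k
% p_{ij}: expected frequency of that allele in individual i at marker j
\[
  p_{ij} = \sum_{k=1}^{K} q_{ik}\, f_{kj},
  \qquad \sum_{k=1}^{K} q_{ik} = 1 .
\]
% Toy case with K = 2: suppose (hypothetically) the allele has frequency
% 0.9 in the "African" cluster and 0.1 in the "European" cluster.
% For an individual with q = (0.8, 0.2):
\[
  p_{ij} = 0.8 \times 0.9 + 0.2 \times 0.1 = 0.74 .
\]
```

The program searches for the q and f values that best explain the observed genotypes across all markers at once, which is why the answer depends on both the populations you feed in and the K you pick.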

What does ADMIXTURE tell us about South Asians? First, it depends on what reference populations you use and how many clusters you want it to generate. I’ve addressed this detail before. But the Harappa Ancestry Project has lots of Indian populations. What you immediately see is that at higher K values a “South Asian” cluster breaks out. This cluster has the highest frequencies in southern and eastern India. It drops off as one moves west to Iran and east to Southeast Asia. Case closed?

Not quite. ADMIXTURE is a computer program. It can give strange results. It does not tell us reality, it tells us the result of an algorithm. The “South Asian” cluster exhibits some peculiarities in terms of how it relates to other groups which cannot be easily explained by history. I won’t get into the details of that, but move to the main issue: deeper analytic techniques, as well as moving up K’s, allow the “South Asian” cluster to fractionate into two dominant components. The major insight was unveiled nearly two years ago in a paper published in Nature, Reconstructing Indian population history:

India has been underrepresented in genome-wide surveys of human variation. We analyse 25 diverse groups in India to provide strong evidence for two ancient populations, genetically divergent, that are ancestral to most Indians today. One, the ‘Ancestral North Indians’ (ANI), is genetically close to Middle Easterners, Central Asians, and Europeans, whereas the other, the ‘Ancestral South Indians’ (ASI), is as distinct from ANI and East Asians as they are from each other. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39-71% in most Indian groups, and is higher in traditionally upper caste and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India. However, the indigenous Andaman Islanders are unique in being ASI-related groups without ANI ancestry. Allele frequency differences between groups in India are larger than in Europe, reflecting strong founder effects whose signatures have been maintained for thousands of years owing to endogamy. We therefore predict that there will be an excess of recessive diseases in India, which should be possible to screen and map genetically.

(ungated copy of the paper)

Using public data sets multiple bloggers have replicated the general shape of these results. The Harappa Ancestry Project has several populations from the Andaman Islands, and at K = 11 a component which is fixed in the Onge tribe correlates almost perfectly with the ANI/ASI ratios from the above paper.

Here’s the short of it: Indians are hybrids between two ancient and very distinctive groups. If you want to know more details, I posted about it on my science blog. The top line is that the ANI is very much like Middle Eastern and European populations. In fact, ANI seems no closer to the ASI than these two other groups. Who were the ASI? The Andaman Islanders are their distant cousins, separated for tens of thousands of years. But the most current genomics shows a clear submerged substrate from the Indian subcontinent into Southeast Asia. Coincidentally Southeast Asia has been strongly influenced by Indian culture. The ASI were closer to the populations of East Asia than to those of West Eurasia. Probably in part because East Asian populations are daughter groups from the modern humans who entered the Indian subcontinent from Africa tens of thousands of years ago. But the ASI are also quite distinct from East Asians. In some ways they represent a southern Eurasian population which seems to have been submerged within the last 10,000 years.

You can see shadows of their influence in this three dimensional visualization of genetic variation. Each point below is an individual projected onto a three dimensional space which is generated by the three largest components of variance within the data. The geographical clustering is pretty straightforward, but notice the “kink” in the South and Southeast Asians. That’s ASI’s shadow:

I just threw a lot out there for you to process. These results are pretty robust though. They’re based on hundreds of thousands of markers and there’s good population coverage. But their interpretation is more problematic. That’s because we don’t have records from prehistory. We are literally grappling with shadows. So let me address a few possibilities, and give my own take. All of these assertions are far less robust than what has come before because they are synthetic. They go beyond genomics, though they operate within the constraints that the new genomics imposes upon us.

  • Who were the ANI? I think they derive from a set of farming populations from between the Black Sea and the Caspian. The reason I think this is that there are suggestive associations between populations around the Caucasus and Indian groups, even more than with Iranians! This sort of “geographic leapfrog” requires a macrohistorical explanation.

  • Were the ANI Aryans? I don’t think so. The admixture event with ASI is very old. Likely within the last 10,000 years, but probably older than 4,000 years (I know this from personal communication with one of the researchers who attempted linkage disequilibrium decay based time-from-admixture tests). Some of the Caucasian groups which have an affinity with Indians are not Indo-European speaking.

  • So why did ANI arrive in India? I think it has to do with farming. Recent evidence is now pointing to massive reconfigurations of genetic variation across the world in the past 10,000 years. We have semi-historical evidence for nearly total replacement in Japan and Africa. But there is now a great deal of circumstantial evidence that the same occurred in Europe, at least once, and probably more than once. The ANI were one of the great farming Diasporas to pulse out of the Near East.

  • But why didn’t they replace ASI? I am not an archaeologist, so I am on weak ground here insofar as I’m relying heavily on others who know this stuff. But I suspect that the indigenous populations of the Indian subcontinent themselves had started an independent transition to farming. The ANI-ASI synthesis, both genetic and cultural, was that of two incipient farming toolkits. In contrast the relatives of the ASI in Southeast Asia did not enter into an independent phase of farming, and were marginalized to a far greater extent by populations from southern China (the exceptions being the Papuans). The Andaman Islanders then are exceptions, and not representative in their hunter-gatherer lifestyle.

  • What about the Aryans? The data from Europe is far thicker than from the Indian subcontinent, and there we have evidence for multiple movements and cultural influences. I believe that the Indo-Aryans arrived later, and are a minor overlay upon the ANI-ASI synthesis (South Indian tribals have 30-40% ANI, indicating how old and thoroughgoing the synthesis was). Some speculative suggestions can be made from the genetic data in regards to a post-ANI West Eurasian influence which does not seem Middle Eastern. I will leave that for now because we just don’t have much to go on, though I do suggest that one keep track of The Jatt Gene. I think the answers we’ve long been waiting for will be coming soon, especially with the imminent release of Indian populations from the 1000 Genomes.

  • The northwest-southeast axis is the dominant genetic story of India, but not the only one. There is a northeast-southwest axis. It seems probable that the Munda are relative newcomers as well. Though mostly Indian, there is an element of ancestry in these populations which suggests relatively recent affinities with East Asians. This is probably at least part of my personal story, so I take an interest in this “third wheel” component of our heritage.

  • South Indian Brahmins claim northern Indo-Aryan origins. The genetics certainly bear this out, albeit with some probable admixture with the local substrate. There are many specific questions which can be asked and answered. The Cochin and Bene Israel Jews of the west coast of India clearly do have highly elevated Middle Eastern components of ancestry, though they are highly admixed with the native populations. My own question: do the Nasrani Christians truly descend from Jews? I would have dismissed this outright a few months ago, but I am not so sure now. The western coast of India seems to have long-standing connections to southern Arabia, so we need to flesh out these patterns in more detail.

What’s the biggest surprise from these results? For me I think it is the deep and incredibly thorough biological synthesis which characterizes the Indian subcontinent. We all know that there is a big difference between a Kashmiri Pandit and an Adivasi from South India. But about one third of the Pandit’s ancestry is “Ancestral South Indian,” which is almost absent outside of the subcontinent. And about one third of the Adivasi’s ancestry is “Ancestral North Indian,” which connects this individual with the populations which span the Atlantic, to the Urals, to the Sahara. The past is a strange and mysterious land. But the veil of ignorance is slowly lifting….

Note: Some might wonder why I didn’t address uniparental lineages. The post is long, that’s why. The short of it is that ASI seems to have a much stronger impact on maternal lineages, while ANI is more dominant in paternal ones. Additionally, among the Munda the East Asian element is far more frequent on the paternal lineages than the maternal ones. This indicates a consistent trend of deep time events of sex-biased migration.

http://sepiamutiny.com/blog/2011/04/22/the_genetic_ori/feed/ 89
A brown twist on personal genomics http://sepiamutiny.com/blog/2011/04/12/a_brown_twist_o/ http://sepiamutiny.com/blog/2011/04/12/a_brown_twist_o/#comments Tue, 12 Apr 2011 21:09:04 +0000 Razib Khan http://sepiamutiny.com?p=6483 I know that many people took advantage of the 23andMe sale I highlighted on Sunday. I also know that a fair number of these were brown, as I also have a list of people who I emailed, and several South Asians confirmed that they’d purchased the 23andMe kit. What do you get if you purchase this kit? Basically 1 million markers, SNPs, which are simply population-wide variant positions within your genome. These markers were chosen because variation is often informative, in terms of traits, as well as ancestry.

But obviously you are not going to just be looking at a string of letters. The data has to be analyzed for you. 23andMe provides a range of tools in this domain. But, one needs to use them cautiously, and also understand their limitations. In particular, these tools were often tuned for a specific set of populations which does not include South Asians. So some of the results are going to strike you as strange.

First, let’s hit the easy stuff. Health and traits.

traits1.png

When you first go into your account you’ll see a list of options on the left. To the left is a screenshot of my own account. I’ll be using it as an example from this point on. The services basically fall into two categories: health and ancestry. Health itself is broken down into the disease risk, carrier status, drug response, traits, and health labs. The two that are generally of any interest in my experience are disease risk and traits. Carrier status is something that is important for potential parents, but you’ll probably already get screened. Again, with drug response you should already know this because of your medical history. Finally, health labs is experimental stuff which I haven’t found of interest (does it matter how much of your weight is due to your genetic risk?).

Let’s start out with disease risk. I’m 90% confident that most of you will not find anything particularly shocking, novel, or actionable. By this, I mean that you’ll probably find that you’re at a high risk for diseases which already run in your family, or that you are at a marginally higher risk for a disease that doesn’t run in your family. There are exceptions. That’s why I’m pegging it at 90%: a minority of people seem to find something genuinely novel, and unfortunately not in a good way.

If you do fall into the category of finding out some risk which you hadn’t expected, don’t freak out. Please make sure to read up on how these estimates are calculated. These are odds, and lack of family history is very important information.

As I noted above these tools are fine-tuned to particular populations. Most of the disease risk estimates are explicitly for Europeans. That’s because studies are generally done on Europeans. There are simply very few studies done on South Asians as a relative proportion. What this may mean is that if you have a risk due to a particular genetic variant, that risk may only hold for Europeans. The main caveat I would offer is that I am becoming convinced that this is less of an issue than had previously been thought. In other words, disease risk assessed in one population may be robustly inferred to another, more often than not.

For what it’s worth, 23andMe tells me that I am at typical risk for type 2 diabetes. This is incorrect. I have family information, and my risk is higher than typical.

Most of the traits which you have a genetic predisposition for will not be surprising to you. I do not have an alcohol flush reaction. My earwax is wet. My eyes are brown. But, some of the traits are of interest. I am a PTC non-taster. On the last one I knew this, but that’s because this is a basic high school genetics test. I knew I was a non-taster. A large minority of humans are, and from what I have seen non-taster status may be a majority among South Asians. It is a recessive trait, which means that if you have one functional copy you are a taster. Some scholars have suggested that people with two functional copies are “super-tasters.” What does this mean practically? Generally non-tasters have reduced responses to bitters, and often higher satiety thresholds to fat. Non-tasters often like vegetables. If you are a potential parent this genotype is probably important, as you can estimate the range of outcomes in your offspring.
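
As a rough illustration of that last point, here is the textbook arithmetic for a single-gene recessive trait. It assumes the simple model described above (one gene, with T the functional copy and t the non-functional one); the letters are labels chosen for this example, and real taste perception is messier than this.

```latex
% Each parent passes on one of their two copies with probability 1/2.
% A child is a non-taster only if it receives t from both parents.
\[
\begin{aligned}
  Tt \times Tt &: \quad P(tt) = \tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{4} \\
  Tt \times tt &: \quad P(tt) = \tfrac{1}{2} \times 1 = \tfrac{1}{2} \\
  tt \times tt &: \quad P(tt) = 1 \times 1 = 1 \\
  TT \times \text{anything} &: \quad P(tt) = 0
\end{aligned}
\]
```

So two non-taster parents would be expected to have only non-taster children under this model, while two tasters can still have a non-taster child if both carry one t copy.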

Some of the traits which you wouldn’t know of off the top of your head hopefully will never matter. I do not have resistance to HIV progression, for example. This is a trait which is most common among Europeans, so this makes sense. I have friends of European ancestry who have some resistance. I do hope it does not change their behavior too much!

Moving on to ancestry, there are far fewer caveats. Disease risk assessment is not quite “prime time,” but ancestry inference is. That’s because a million markers is a whole lot just to estimate ancestry, which requires a representative snapshot of your genome. But for South Asians 23andMe’s tools leave something to be desired.

Below is my “ancestry painting.” I am apparently 60% European and 40% Asian. You see a chromosome by chromosome color-coding of my European and Asian ancestry.

painting.jpg

Does this make sense? Yes and no. 23andMe uses three “reference” populations. Whites from Utah, Yoruba from Nigeria, and Han Chinese from Beijing. What their algorithm does is take your genetic variation, and see how it relates to these three populations, and construct a set of affinities genomic region by genomic region. This works well if your mother is black American and your father is white European. You will be given plausible results. It does not work so well when you have populations which are very different from the reference groups. For example, Somalis or South Asians. In the constraints of the program the results make sense. In the real world you are scratching your head.

For example, South Asians tend to be a mix of European and Asian according to this program (with many Pakistanis showing trace levels of African). The interval is 10 percent to 40 percent, with the vast majority of people in the 10 to 35 range, with an average around 25. The fact that I was at 40 percent was notable. Generally Punjabis are at the lower Asian fraction, while South Indians and Bengalis are at the higher Asian fractions. But my own results were the highest of anyone I could find. I strongly suspected that this indicated recent Asian admixture on top of the range you see in 23andMe’s algorithm for people of South Asian descent.

pca.png

The PCA to the left also suggests that. What you see is a two dimensional visualization of the genetic variation of Central and South Asian populations. I’m the green individual. As you can see, I’m on the edge of the main South Asian cluster, toward the Hazara and Uyghur. These are two Central Asian populations with clear East Asian admixture. Aside from my family the other individuals in this area who are from the Indian subcontinent tend to be Bengalis, indicating the genetic affinities of this group with East Asians to a greater extent than among other Indians.

Like the ancestry painting one has to be careful of the PCA. Some of the individuals who are close in position to me have an East Asian parent and a European parent. These Eurasians are near some of the South Asians because their genetic combination places them in the same area when you visualize them on a two dimensional axis. But obviously there is a huge difference between Eurasians and South Asians. These tools of analysis and visualization can’t just be taken literally, they need to be understood in their proper light.

There’s much more on the 23andMe site, but if you are a real nerd the best thing they give you is your raw data. That’s how I confirmed that my elevated Asian ancestry was probably due to relatively recent East Asian admixture. If you are not comfortable in the Linux environment, and want more detailed breakdowns, I’d suggest submitting to the Harappa Ancestry Project. It has a very large and robust South Asian data set to compare you against. But, if you want to get your own hands dirty, read on.

First, you need to get ADMIXTURE. My tutorial will be helpful, but I’ve decided to make it even easier. I’ve created a file with French, Gujarati, Pathan, Chinese, and Yoruba individuals. The Gujaratis have been screened for a high fraction of South Asian ancestry. I’ve zipped up the file here. It also has a “master list” of the individuals and their population assignments. What you probably want to do is:

1) get your own data

2) merge it with the file that I created for you

3) run ADMIXTURE, and see how you stack up

After you get your data, convert it to pedigree format. Here’s a script to do so in Perl. You can do this:

perl convert.pl "YourFileName" "001" "001"
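
If you are curious what a conversion like this involves, here is a minimal sketch in shell and awk. It is not the Perl script linked above, and the function name raw_to_ped and its argument order are made up for illustration. It assumes the raw file is tab-separated with four columns (rsid, chromosome, position, genotype) and comment lines starting with #. To stay short it keeps only the numbered chromosomes and treats anything that is not a two-letter call as missing.

```shell
# Hypothetical helper: raw_to_ped RAWFILE OUTPREFIX FAMILY_ID INDIVIDUAL_ID
# Writes OUTPREFIX.map and OUTPREFIX.ped in Plink's text pedigree format.
raw_to_ped() {
  raw="$1"; out="$2"; fid="$3"; iid="$4"

  # .map: chromosome, SNP id, genetic distance (0 = unknown), base-pair position
  awk -F'\t' '
    { sub(/\r$/, "") }                       # tolerate Windows line endings
    !/^#/ && $2 ~ /^[0-9]+$/ { print $2, $1, 0, $3 }
  ' "$raw" > "$out.map"

  # .ped: family id, individual id, father, mother, sex (0 = unknown),
  # phenotype (-9 = missing), then two alleles per SNP in .map order
  awk -F'\t' -v fid="$fid" -v iid="$iid" '
    BEGIN { printf "%s %s 0 0 0 -9", fid, iid }
    { sub(/\r$/, "") }
    !/^#/ && $2 ~ /^[0-9]+$/ {
      g = $4
      if (g == "--" || length(g) != 2) g = "00"   # 0 is the missing allele code
      printf " %s %s", substr(g, 1, 1), substr(g, 2, 1)
    }
    END { printf "\n" }
  ' "$raw" > "$out.ped"
}
```

You would call it as something like raw_to_ped genome.txt YourFileName 001 001, where genome.txt stands in for whatever your downloaded raw file is called.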

Download Plink. Make your pedigree file binary:

./plink --file YourFileName --make-bed --out YourFileName

Now you want to merge it with the pedigree file I gave you:

./plink --bfile YourFileName --bmerge Brown.bed  Brown.bim  Brown.fam --make-bed --out Brown

You probably want to filter the SNPs:

./plink --bfile Brown --geno 0.01 --make-bed --out Brown
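
The --geno 0.01 option tells Plink to drop any SNP whose genotype is missing in more than 1% of the individuals, which matters here because your own markers will not overlap perfectly with the reference file. If you want to see how many SNPs survive, a small sketch like the following works, since a .bim file has one line per SNP. The function name count_snps is made up for this example.

```shell
# Hypothetical helper: count_snps PREFIX
# Prints the number of SNPs listed in PREFIX.bim (one line per marker).
count_snps() {
  wc -l < "$1.bim" | tr -d ' '
}
```

Run count_snps Brown once before the filtering step and once after, and compare the two numbers.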

OK, so now you’re added to the file. You want to run ADMIXTURE:

./admixture Brown.bed 4

This will generate four ancestral populations, which is what you want.
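
When the run finishes, ADMIXTURE writes the ancestry fractions to a file named after the input and the K value, here Brown.4.Q, with one row per individual in the same order as Brown.fam. The rows carry no names, so a small sketch like this one can put the individual IDs back on. The function name label_q is made up for this example.

```shell
# Hypothetical helper: label_q FAMFILE QFILE
# Prints each individual's ID (column 2 of the .fam file) followed by
# that individual's ancestry fractions from the matching row of the .Q file.
label_q() {
  awk 'NR == FNR { id[FNR] = $2; next } { print id[FNR], $0 }' "$1" "$2"
}
```

For example, label_q Brown.fam Brown.4.Q | grep 001 would show the four fractions for the individual ID used in the conversion step above. Keep in mind that the column order of the clusters is arbitrary, so you work out which column is which by looking at where the French, Chinese, and Yoruba individuals land.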

I gave you a file which will be somewhat informative if you are brown. I ran the above file for myself, and you can see me right between the second cluster of Chinese and Gujaratis. Note the East Asian slice.

admixture.png

http://sepiamutiny.com/blog/2011/04/12/a_brown_twist_o/feed/ 4