Why I don’t accept the para-Munda hypothesis

There has been a discussion of Michael Witzel’s ideas in the comments below. Long familiar with his thesis that a Munda-like language was dominant in the northern Indus valley and in the Gangetic plain, I have also been long skeptical of it.

The reason for me is simple: I have leaned to the position that the Munda are intrusive from Southeast Asia. Over the past 10 years my confidence in this proposition has grown. Let’s review:

1) They speak an Austro-Asiatic language. Most Austro-Asiatic languages are in Southeast Asia and seem to have spread from the north to the south

2) The Munda have genetic signatures on the Y chromosome, and in some of their traits, which are distinctive to East Asians and totally unrelated to those of any other South Asians. These genetic signatures are not found in South Asia outside of the Munda areas and northeast India (i.e., they are not present in the Indus or Gangetic plains).

3) The most common Y chromosome of the Munda seems to be from Southeast Asia. That is, Southeast Asian lineages are basal and more diverse than the ones in India.

4) Genetic data from ancient DNA indicate that Austro-Asiatic people did not arrive in northern Vietnam until 4,000 years ago. To me, this implies they arrived in India well after 4,000 years ago.

5) We now suspect that Indo-Aryans arrived well after 4,000 years ago to the Indus valley. The Munda and Indo-Aryans could not have met in that region 3,500 years ago in any reasonable scenario.

Let’s assume that Witzel and others are correct that the early Indo-Aryans and the languages/toponyms of the Gangetic plains do not show Dravidian influence. How could that be? It could be that in the northern Indus valley a non-Dravidian language was dominant. Consider Burushaski, the language of the Burusho, which is a linguistic isolate. There is also precedent for a single civilization being split between unrelated languages: Mesopotamia was long divided between a Semitic north and a Sumerian south.

Second, the genetic data seem to suggest that some Indo-Aryan groups have more AASI and more steppe than groups to their west. North Indian Brahmins vs. Sindhis are an example. To me, this is indicative of the possibility that the Indo-Aryans pushed past areas where Dravidian languages were dominant and into areas where only AASI hunter-gatherers were flourishing. The lack of a Dravidian substrate is because the AASI groups the Indo-Aryans encountered were not Dravidian speakers.


Rakhigarhi sneak peeks

Over at my other weblog, I note that the Indian press is finally starting to simply report the substantive contents of the Rakhigarhi results. As we all know, the media can distort and misrepresent, so we need to be cautious and wait on the final paper, mostly because with that the authors can speak freely and without intermediation. But I have heard the general results through the grapevine, and they are exactly what Outlook India is currently reporting.

The Rakhigarhi samples themselves aren’t that interesting to me. But Niraj Rai seems to be pushing the admixture event with Indo-Aryans to after 1500 BC. This could be a misquote, or it could be that the researchers from various groups now have enough data to fine-tune their parameters so as to narrow down the timing of various admixture events.

Ancient Ancestral South “Indians” may have roots in Southeast Asia

At the Society for Molecular Biology and Evolution conference in Japan there is a presentation which reports evidence for gene flow from Pleistocene Southeast Asians into South Asia. I have long suggested this was possible for several reasons.

During the Last Glacial Maximum ~20,000 years ago Southeast Asia would have been a relatively protected and well-watered region in comparison to South Asia. My understanding is that moist savanna has higher population densities of hunter-gatherers than dry scrubland. Southeast Asia would have had a great deal of the former, and almost none of the latter (the LGM was drier, and the rainforest zone in Southeast Asia would have been smaller, and Sundaland was probably mostly savanna). The Thar desert zone would have been much more expansive, pushing south and east. The summer monsoons were far weaker.

All this indicates Southeast Asia would have had larger populations than South Asia during this period. And large populations tend to impact smaller populations genetically.

Additionally, looking closely at haplogroup M, which is highly diverse in South Asia, some of its branches look to be intrusive and related to branches in Southeast Asia. Though I do believe some of the M branches in South Asia are very old and probably native, others may have been brought by Southeast Asian people related to the Hoabinhian culture (which was mostly absorbed by rice farmers from the north during the Holocene).

During the Pleistocene, Southeast Asia and Southern Asia were probably part of the same biogeographic zone, just as they are today. The ancestors and relatives of the Negrito peoples of Southeast Asia probably displayed a continuity from South Asia down toward Oceania. The preponderant gene flow at some points from the east to the west was probably just a function of population size and climate.

Today the genetic differences on the border between South and Southeast Asia are striking. Though Pathans and Punjabis are quite different, they are far closer genetically than Bengalis and Burmese (notably, linguistically the chasm is also far greater). I think that has partly to do with agriculture and sedentarism. The mountainous zones in northeast India and western Burma are far harder for farmers to traverse than for small groups of hunter-gatherers.

Bangladeshis are very East Asian, Sri Lankan Tamils are not quite as structured


A very long post at my other weblog where I reiterate how East Asian Bengalis, and in particular East Bengalis, are. Aside from the existence of a Dalit/scheduled caste subcommunity, very little has surprised me about Bangladeshi genetics in the last 5 years or so. Rather than a novelty, some simple truths seem to be reinforced over and over. Two major takeaways:

1) the only “exotic” aspect of Bengali ancestry is that Bengalis are substantially East Asian (with the exception that this is sharply attenuated in Brahmins).

2) Though there is some evidence of West Asian admixture in a few Bengali Muslims, you have to look really closely to see it. Though I can believe, and do believe, that many Bengali Muslims have a genealogical connection to Iran and Turan through a distinct paternal lineage, that connection has left a minimal genetic impact.

But one thing I did not emphasize in the post: looking closely at the 1000 Genomes Sri Lankan Tamil samples from the UK, I think it is clear that they are less structured than an Indian sample would be. The proportion of Dalits is far lower than in the Indian Telugu sample obtained from the UK. So I will have to update my assertion that the Sri Lankan Tamil sample is as structured as Indians. It isn’t. This is in contrast to the Lahore Punjabi samples, which are highly structured, more so than the Sri Lankan Tamils.

Bhadralok are made not born

Tanushree Dutta is a Bengali Kayastha

I have two samples of full ancestry from West Bengal. A Kayastha and a Brahmin. You can see where they plot.

Bengali Brahmins are very similar to North Indian Brahmins (often they have some “eastward” shift). In contrast, the Kayastha individual looks like the Bangladeshi samples, except with far less East Asian ancestry.

I do want more samples. I’ve gotten a few Bengali Brahmins, and they exhibit the same pattern as above, but I am curious about non-Brahmin West Bengalis. From the above, though, I think I will conclude that the hypothesis that Kayasthas are a cultivator caste which uplifted themselves occupationally is probably the right one.

Genetic variation in South Asia

I don’t have too much time right now. So a quick data post. The map above shows India’s scale in relation to Europe.

Below is an NJ tree that shows pairwise Fst values (genetic distance):

Please notice the small genetic difference between Britain/Spain/Poland. Compare to Gujarati vs. Sindhi, let alone Gujarati vs. Telugu.
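For readers who want to see what a pairwise Fst value measures, here is a minimal sketch. The post does not say which estimator or software produced the tree, so this is an assumption: it uses Hudson's estimator in its ratio-of-averages form, and the function name, allele frequencies, and sample sizes are all made up for illustration.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst between two populations (ratio of averages over SNPs).

    p1, p2: per-SNP frequencies of the same allele in each population.
    n1, n2: number of sampled chromosomes in each population (must be > 1).
    This is an illustrative sketch, not the method used for the tree above.
    """
    if len(p1) != len(p2) or not p1:
        raise ValueError("p1 and p2 must be non-empty and the same length")
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)
    return numerator / denominator


# Hypothetical allele frequencies at three SNPs (not real data).
pop_a = [0.50, 0.30, 0.80]
pop_b = [0.52, 0.28, 0.79]  # frequencies close to pop_a
pop_c = [0.20, 0.60, 0.50]  # frequencies far from pop_a

fst_close = hudson_fst(pop_a, pop_b, 200, 200)
fst_far = hudson_fst(pop_a, pop_c, 200, 200)
```

With these toy inputs the pair with similar frequencies gets a value near zero (it can dip slightly below zero because of the sample-size correction), while the dissimilar pair gets a clearly larger one. That is the sense in which Britain/Spain/Poland sit close together on the tree and the South Asian groups do not.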

Now, PCA:

Genetically Sindhis occupy a place between South Indians and Iranians. Some Gujaratis are nearly where Sindhis are, but many are far more shifted toward South Indians. The Fst display masks this since it aggregates populations.

Treemix shows the relationships and their scale. South Asians have a lot of drift between them.

Some of you are probably bored by this post and wonder about its practical implications. If so, keep on paging down (or up).

Genetical observations on caste

One of the more interesting and definite aspects of David Reich’s Who We Are and How We Got Here is on caste. In short, it looks like most Indian jatis have been genetically endogamous for ~2,000 years, and, varna groups exhibit some consistent genetic differences.

This is relevant because it makes the social constructionist view rather untenable. The genetic distinctiveness of jati groups is very hard to deny; it jumps out of the data. The assertions about varna are fuzzier. But, on the whole, Brahmins across South Asia have the most ancestry from ancient “steppe” groups, while Dalits across South Asia have the least. Kshatriyas are closer to Brahmins. Vaishyas have lower fractions of “steppe”. And so on. These varna generalizations aren’t as clear and distinct as jati endogamy. Sudras from Punjab may have as much or more “steppe” than South Indian Brahmins. But the coarse patterns are striking.

As a geneticist, and as an irreligious atheist, a lot of the conversations about “caste” are irrelevant to me. They’re semantical.

You can tell me that true Hinduism doesn’t have caste, that it was “invented” by Westerners. They may not have had caste, but the genetical data is clear that South Asians were endogamous for 2,000 years to an extreme degree. Additionally, the classical caste hierarchy seems to correlate with particular ancestry fractions.

Second, you can say Islam, Sikhism, Jainism, and Buddhism don’t have caste. That they picked it up from Hinduism. Or Indian culture. That’s true. But I think Islam, Sikhism, Jainism, and Buddhism are all made up, just like Hinduism. I don’t care if made up ideologies don’t have caste in their made up religious system. I am curious about the revealed patterns genetically.

I have a pretty big data set of South Asians. Some of them are from the 1000 Genomes. Here is where the 1000 Genomes South Asians were collected:

Gujarati Indians from Houston, Texas
Punjabi from Lahore, Pakistan
Bengali from Dhaka, Bangladesh
Sri Lankan Tamil from the UK
Indian Telugu from the UK

Some of the groups showed a lot of genetic variation, so I split them based on how much “Ancestral North Indian” (ANI) they had. So Gujurati_ANI_1 has more ANI than Gujurati_ANI_2 and so forth.
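The post does not say exactly how the split was made, so the following is only a sketch of one plausible way to do it: rank the individuals in a population by their estimated ANI fraction and cut the ranking into equal-sized subgroups, numbered from most to least ANI. The function name, sample IDs, and ANI values are hypothetical.

```python
def split_by_ani(label, ani_by_sample, n_groups):
    """Split one population into subgroups ranked by ANI fraction.

    label: population label used as a prefix, e.g. "Gujurati".
    ani_by_sample: dict mapping sample ID -> estimated ANI fraction.
    n_groups: number of subgroups; group 1 has the most ANI.
    Assumption: equal-sized bins on the ranking, which may differ from
    how the groups in the post were actually defined.
    """
    if not 1 <= n_groups <= len(ani_by_sample):
        raise ValueError("n_groups must be between 1 and the number of samples")
    # Highest ANI first, so that group 1 has more ANI than group 2.
    ranked = sorted(ani_by_sample, key=ani_by_sample.get, reverse=True)
    groups = {}
    for rank, sample in enumerate(ranked):
        index = rank * n_groups // len(ranked) + 1
        groups.setdefault(f"{label}_ANI_{index}", []).append(sample)
    return groups


# Hypothetical ANI estimates for four individuals (not real data).
toy_ani = {"s1": 0.70, "s2": 0.40, "s3": 0.60, "s4": 0.30}
toy_groups = split_by_ani("Gujurati", toy_ani, 2)
```

A fixed threshold on the ANI fraction, or clustering on the PCA, would be equally reasonable ways to make the cut. Ranking is used here only because it needs no extra parameters.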


Intellectual Dark Web

I would define the “intellectual dark web” as the confluence and convergence of leaders from classical European enlightenment, hard sciences, technology (including neuroscience, bio-engineering, genetics, artificial intelligence), and eastern philosophy streams. Among the intellectual dark web’s many members are Dr. Richard Haier, Jordan Peterson, Jonathan Haidt, Ben Shapiro, the Weinstein brothers, Sam Harris, Glenn Loury, John McWhorter, Yuval Noah Harari, Thomas Friedman, Maajid Nawaz, Neil deGrasse Tyson, Michio Kaku, Dr. VS Ramachandran, Steven Pinker, Armin Navabi, Ali Rizvi, Farhan Qureshi, Peter Beinart, Gad Saad, Nassim Nicholas Taleb, Dave Rubin, Joe Rogan, and Russell Brand. If Steve Jobs were still alive, I would include him among them. They defy easy labels and are high on openness. I hesitate to label others without their permission, but our very own Razib Khan strikes me as a potential leader of the “intellectual dark web”; although I will withdraw this nomination if he wishes. 😉

Some see the intellectual dark web as the primary global resistance to postmodernism. I don’t agree. Rather, I see them as leaders in ideation and intuition who think differently:


Closing the genetic chapter

Indus Valley People Did Not Have Genetic Contribution From The Steppes: Head Of Ancient DNA Lab Testing Rakhigarhi Samples:

In other words, the preprint observes that the migration from the steppes to South Asia was the source of the Indo-European languages in the subcontinent. Commenting on this, Rai said, “any model of migration of Indo-Europeans from South Asia simply cannot fit the data that is now available.”

Some more comments at my other weblog.

At this point, we need to move to other things. I think the broad genetic framework is pretty clear.

1) The Indus Valley Civilization (IVC) people were a mix of eastern West Asian (from modern Iran) people and native South Asian peoples (~80% of South Asian mtDNA are haplogroup M).

2) ~1500 BC a major incursion from the steppe occurred and was overlaid upon #1 to various extents as a function of region, language, and caste.

3) ~0 to 500 AD the strong endogamy that characterizes modern South Asians seems to have established itself.