Kashmiris are just generic North Indians, and there is no difference between Pandits and Muslim Kashmiris


Since people ask me this, I have to post this now and then. We have genetic data. So, in short order:

1) Kashmiris are like other people in the Northwest of India. They are not enriched in steppe ancestry, at least compared to many Punjabis or Brahmins from the Gangetic plain

2) There is no genetic difference I can see between Pandits and Kashmiri Muslims, indicating to me that one distinctive aspect of the Vale of Kashmir in comparison to the rest of the Indian subcontinent is that it does not exhibit the jati-varna structure common across the subcontinent

3) Some researchers and genetic genealogists have found some Tibetan admixture at low levels among Kashmiri Muslims and Pandits

4) It is probably correct that elite Muslims have low levels of Central Asian and Iranian ancestry, though that’s harder to detect than Tibetan background

Hayagriva was a Sintashta!


The paper is not out, but since the data has been uploaded they posted the abstract for the world to see, Project: PRJEB44430:

Horse domestication fundamentally transformed long-range mobility and warfare. However, modern domesticates do not descend from the earliest domestic horse lineage associated with archaeological evidence of bridling, milking and corralling at Botai, Central Asia ~3,500 BCE (Before Common Era). Other long-standing candidate regions for horse domestication, such as Iberia and Anatolia, were also recently challenged. Therefore, the genetic, geographic and temporal origins of modern domestic horses remained unknown. Here, we pinpoint the Western Eurasian steppes, especially the lower Volga-Don region, as the homeland of modern domestic horses. Furthermore, we map the population changes accompanying domestication from 273 ancient horse genomes. This reveals that modern domestic horses ultimately replaced almost all other local populations as they rapidly expanded across Eurasia from ~2,000 BCE, synchronously with equestrian material culture, including Sintashta spoke-wheeled chariots. We find that equestrianism involved strong selection for critical locomotor and behavioral adaptations at the GSDMC and ZFPM1 genes. Our results reject the commonly held association between horseback riding and the massive expansion of Yamnaya steppe pastoralists into Europe ~3,000 BCE driving the spread of Indo-European languages. This contrasts with the situation in Asia where Indo-Iranian languages, chariots and horses spread together, following the early second millennium BCE Sintashta culture.

If you ever inspect the domestic horse lineages you will note that they’re a monophyletic clade. They are recently descended from a common ancestor. Additionally, there is a massive skew in stallion lineages toward a few breeders. Ancient DNA has now solved the question of which prehistoric horse population the modern domestic breeds descend from: the horses from the eastern edge of the post-Yamnaya cultural zone.

Nepali Brahmins tend to have Tibeto-Burman ancestry


I ran a Clubhouse last night on Nepalese genetics. I said something to the effect that most Nepalese Brahmins have Tibetan admixture. A Nepalese Brahmin came up on stage to tell me this was inaccurate, and that they did not intermarry with native people.

To give the benefit of the doubt, I went back and double-checked against Toward a more uniform sampling of human genetic diversity: A survey of worldwide populations by high-density genotyping, which has a diverse set of Nepalese samples. What you see on the PCA is pretty straightforward. Except for the Madeshi, who are presumed to descend from recent migrants from India, all the Nepalese are Tibetan shifted.

The rank order is what you’d expect, with the Magar being mostly Tibetan, and the Brahmins being mostly non-Tibetan. But the Nepalese guy was totally full of shit. I’m sick of listening to people contradict genetics when it’s so clear.

Eastern Y Chromosomes in the Indian subcontinent

Looking at the Y chromosomes in the Indian subcontinent, it seems that haplogroups C (found in lots of Patels) and F are the only ones with “eastern” affinity that are deeply rooted in the subcontinent. Thoughts? H is found in a lot of Adivasi, but seems more related to West Eurasian populations.

This is on my mind because the Uralic populations show the strong male-based spread of eastern Y chromosomes. Finns are 60% eastern on the Y and less than 1% on the mtDNA.

How much steppe is there in Pakistan?

In the annoying dick-swinging competition that is the comment board, someone asserted Pakistanis have a lot of steppe even on the maternal side. Really?

We have Sintashta mtDNA and the discordance was shocking to me. But there are some groups in Pakistan with detectable Sintashta mtDNA. The samples are from Hazara, Kho, Pashtun, Kashmiri, and Kalash. The authors identify 8.4% steppe mtDNA. Pakistan as a whole has a lot more “West Eurasian” mtDNA, but that’s obviously due to the legacy of the IVC. Anyway, Complete mitogenomes document substantial genetic contribution from the Eurasian Steppe into northern Pakistani Indo-Iranian speakers:

In summary, based on available archeological and high-resolution mitogenome data from northwestern Pakistan, especially from Iranian and Dardic populations, who are suggested to be the surviving traces of early Indo-Iranian groups, we identified the genetic contributions of different dispersals from west Eurasia into northern Pakistan during the Bronze Age onward. Importantly, we identified five haplogroups as the genetic legacy of IE speakers from the Eurasian Steppe, likely dispersed along with the migration of IE-speaking populations during the Bronze Age into northern Pakistan, thus implying that IE language expansion into South Asia was not simply mediated by cultural diffusion. This migration contributed 8.4% of the gene pool of northern Pakistani IE speakers, suggesting this demographic connection, which is a possible source of IE language diffusion, could be one part of the complex demographic history of the region. Our results also provide implications on the two main hypotheses of IE language origination, viz. Anatolia and Steppe hypotheses. Considering that Steppe components were observed in all Indo-Iranian groups in northern Pakistan in our study, as well as in other regions in South Asia [10], while lineages possibly representing the genetic legacy of Neolithic farmers, e.g., R2e, K1, were either absent or not found in all of the IE-speaking groups in northern Pakistan, our results lend more support to the Steppe hypothesis, at least from a matrilineal perspective. Furthermore, these IE speakers, as evidenced by the genetic legacy identified here, also moved southward and contributed genetically, though to a rather limited extent, to the Indian subcontinent, suggesting northern Pakistan as a corridor in the spread of IE languages during the Bronze Age dispersals into South Asia. 
Since our study is only based on mtDNA data, which only reflect maternal histories of populations, more investigations based on genome-wide data are also needed to intensively dissect the expansion of IE speakers into South Asia.

Steppe lineages in northern Pakistan

This is not the most important paper, but it is a contribution: Complete mitogenomes document substantial genetic contribution from the Eurasian Steppe into northern Pakistani Indo-Iranian speakers. Abstract:

To elucidate whether Bronze Age population dispersals from the Eurasian Steppe to South Asia contributed to the gene pool of Indo-Iranian-speaking groups, we analyzed 19,568 mitochondrial DNA (mtDNA) sequences from northern Pakistani and surrounding populations, including 213 newly generated mitochondrial genomes (mitogenomes) from Iranian and Dardic groups, both speakers from the ancient Indo-Iranian branch in northern Pakistan. Our results showed that 23% of mtDNA lineages with west Eurasian origin arose in situ in northern Pakistan since ~5000 years ago (kya), a time depth very close to the documented Indo-European dispersals into South Asia during the Bronze Age. Together with ancient mitogenomes from western Eurasia since the Neolithic, we identified five haplogroups (~8.4% of maternal gene pool) with roots in the Steppe region and subbranches arising (age ~5–2 kya old) in northern Pakistan as genetic legacies of Indo-Iranian speakers. Some of these haplogroups, such as W3a1b that have been found in the ancient samples from the late Bronze Age to the Iron Age period individuals of Swat Valley northern Pakistan, even have sub-lineages (age ~4 kya old) in the southern subcontinent, consistent with the southward spread of Indo-Iranian languages. By showing that substantial genetic components of Indo-Iranian speakers in northern Pakistan can be traced to Bronze Age in the Steppe region, our study suggests a demographic link with the spread of Indo-Iranian languages, and further highlights the corridor role of northern Pakistan in the southward dispersal of Indo-Iranian-speaking groups.

Don’t focus on the percentages too much. Rather, focus on the coalescence estimate. Basically, that indicates diversification and demographic expansion. The presence in the southern subcontinent is indicative of the fact that “steppe” ancestry and cultural influence extend far beyond the distribution of modern Indo-Aryan languages. We already know this from R1a, which is found in Adivasis. And low fractions of steppe are found in most South Indian groups (but not all).

The Genetics of India Clubhouse Event – Friday 9 PM CDT

I am hosting a Clubhouse room this Friday, 9 PM CDT (7:30 AM in India on Saturday). The topic will be the genetics of India, and I’ll be talking about my two posts on Substack:

The Stark Truth About Aryans

The Stark Truth About Humans

It’s basically going to be an interactive discussion. My friend David Mittelman will help me moderate (probably others too).

You have to have a Clubhouse account (iPhone only). If you want to follow me on Clubhouse, I’m @razibkhan just like on Twitter.

Why Scythians, Sakas, and Kushanas are NOT the source of “steppe” ancestry

This is a common question/assertion in the comments pretty much every other week: why couldn’t the documented incursions of nomadic people in the first millennium A.D. be responsible for the steppe ancestry? There is actually a good explanation in The formation of human populations in South and Central Asia, so I’ll quote it:

By the Late Bronze Age, ESHG-related admixture became ubiquitous, as documented by our time transect from Kazakhstan and ancient DNA data from the Iron Age and from later periods in Turan and the Central Steppe, including Scythians, Sarmatians, Kushans, and Huns (29, 52). Thus, these first millennium BCE to first millennium CE archaeological cultures with documented cultural and political impacts on South Asia cannot be important sources for the Steppe pastoralist–related ancestry widespread in South Asia today (because present-day South Asians have too little East Asian–related ancestry to be consistent with deriving from these groups), providing an example of how genetic data can rule out scenarios that are plausible on the basis of the archaeological and historical evidence alone (13) (fig. S52). Instead, our analysis shows that the only plausible source for the Steppe ancestry is Steppe Middle to Late Bronze Age groups, who not only fit as a source for South Asia but who we also document as having spread into Turan and mixed with BMAC-related individuals at sites in Kazakhstan in this period. Taken together, these results identify a narrow time window (first half of the second millennium BCE) when the Steppe ancestry that is widespread today in South Asia must have arrived.

There is now a large database of Scythian, etc., ancient DNA, thanks to the preservation conditions on the Eurasian steppe. Most of their ancestry derives from the same broad group as the Andronovo horizon of which the Sintashta were part. But, unlike the earlier steppe populations, these groups are highly variable in ancestry, as well as usually having substantial minority East Asian components. The Indian groups with a lot of steppe ancestry, such as Jatts and Northern Brahmins, lack this.

There are two objections. The weaker one is that they didn’t have statistical power to detect the admixture. I haven’t run simulations, but I’m sure they have. If you have Jatts who are perhaps more than 30% steppe they would have detected trace East Asian (as you can find in many Muslim individuals from Pakistan).
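A back-of-the-envelope version of that argument, as a rough sketch: if all of a group's steppe ancestry came from a source that itself carried some East Asian ancestry, the group should carry roughly (steppe fraction) × (East Asian fraction in the source). The 30% figure is the one mentioned above. The source fractions below are hypothetical placeholders for illustration, not estimates from the paper, and `expected_east_asian_trace` is a made-up helper name.

```python
def expected_east_asian_trace(steppe_fraction: float, source_east_asian: float) -> float:
    """East Asian ancestry a group would inherit if ALL of its steppe
    ancestry came from a source with the given East Asian fraction."""
    return steppe_fraction * source_east_asian


# 30% steppe, as in the Jatt example above.
steppe_fraction = 0.30

# Hypothetical East Asian fractions in a Scythian-like source (assumptions).
for source_east_asian in (0.10, 0.20, 0.30):
    trace = expected_east_asian_trace(steppe_fraction, source_east_asian)
    print(f"source {source_east_asian:.0%} East Asian -> expected trace {trace:.1%}")
```

Under these assumed numbers the expected trace is a few percent of total ancestry, which is the kind of signal the quoted passage says present-day South Asians lack.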

The stronger objection is that there is unsampled structure on the steppe, and groups without East Asian admixture that are direct descendants of the Sintashta without dilution. This is not entirely unreasonable or implausible, though at this point I’d say this is unlikely for two reasons:

  1. Central Eurasia is pretty well sampled due to interest and conditions
  2. The steppe ancestry in South Asia is pretty widespread. Hard to imagine it percolating so far in 1,500 years

Also, the statistical tests I’ve done show Bengalis got East Asian admixture 1,500 years ago, accounting for 10-20% of their ancestry. The steppe percentage in Bengalis is comparable, at 10-15%. But I never get any hits for the steppe component using older, less sensitive admixture-dating methods. That means the steppe mix has to be way older than 500 A.D.
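Why would an older mix be harder to detect? Here is a minimal sketch, assuming the standard single-pulse model used by LD-based dating tools, in which admixture linkage disequilibrium decays as exp(-n·d) for n generations since admixture and genetic distance d in Morgans. The 29-year generation time and the 3,500-year date for the steppe pulse are illustrative assumptions, and the function name is hypothetical. This is not a reconstruction of the tests I ran.

```python
import math

# Assumed generation time in years; values around 25-30 are commonly used.
GENERATION_YEARS = 29.0


def relative_ld_signal(years_ago: float, distance_cm: float) -> float:
    """Fraction of the original admixture LD remaining at a given genetic
    distance, under a single-pulse model: exp(-n * d)."""
    n_generations = years_ago / GENERATION_YEARS
    distance_morgans = distance_cm / 100.0  # centimorgans -> Morgans
    return math.exp(-n_generations * distance_morgans)


# Illustrative dates: ~1,500 years ago (East Asian pulse in Bengalis) and
# ~3,500 years ago (an assumed Bronze Age steppe pulse).
for label, years in (("1,500 years ago", 1500.0), ("3,500 years ago", 3500.0)):
    for cm in (0.5, 1.0, 2.0):
        signal = relative_ld_signal(years, cm)
        print(f"{label}, {cm} cM: {signal:.3f} of the initial LD remains")
```

At any fixed distance the older pulse leaves a weaker signal, so a method with limited sensitivity can pick up the more recent mix while missing the older one.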

Stark Truth About Aryans: a story of India (part 1)

My Substack piece is up, Stark Truth About Aryans: a story of India. I’m pretty proud of this, as it wasn’t a single-sitting blog post, but something I worked over several times. Since it’s for paid subscribers I’ll post the first few paragraphs below, with an infographic that I think illustrates a lot of what’s going on.


The massive Indian migration to Southeast Asia


Over at my other weblog I put up a post, Indian Ancestry In Southeast Asia Is Older Than Statistical Genetic Tests Suggest. If you look at two populations in Southeast Asia and find one has Indian ancestry, you often can’t date the admixture to earlier than 1000 A.D. (in peninsular Malaysia there is more recent intermarriage between Muslim Indians and Malays too). This seems far too recent. My explanation is simple: these dates reflect the assimilation of a hybrid Indian-Southeast Asian population across much of Southeast Asia. I have done the analyses myself, and in Cambodia, I get dates around 1000 A.D. Cambodia is not close to India and there isn’t evidence of a large diaspora in recorded history. But we know that Hinduism was a major influence in the region, and the Vietnamese Cham are still predominantly Hindu.

The kingdom of Funan, known mostly from Chinese accounts, flourished in Cambodia for the first five centuries of the common era or so. There is an inscription in Sanskrit from the region dated to the 5th century A.D. that refers to the moon of the Kauṇḍinya line (… kauṇḍi[n]ya[vaṅ]śaśaśinā …) and chief “of a realm wrested from the mud”. The text is in the Grantha script.

Further west, Dvaravati also had a strong Indic influence, no later than the 5th century A.D.

The genetic results indicate on the order of 10-20% of the ancestry of people in central Thailand is broadly Indian. This is not a trivial fraction. Who were these people? How early did they come?

On a minor editorial note, I’ll observe there is lots of discussion about possible Indian gene flow to the north and west (into Iran and Turan), but the data on Southeast Asia is clear and of greater magnitude. But there is far less discussion and exploration of this.