YFull has amassed an extensive database of modern and ancient Y-DNA samples and computed time estimates for the formation of various branches. This post outlines some important observations from the page for R1a-Z93, which arose in the steppe and became the dominant Y haplogroup in the Fatyanovo culture, its descendant Abashevo, and then the Sintashta culture, from which steppe ancestry in India and Iran ultimately originates.
Indic Lineages
Examining lineages with samples found almost exclusively in South Asia is instructive, especially when we look at formation dates of the lineages.
Dates
Consider the famous R1a-L657. Its estimated formation date and TMRCA are both ~2100 BCE.
That could of course be a fluke defined by some sort of later founding effects. But we can consider the other major subclade of R1a-Z93, namely R1a-Z2124. This is sometimes assumed to be an Iranic clade, but it contains Indic sub-branches as well, such as R-YP523 with a formation time of ~2100 BCE and TMRCA of ~1700 BCE, and R-Y46 and R-Y43743 both with formation time and TMRCA of ~1800 BCE. A little messier but also likely originally Indic is R-Y37 with a formation time of ~2500 BCE and a TMRCA time of ~2000 BCE.
Based on these, we can estimate that the main steppe migrants into India branched off from the rest of the steppe ~2100-1800 BCE. A plausible scenario is that, at the same time that many Sintashta clans were spreading out and establishing the cultures of the Andronovo horizon as well as related cultures like Tazabagyab, one or more clans chose to travel past them into the Hindu Kush and the Indian subcontinent, contributing the bulk of the steppe ancestry seen in modern-day Indians. This is backed up by the fact that the Indian lineages seem to split off close to the time of expansion of R1a-Z93 into star-like phylogeny, which is a tell-tale sign of rapid population expansion and migration of the kind observed archaeologically in the Andronovo horizon.
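As a rough check on the ~2100-1800 BCE window, the dates quoted above can be tabulated in a few lines of Python. This is only a sketch: the figures are the rounded YFull estimates cited in this post, the names LINEAGES and formation_window are made up for illustration, and leaving out R-Y37 simply reflects the "messier" caveat above rather than any formal criterion.

```python
# Rounded YFull estimates quoted in this post, in years BCE.
LINEAGES = {
    "R1a-L657": {"formed": 2100, "tmrca": 2100},
    "R-YP523": {"formed": 2100, "tmrca": 1700},
    "R-Y46": {"formed": 1800, "tmrca": 1800},
    "R-Y43743": {"formed": 1800, "tmrca": 1800},
    "R-Y37": {"formed": 2500, "tmrca": 2000},
}


def formation_window(lineages, exclude=()):
    """Return (oldest, youngest) formation dates in years BCE.

    Larger BCE numbers are older, so the oldest date is the maximum.
    """
    dates = [v["formed"] for name, v in lineages.items() if name not in exclude]
    return max(dates), min(dates)


# Hypothetical choice: drop the lineage the post describes as "messier".
oldest, youngest = formation_window(LINEAGES, exclude={"R-Y37"})
print(f"Formation window excluding R-Y37: ~{oldest}-{youngest} BCE")
```

Including R-Y37 would push the older bound of the window back to ~2500 BCE, which is why the treatment of that lineage matters for the estimate.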
Indic Samples in Arabia
It is interesting to look at the sublineages of these that have ended up in the Arabian peninsula, where modern Arabs have apparently been enthusiastic adopters of genetic testing services. There are some lineages from recent centuries (especially related to Pakistani lineages), but also many that go back significantly further.
These include one with formation time ~1800 BCE and TMRCA ~450 CE, another with formation time ~900 BCE and TMRCA ~1050 CE, one with formation time ~1100 BCE and TMRCA ~1450 CE, and one with formation time ~200 BCE and TMRCA ~1450 CE.
Likely all of these (or at the very least the first two) entered the Arabian peninsula via medieval, early historic, and protohistoric Indian Ocean traders. These samples are found across Arabia, including in Qatar, Bahrain, Oman, and western Saudi Arabia. In fact, we can see that there is some AASI admixture (using Irula as a proxy) in some modern populations in the Arabian peninsula.
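To make the gaps in the four Arabian sublineages above easier to compare, here is a small Python sketch that computes the interval between each formation date and its TMRCA. The dates are the rounded figures quoted above; the names ARABIAN_SUBLINEAGES and unbranched_span are invented for illustration. The interval is only the stretch during which the branch has no surviving sampled splits. It does not by itself date the arrival in Arabia.

```python
# (formation, TMRCA) pairs quoted above, as signed years:
# negative = BCE, positive = CE. All values are rounded estimates.
ARABIAN_SUBLINEAGES = [
    (-1800, 450),
    (-900, 1050),
    (-1100, 1450),
    (-200, 1450),
]


def unbranched_span(formed, tmrca):
    """Years between a branch forming and its sampled members' common ancestor.

    The missing year zero is ignored, since the inputs are rounded to
    the nearest 50-100 years anyway.
    """
    return tmrca - formed


spans = [unbranched_span(formed, tmrca) for formed, tmrca in ARABIAN_SUBLINEAGES]
for (formed, tmrca), span in zip(ARABIAN_SUBLINEAGES, spans):
    print(f"formed ~{-formed} BCE, TMRCA ~{tmrca} CE: ~{span} years")
```

Each span is well over a millennium, so the YFull dates alone leave a wide window for when these lineages could have crossed to Arabia.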
Iranic Lineages
Russians
Unfortunately, very few samples from Iran are available, and they do not allow us to make meaningful conclusions. Fortunately, we have many samples available in Russia that are surviving descendants of Sintashta / Andronovo lineages, both in Z2124 and R1a-Y3 (from which L657 descends on the Indian side).
Both R-Y75187 and R-S23592 exhibit a mix of old (dating as far back as Andronovo) and modern samples in the Central Steppe in Turkic-speaking places like Bashkortostan, Tatarstan, Kazakhstan, and Kyrgyzstan. Another that shows modern samples in Russia (as well as an Azeri Iranian sample) is R-Y38987.
It should be noted that these R1a lineages cannot be the result of any outflux from India: both their ancient and modern samples remain in the steppe, and no samples in these lineages are found in India. Additionally, the lineages were not spread by Buddhism, as the various Kipchak Turks such as Bashkirs and Tatars that settled this region of Russia practice Tengrism (or previously practiced Tengrism, for those that have since adopted Islam) and never practiced Buddhism, unlike Turks and Mongols further to their east who adopted Buddhism.
Iranic Lineages in Arabs
What about Iranian influence in and near the Arabian peninsula? Well, R-BY149647 has both an Andronovo and modern Russian steppe sample as well as descendants in Arab countries. R-F1417 shows a lineage in Russia, another lineage in Arabs, as well as a sample in an Iranic tribe in ancient Kyrgyzstan.
Notably though, nearly all these samples are located in Kuwait or Ash Sharqiyah Province of Saudi Arabia. These likely originate with Iranian settlement and rule of modern day Kuwait under the Persian Empires. As such, though we don’t have many direct samples from Iran, neighboring Kuwait provides a useful source of data for Iranian lineages. Iran was not historically known as a major participant in the Indian Ocean trade, and this distribution of Iranic lineages appears to reflect that.
Conclusion
In summary, we’ve found that Indian R1a lineages from the steppe date to the early 2nd millennium BCE, consistent with a migration into India simultaneous with the Andronovo horizon. We’ve also found that we can observe surviving Iranic lineages in the Central Steppe in Russia, and that Iranian lineages reached Kuwait and Ash Sharqiyah by land, while Indian lineages appear across the entire Arabian peninsula via ancient and medieval sea trade.
//the Indian lineages seem to split off close to the time of expansion of R1a-Z93 into star like phylogeny,// true. the question is where could this population explosion have happened? given //Consider the famous R1a-L657. The estimated formation date and TMRCA for it is ~2100 BCE.// and not found in the aDNA of Andronovo region at that time, this could have only happened within Indian subcontinent, or not?
Either it was formed in India, or more likely it was formed in the steppe but all the sons / grandsons of the man who got the mutation ended up in the clan traveling to India, so none of the lineages were left behind on the steppe. Basically founding effects.
It does not make sense for a successful founder-effect lineage to leave no trace where it was actually founded and flourished. There must be a reason and favourable circumstances for that success story. The Mature Harappan civilisation is most likely the natural place where this population explosion happened.
I mean L657 arose in a less common (Y3) lineage in the steppe, and clearly simply belonged to a clan which migrated to India (and maybe to Xinjiang – it’s hard to say for sure whether L657 lineages got there directly or got there later via India). It’s not that surprising not to have any left behind in the steppe. We still see tons of sister and cousin lineages surviving in the steppe.
As to population explosion, the primary sublineage, R1a-Y9, clearly exhibits star-like phylogeny around ~1800 BCE. R-Y9 and its descendant lineages and sublineages R-Y7, R-Y30, R-Z29113, and R-Y29 all have estimated TMRCA of ~1800 BCE, which is suggestive of an expansion *after* the mature Harappan. https://www.yfull.com/tree/R-Y9/
This is to be expected – the Mature Harappan was too populous to experience such a rapid and dramatic founding effect from any lineages that may have slipped in from the steppe.
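The star-like pattern described in this reply can be sketched in Python as a simple check that nested lineages share nearly the same TMRCA. The TMRCA values are the rounded YFull figures quoted in the reply. The 100-year tolerance, the function names, and the ~1900 BCE end date used for the Mature Harappan phase are all assumptions added for illustration and are not from the thread.

```python
# Rounded YFull TMRCA estimates (years BCE) quoted in the reply above
# for R-Y9 and its nested sublineages.
TMRCA_BCE = {
    "R-Y9": 1800,
    "R-Y7": 1800,
    "R-Y30": 1800,
    "R-Z29113": 1800,
    "R-Y29": 1800,
}

# Assumption: a conventional end date for the Mature Harappan phase.
MATURE_HARAPPAN_END_BCE = 1900


def tmrca_spread(tmrcas):
    """Difference in years between the oldest and youngest TMRCA."""
    values = list(tmrcas.values())
    return max(values) - min(values)


def looks_star_like(tmrcas, tolerance_years=100):
    """Crude indicator: nested lineages whose TMRCAs nearly coincide.

    The tolerance is a hypothetical threshold, not a YFull convention.
    """
    return tmrca_spread(tmrcas) <= tolerance_years


def postdates(tmrcas, cutoff_bce):
    """True if every TMRCA is later than the cutoff (smaller BCE number)."""
    return all(value < cutoff_bce for value in tmrcas.values())


print("star-like:", looks_star_like(TMRCA_BCE))
print("after assumed Mature Harappan end:", postdates(TMRCA_BCE, MATURE_HARAPPAN_END_BCE))
```

Because YFull dates carry wide confidence intervals, a check like this is suggestive at best. It only restates the point that successive nested branches with indistinguishable TMRCAs imply rapid branching.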
//the primary sublineage, R1a-Y9, clearly exhibits star-like phylogeny around ~1800 BCE// true, but this is not the whole story.
Their ancestor R-Y27 and its immediate descendant lineages (R-L657, R-M605, R-M28, R-Y4 and R-Y9) were formed around ~2100 BCE and exhibit star-like phylogeny.
The population explosion of this R1a subclade should have happened across a few centuries, for obvious reasons.
Also “Ding et al 2021 suggest that the mutation rate for R1a subclades is lower than for other haplogroups, therefore suggesting older divergence dates.” So most likely this explosion should have happened further back in time, right in the Mature Harappan era and well inside the Indian subcontinent, given the lack of R-Y27 aDNA outside the Indian subcontinent during this time (assuming the unpublished Abashevo R-Y3+ samples turn out to be true).
This R1a migration did not bring steppe ancestry because 1) DATES results show later dates of steppe admixture into India, not in this timeframe 2) migration was not massive in scale because few families or even single male could have been enough.
Remember in order for Aryan migration to be true, both R1a & steppe autosomal ancestry should have come to India together and in massive scale.
But as seen earlier, R1a most likely came to India in a trickle. Steppe ancestry got introduced very late. Both have not much to do with the IE language introduction, according to the linguistic conclusions of Heggarty et al. 2023. And we have almost nil archaeological evidence of massive migration as well.
Again, the TMRCA time of the star-like phylogeny in Indian R1a-Z93 lineages (including but not limited to L657, which shouldn’t be exclusively focused on) is close to or shortly after the TMRCA time for star-like phylogeny of lineages that stayed in the steppe and started showing up in Andronovo Culture. The Andronovo horizon provides a reference against which we can calibrate Indian lineages.
And Y3+ samples showing up in Abashevo is unsurprising given that the inferred TMRCA for Y3 is 2500 BCE – clearly predating Abashevo culture.
Thanks for the nice article. There is really not enough attention put into Indic R1a and especially R1a-L657. I am working on getting more R1a-L657 samples from my ethnic group. It is probably the biggest R1a clade today in terms of absolute numbers that formed around 2000 BCE, but we don't have much private or public research. Instead we see a lot of crazy stuff about it being from Iran Neolithic, BMAC or whatever.
I am also waiting for the Harvard Abashevo samples to finally see more ancient R1a-Y3+ (one of the researchers mentioned the presence of Y3+)
More R1a-L657 samples would be great!
Yes, I also hope the Abashevo samples get published soon. Reportedly there are a couple that are Y3+.
For now from the ancient steppe we only have I6561 which is Z2479+, Z95+, and Y26+, but Y2- so it would appear to be a basal R1a-Y3 sample.
//Again, the TMRCA time of the star-like phylogeny in Indian R1a-Z93 lineages (including but not limited to L657, which shouldn’t be exclusively focused on) is close to or shortly after the TMRCA time for star-like phylogeny of lineages that stayed in the steppe and started showing up in Andronovo Culture.//
Poznik et al. 2016 clearly provides the basis for this separation between Steppe and Indian lineages, especially Fig. 4.
Based on the TMRCA given in the YFull, I have added the labels.
https://postimg.cc/qz7Cqsfh
Given the paucity of L657/Y9 aDNA samples in the Steppe area during BA/IA and also the modern era, their expansion most likely should have happened in the Indian subcontinent.