This is a post I began writing a few months ago but abandoned midway. It responds to an argument Razib made in one of his posts. According to Razib, an Aryan Migration model, which posits an entry of Indo-Aryans into South Asia, might lack textual and archaeological support; but in the wider context of explaining the origin and migration of all Indo-Europeans from a PIE homeland to their present or last known places of habitation (e.g. Hittites & Tocharians), the steppe theory makes a far more compelling case for the PIE homeland than an OIT (Out of India Theory).
Admittedly, there has been no major attempt in academia, Western or Indian, to take stock of all the available evidence, linguistic and archaeological, and use it to argue for PIE origins in South Asia and the subsequent dispersal of the daughter languages to their known destinations.
It is beyond the remit of my present subject to ponder why this has been so, but we may note that an elegant and solid linguistic case (1,2) for a spread of IE languages from a locus in the region of Bactria was already made more than two decades ago by Johanna Nichols. However, the linguistic community has chosen to sideline her work without a proper rebuttal.
As none other than James Mallory himself states,
All too often surveys of the Indo-Europeans eventually conclude with something on the order of ‘scholars have concluded that the most likely area of the homeland is…X’ with a brief defence of one particular solution (this type of scholarship has been going on since the late nineteenth century). In fact, we not only lack total consensus but where we seem to find something of a major school it is often formed by deference rather than conviction, i.e. linguists or archaeologists indicate agreement with a particular theory that they have not themselves investigated in any depth. This situation means that a small number of advocates—at times, very vigorous advocates—provide an assortment of homeland theories for the rest of their colleagues to comply with passively. The homeland is an interesting question but it is so difficult to resolve (we have over two centuries of dispute to prove that) and requires the application of so many less than robust means of argument that most archaeologists and historical linguists do not find it a worthwhile enterprise, at least for themselves. The last word is, therefore, far from written…(source, pg 460)
More recently, a young Russian linguist by the name of Igor A Tonoyan-Belyayev is attempting to make a fresh case for OIT through linguistics.
As far as archaeology is concerned, there is evidence of significant influence from Chalcolithic Central Asia on the Maykop cultural phenomenon of the 4th millennium BC North Caucasus. It may also be noted that the Catacomb culture, which succeeded the Yamnaya on the steppe in Ukraine & Southern Russia, was characterised by a catacomb burial system that originated in the south, probably in the region between Central Asia and Eastern Iran. There it was the preferred mode of burial in the Helmand & Jiroft civilizations as well as in the BMAC, with roots in the Chalcolithic period that preceded them in the region. We also have evidence of contacts between the BMAC and the Sintashta-Arkaim cultural phenomenon that arose on the steppe at the end of the 3rd millennium BC.
(A schematic of the catacomb burials at Gonur in BMAC – courtesy The Necropolis of Gonur)
Similarly, there is extensive evidence that not only the Harappans but also the Central Asian BMAC people and the Eastern Iranian Halil Rud civilization (known as Marhasi to the Mesopotamians) maintained economic and socio-cultural ties with the Near Eastern civilizations, and the thread of this interaction reached all the way up to the Aegean. I intend to cover these topics in detail in future posts, God willing.
Thus, there is already substantial material which can be used to build an Out of India PIE hypothesis and explain the spread of IE from South Asia to the Near East and to the steppe.
However, I would like to limit the scope of my present article to a specific set of genetic data that makes a persuasive case, from the standpoint of ancient DNA, for a southern origin of Proto-Indo-Europeans.
We may begin by noting what David Reich himself states in his book regarding where he believes the PIE likely originated.
While the genetic findings point to a central role for the Yamnaya in spreading Indo-European languages…those findings do not yet resolve the question of the homeland of the original Indo-European languages…Anatolian languages…did not share the full wagon and wheel vocabulary present in all Indo-European languages spoken today. Ancient DNA available from this time in Anatolia shows no evidence of steppe ancestry similar to that in the Yamnaya…This suggests to me that the most likely location of the population that first spoke an Indo-European language was south of the Caucasus Mountains, perhaps in present-day Iran or Armenia, because ancient DNA from people who lived there matches what we would expect for a source population both for the Yamnaya and for ancient Anatolians. (pg 120)
Reich refers to ancient people who lived in Iran or Armenia as likely sources for both the Yamnaya and the ancient Anatolians. What was the genetic profile of these ancient people that made them suitable ancestral sources for both groups?
As per the 2016 paper of Reich and colleagues on the first farmers of the Near East,
To the north, a population related to people of the Iran Chalcolithic contributed ~43% of the ancestry of early Bronze Age populations of the steppe. The spread of Near Eastern ancestry into the Eurasian steppe was previously inferred without access to ancient samples, by hypothesizing a population related to present-day Armenians as a source.
It may be noted that, as per the recent Narasimhan et al paper, all ancient Iranian samples later than 6000 BC have substantial levels of Anatolian Farmer ancestry. So, as per Reich, a mixed population of largely Iranian & Anatolian Farmer ancestry is the likely source of the spread of Indo-European languages onto the steppe. Further, in the very same 2016 paper, the ancient Anatolian Chalcolithic samples could also be modelled as having nearly half of their ancestry from Iran Chalcolithic. Thus, as per Reich, the Iranian Chalcolithic, which can be shown to be a suitable admixture source both for Chalcolithic & Bronze Age Anatolians (among whom the IE Anatolian languages were spoken in the LBA) and for the Yamnaya on the steppe, is the most likely original source of PIE ancestry.
In a more recent paper, Reich and team revealed,
although Bronze Age Anatolian individuals have CHG-related ancestry, they do not have the EHG-related ancestry characteristic of all steppe populations sampled to date or the WHG-related ancestry that is ubiquitous in Neolithic southeastern Europe… An alternative hypothesis is that the homeland of Proto-Indo-European languages was in the Caucasus or in Iran. In this scenario, westward population movement contributed to the dispersal of Anatolian languages, and northward movement and mixture with EHG was responsible for the formation of a ‘Late Proto-Indo European’-speaking population associated with the Yamnaya complex…this scenario gains plausibility from our results..
Another paper, by Eske Willerslev and his team, came to a similar conclusion, estimating that all Anatolian samples from the Chalcolithic to the Middle Bronze Age showed as much as 40 % CHG admixture but no EHG admixture, thus ruling out steppe admixture in Bronze Age Anatolians.
Therefore the majority opinion among geneticists at the moment seems to be that the PIE homeland was likely in either Armenia or Iran based on the evidence that CHG/Iran Chalcolithic populations serve as ideal source populations for both the Yamnaya pastoralists of the steppe as well as the Bronze Age Anatolians.
A New Spanner in the Works
Though the opinion of the geneticists has moved strongly in favour of the Iran/Armenia homeland hypothesis for PIE, the latest aDNA data suggests that even this hypothesis may soon need to be discarded, or at least substantially reworked.
This year, along with Narasimhan et al, another paper, by Wang et al, published for the first time a large number of ancient samples from the Chalcolithic Caucasus, including the much-awaited samples from populations of the Maykop culture, which is traditionally considered to have strongly influenced the formation of the Yamnaya culture on the steppe.
A few pertinent observations from this study:
Eneolithic steppe populations also existed in the North Caucasus piedmont steppe at sites such as Progress & Vonyuchka, but had a slightly different ancestry profile from that of the Eneolithic steppe populations of Samara & Khvalynsk. The Eneolithic steppe populations could be modelled as an admixture of nearly equal amounts of EHG and CHG components.
Similarly, we find Steppe Maykop samples from the steppe preceding the Yamnaya by a few centuries, which also show CHG ancestry but no Anatolian Farmer ancestry. The Steppe Maykop, in addition, also show an excess of East Eurasian and ANE ancestry.
The Yamnaya samples that follow can be modelled with qpAdm as deriving 80 to 90 % of their ancestry from the Eneolithic Steppe populations and the rest from the European Eneolithic Farmers, with perhaps some ancestry coming from the Caucasus (Supplementary Tables 13-19).
Below is a qpAdm modelling by Davidski of the Eurogenes blog from last year, where it can be clearly seen that the Yamnaya samples from the Caucasus can derive as much as 80 % of their ancestry from these Progress and Vonyuchka Eneolithic samples.
(qpAdm output: tail prob 0.383882)
Similarly, the Yamnaya samples from other locations can also be shown to derive as much as 80 to 90 % of their ancestry from a combination of Progress/Vonyuchka and Khvalynsk Eneolithic, with the Progress/Vonyuchka Eneolithic samples alone contributing about 50 to 60 % of the overall ancestry of these other Yamnaya samples. Let us remember that the Yamnaya steppe population has long been argued, by the likes of David Anthony and those before him, to be the likely ancestral group that spread Indo-European languages. It is therefore very pertinent that a majority of Yamnaya ancestry could be derived from these Eneolithic steppe samples from the North Caucasus.
What also emerges very clearly from Wang et al is that the CHG/Iran N type ancestry arrived on the steppe without any accompanying Anatolian Farmer ancestry, as can be observed in the Eneolithic Steppe and Steppe Maykop samples, which show CHG/Iran N admixture but no Anatolian Farmer admixture.
An important observation is that Eneolithic Samara and Eneolithic steppe individuals directly north of the Caucasus had initially not received AF gene flow. Instead, the Eneolithic steppe ancestry profile shows an even mixture of EHG- and CHG ancestry, suggesting an effective cultural and genetic border between the contemporaneous Eneolithic populations, notably Steppe and Caucasus…whether this is the result of Iranian/CHG-related ancestry reaching the steppe zone independently and prior to a stream of AF ancestry, where they mixed with local hunter gatherers that carried only EHG ancestry.
As we noted earlier, populations from Iran proper after 6000 BC, and even North Caucasus populations around 4500 BC (as Wang et al themselves show), had substantial Anatolian Farmer admixture. This effectively rules out the admixture having come from the south via the Caucasus route.
Wang et al suggest that even during the transition from Eneolithic to the Yamnaya phase, the Anatolian Farmer ancestry might not have come to the steppe via the Caucasus but largely if not solely through the European Farmer populations adjacent to the steppe on its west.
All later steppe groups, starting with Yamnaya, deviate from the EHG CHG admixture cline towards European populations in the West. We show that these individuals had received AF ancestry, in line with published evidence from Yamnaya individuals from Ukraine (Ozera) and Bulgaria. In the North Caucasus, this genetic contribution could have occurred through immediate contact with Caucasus groups or further south. An alternative source, explaining the increase in WHG-related ancestry, would be contact with contemporaneous Chalcolithic/EBA farming groups at the western periphery of the Yamnaya distribution area, such as Globular Amphora and Cucuteni–Trypillia from Ukraine, which have been shown to carry AF ancestry.
While an influence on the Yamnaya from the Maykop to the south cannot be ruled out, it appears that the AF (Anatolian Farmer) ancestry may largely have come into the Yamnaya populations from European Farmer groups to their west.
This, therefore, makes a major dent in the hypothesis of Reich and other geneticists that PIE originated south of the Caucasus and spread to the steppe via the Caucasus route.
The Origin of the unadmixed Iran_N/CHG ancestry
Clearly, if we go by the logic of David Reich, then it is this unadmixed Iran N/CHG ancestry, and not the earlier argued Iran Chalcolithic group, which spread to the steppe and also to Anatolia, that is the likely vector of IE languages both on the steppe and in Anatolia. Therefore, a question arises: where did the Iran N/CHG type ancestry on the steppe come from, if not from south of the Caucasus?
The only option appears to be from further east. The extra East Asian and ANE ancestry in the Steppe Maykop individuals also points towards an eastern source. But then where did the Iran N/CHG on the Eastern Steppe come from?
The Supplementary Section of Narasimhan et al makes some observations about a transition that affects the entire Steppe zone from the East European Steppe to the Mongolian Steppe.
Narasimhan et al divide the ancient samples accessed from across the wider steppe region both temporally and geographically. Temporally, they group together samples from across the steppe dating to between 3300-2500 BCE and designate them as Steppe_EMBA. They further divide the Steppe_EMBA samples geographically into Western_Steppe_EMBA (e.g. Pontic Caspian Steppe), Central_Steppe_EMBA (spread across Kazakhstan) and Eastern_Steppe_EMBA (around the Altai mountains – at the northeastern edge of the IAMC and further east into Mongolia).
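The labelling scheme just described can be sketched as a small lookup. This is only an illustration of the two criteria mentioned above (date window and region); the region keys and the function name are hypothetical, and the paper's actual assignment of individuals may rest on more than these two fields.

```python
from typing import Optional

# Date window for Steppe_EMBA as quoted in the text (years BCE).
EMBA_START_BCE = 3300
EMBA_END_BCE = 2500

# Hypothetical region keys mapped to the labels named in the text.
REGION_LABELS = {
    "pontic_caspian": "Western_Steppe_EMBA",
    "kazakhstan": "Central_Steppe_EMBA",
    "altai_mongolia": "Eastern_Steppe_EMBA",
}


def steppe_label(date_bce: int, region: str) -> Optional[str]:
    """Return the Steppe_EMBA sub-label, or None if outside the window or region list."""
    if not (EMBA_END_BCE <= date_bce <= EMBA_START_BCE):
        return None
    return REGION_LABELS.get(region)


print(steppe_label(3000, "kazakhstan"))      # inside the window
print(steppe_label(4000, "pontic_caspian"))  # too early for Steppe_EMBA
```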
The authors state regarding the western and central steppe populations,
…compared to the individuals from before the EMBA (Khvalynsk_EN and Botai.SG), the later individuals from both the western and central Steppe appear to have considerably higher proportions of ancestry related to Ganj_Dareh_N…(pg 235)
Ganj_Dareh_N is an Iranian Neolithic Farmer site from the Zagros. In other words there was a distinct shift towards Iran N/CHG type ancestry both in the Western as well as the Central steppe during the EMBA period. It was not just limited to the Pontic Caspian steppe where the supposed PIE formed as per the most influential theory.
…Mereke_MBA and other Central_Steppe_EMBA populations appear to be admixed between groups descended both from groups related to West Siberian Hunter-Gatherers and from groups with Iranian farmer-related ancestry from Turan. (pg 237)
Here the authors are clearly hinting that the Iran_N/CHG admixture, at least on the Central Steppes, must have come from Turan, i.e. Southern Central Asia.
As with the individuals from the central Steppe, the presence of people with Iranian farmer-related ancestry in the Altai region and Minusinsk Basin suggests that the contact with farmer populations from the south is a feature all across the Steppe, albeit with different source populations from the Steppe and Iran/Turan. (pg 241)
What we need to understand from the above quotes is that all across the wider expanse of the steppe, from east to west, there was a diffusion of Iran N/CHG type ancestry sometime around and after 3300 BC, if not a little earlier. In the Central and Eastern Steppes this was largely unaccompanied by Anatolian Farmer (AF) ancestry. In fact, AF ancestry was not present in Eneolithic Steppe samples even on the Western Steppe, as we noted from Wang et al.
The expansion of the Iran N/CHG ancestry onto the Western Steppe has been taken to be the source of IE on the steppe. Now we see that this Iran N/CHG genetic expansion was not limited to the Western Steppe but extended across the Central & Eastern Steppe as well. It clearly shows the markings of a major cultural expansion across vast swathes of Central Eurasia.
The most likely place from which Iran N/CHG type ancestry spread across the steppe is one where this ancestral component existed without any AF admixture, a requirement that is only met in South Central Asia. In fact, the spread of Iran N/CHG admixture on the Central & Eastern Steppe could only have taken place via Turan.
Narasimhan & colleagues also put a lot of stress on the fact that there was no Western_Steppe_EMBA admixture in the Central Steppe or in the regions further south before the 2nd millennium BC.
…the lack of evidence for substantial Western_Steppe_EMBA admixture in the late Copper Age sites in Turan (which are roughly contemporaneous with the Afanasievo culture) as well as in the great majority of individuals we analyzed from BMAC sites shows that their spread had little demographic impact on agricultural settlements to the south in the Early to Middle Bronze Age… Western_Steppe_EMBA ancestry was not only scarce in the agricultural settlements of Turan but in some of the contemporary hunter-gatherer and pastoralist cultures to its north.
On the other hand, the Central_Steppe_EMBA samples are well spread out and are found even at the very edges of the Western Steppe, in samples from Kumsay and Mereke and, as we shall see, in the Steppe Maykop samples, signalling an east to west migration on the steppe already during the Early Bronze Age.
Our analysis of a single individual from northwestern Kazakhstan from the site of Mereke, Mereke_EBA_Yamnaya is also significant in suggesting that admixture between Central_Steppe_EMBA and Western_Steppe_EMBA related groups was occurring as early as the end of the 4th millennium or early 3rd millennium BCE…The wide spread of ancestry related to Central_Steppe_EMBA but not ancestry related to Western_Steppe_EMBA across this region in the Early and Middle Bronze Age is also consistent with our observation of an outlier individual from the BMAC site of Gonur which did not have any ancestry related to Western_Steppe_EMBA but instead could be modeled as being admixed with ancestry related to Central_Steppe_EMBA.
This very same process of Early Bronze Age east to west migration may also explain the presence of unadmixed Iran N or CHG ancestry on the Western Steppe, both in the Eneolithic Steppe groups and in the Steppe Maykop individuals.
We therefore see that there was a massive influx of Iran_N/CHG type ancestry from the south all across the wide expanse of the steppe, from Eastern Europe to the edge of Mongolia. There was little to no Anatolian Farmer admixture in this Iran N/CHG ancestry that spread across the steppe. The only place where Iran N/CHG ancestry existed without Anatolian Farmer admixture was SC Asia. This is clearly evident from Narasimhan et al.
Populations from the east (ordered east-to-west as: Tepe_Hissar_C, Parkhai_EN, Tepe_Anau_EN, Geoksyur_EN, Bustan_EN, Sarazm_EN) have significantly higher proportions of ancestry related to WSHG and lower proportions of ancestry related to Anatolian farmers compared with those from the west…
The populations from eastern Iran and Turan require an additional source of ancestry from a population related to West Siberian Hunter-Gatherers. Consistent with the f-statistic patterns, we also observe that Anatolian farmer-related ancestry decreases from west to east while the West Siberian Hunter-Gatherer-related ancestry increases.
Both our proximal and distal models document an admixture cline between Iranian and Anatolian farmers that was established between the Neolithic and Copper Age periods, as evidenced by our data from western Iran where we document the timing of the arrival of Anatolian farmer-related ancestry by radiocarbon dated individuals from a time transect at Seh Gabi. This cline continues eastward into Turan with low to almost no proportion of Anatolian farmer-related ancestry in Sarazm, the population that we have that is at the end of the cline from this period. Importantly, the documentation of the west-to-east ancestry gradient provides insight into the type of West Eurasian ancestry that we might expect to be found in South Asia.
The Western Siberian Hunter Gatherer (WSHG) ancestry peaks in Sarazm in Tajikistan at around 20-25 % and is also present in Indus migrants from Shahr-i-Sokhta and Gonur and other Central Asian samples while it is absent in Western Iran. On the other hand, the Anatolian farmer ancestry which is so prominent in Western Iran is almost non-existent in Sarazm as well as in the Indus migrants from Shahr-i-Sokhta and Gonur.
We also see that there is clearly an east to west population movement and admixture on the steppe north of the Caspian Sea, which likely brought the Iran_N/CHG type admixture to the Pontic Caspian steppe. This is also borne out by the fact that WSHG ancestry moves into Eastern Europe and is present in Khvalynsk_EN and in the Steppe Maykop samples.
We may now run some models on Vahaduo which clearly serve to illustrate the migration of Iran N/CHG admixture from South Central Asia onto the steppe as far as Eastern Europe.
From the above, it is clearly evident that while the Eneolithic steppe samples from Progress and Vonyuchka have CHG admixture, there is no admixture from Neolithic or Chalcolithic Iran. However, they pick up about 20 % admixture from Sarazm in SC Asia. Remember that these Eneolithic steppe samples are the biggest source of ancestry for Yamnaya, and they are picking up 20 % of their ancestry from SC Asia.
We can also see that the Steppe Maykop samples, as well as the Central Steppe samples of Kumsay and Mereke, have similar levels of admixture from Sarazm, i.e. SC Asia. That this Iran_N/CHG type admixture must have come to the Western Steppe from the east is also supported by the fact that Kumsay, Mereke and Steppe Maykop have a large proportion of WSHG ancestry (Russia_Tyumen_HG), which could only have come from the east.
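For readers unfamiliar with how such percentages arise, the general idea behind distance-based mixture models can be sketched as follows: find non-negative weights that sum to one and bring a weighted average of the source coordinates as close as possible to the target. This is a toy illustration only, not Vahaduo's actual algorithm; the coordinates are invented for the example and the function names are hypothetical.

```python
from itertools import product
from math import dist


def mix(weights, sources):
    """Weighted average of the source coordinate vectors."""
    dims = len(sources[0])
    return [sum(w * s[d] for w, s in zip(weights, sources)) for d in range(dims)]


def fit_mixture(target, sources, steps=100):
    """Grid-search weights (non-negative, summing to 1) minimising distance to target."""
    best = None
    n = len(sources)
    for combo in product(range(steps + 1), repeat=n - 1):
        last = steps - sum(combo)
        if last < 0:
            continue  # weights would exceed 100 % in total
        weights = [c / steps for c in combo] + [last / steps]
        d = dist(target, mix(weights, sources))
        if best is None or d < best[0]:
            best = (d, weights)
    return best


# Invented 3-dimensional coordinates standing in for three source populations.
sources = [
    [0.10, 0.02, -0.01],
    [0.03, -0.05, 0.04],
    [0.06, 0.08, 0.02],
]
# A synthetic target built as an exact 50/30/20 mixture of the sources.
target = mix([0.5, 0.3, 0.2], sources)

distance, weights = fit_mixture(target, sources)
print(distance, weights)
```

Real tools work with many more dimensions and sources, and a good fit on coordinates does not by itself establish the direction or timing of gene flow.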
Let me also show a qpAdm modelling of the above by one of my friends, Vasistha.
this is what works using Wang’s right pops + added Pinarbasi and satsurblia for more sensitivity to those 2 ancestries. Right Pops below.
EEHG: 0.476 +- 0.017
Geoksyur_EN: 0.291 +- 0.031
Georgia_Kotias.SG: 0.232 +- 0.03
As per Vasistha’s modelling, as much as 30 % of the ancestry in the Steppe Eneolithic samples could have come from SC Asia.
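As a quick sanity check on the figures quoted above, the sketch below sums the three coefficients and derives a rough interval for each. It assumes that the "+-" values are standard errors and that a normal approximation is acceptable; both are my assumptions, not something stated in the quoted output.

```python
# Coefficients quoted in the text: (estimate, reported +- value).
estimates = {
    "EEHG": (0.476, 0.017),
    "Geoksyur_EN": (0.291, 0.031),
    "Georgia_Kotias.SG": (0.232, 0.030),
}


def approx_interval(coef, se, z=1.96):
    """Rough 95 % interval, ASSUMING the +- value is a standard error."""
    return (coef - z * se, coef + z * se)


total = sum(coef for coef, _ in estimates.values())
print(f"sum of coefficients: {total:.3f}")

for name, (coef, se) in estimates.items():
    low, high = approx_interval(coef, se)
    print(f"{name}: {coef:.3f} (roughly {low:.3f} to {high:.3f})")
```

Under these assumptions the Geoksyur_EN share falls roughly between 23 % and 35 %, which brackets the 30 % figure mentioned above.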
Khvalynsk Eneolithic, which is the largest contributor to the ancestry of the Progress and Vonyuchka samples on Vahaduo, is itself modelled as 80 % EEHG (Eastern European Hunter Gatherer) + 20 % Iran_N.
Yet it appears that this Iran_N type ancestry is more akin to CHG and has little to do with Chalcolithic Iran or Central Asia.
Let us run some models on the Western Steppe EMBA samples such as Yamnaya.
Notice how Yamnaya samples from many different regions can be modelled as deriving more than 60 % of their ancestry from Progress EN, a population having 20 % of its ancestry from SC Asia.
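The implication of these two figures can be worked out with simple arithmetic. The sketch below assumes that admixture proportions compose multiplicatively and that none of the SC Asia-related ancestry reaches Yamnaya through its other sources; these are simplifying assumptions of mine, so the result is an indicative floor rather than an estimate. The function name is hypothetical.

```python
def indirect_share(share_from_intermediate, intermediate_share_from_source):
    """Ancestry reaching the target from a source via one intermediate population."""
    return share_from_intermediate * intermediate_share_from_source


# "More than 60 %" of Yamnaya ancestry from Progress EN, used here as a floor.
yamnaya_from_progress = 0.60

# SC Asia-related share in the Eneolithic steppe samples, per the two models in the text.
sc_asia_share = {
    "Vahaduo (Sarazm)": 0.20,
    "qpAdm (Geoksyur_EN)": 0.291,
}

for label, share in sc_asia_share.items():
    result = indirect_share(yamnaya_from_progress, share)
    print(f"{label}: about {result:.1%} of Yamnaya ancestry")
```

On these assumptions, something on the order of 12 % to 17 % of Yamnaya ancestry would trace to an SC Asia-related source.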
Let us also run some models for the Central Steppe and Eastern Steppe EMBA samples.
The presence of admixture from SC Asia is clearly evident in the Central Steppe, but from the above it appears that the Okunevo Bronze Age sample does not have any Turanian admixture, contradicting Narasimhan et al, who argue for 15 % admixture from Iran_N into Okunevo.
An East to West movement south of the Caspian Sea
We can also see that, at roughly this time, there was an east to west movement south of the Caspian Sea, from SC Asia towards the Caucasus, as there is clearly some admixture into the Armenia Chalcolithic samples. The presence of South Asian Y-DNA L1a is also a strong pointer in this direction.
Given the presence of South Asian Y-DNA L1a in Armenia Chalcolithic, the detection of Indian mtDNA M52 in one early Chalcolithic sample from the North Caucasus, and the archaeological evidence pointed out earlier suggesting a possible movement from SC Asia into Maykop territory, admixture from a Sarazm-like population into the Caucasus region, as suggested above, may also turn out to be quite real.
Now one may question what Sarazm has to do with South Asia, but let it be known that the site of Sarazm has yielded several pieces of evidence of contact with the Early Harappans, including evidence of migration of Early Harappans into Sarazm.
Therefore, already in the 4th millennium BC, we can observe a massive spread of Iran N/CHG type ancestry across Eurasia, in regions which are usually associated with the presence of Indo-Europeans. In fact, top geneticists like Reich have already admitted that Indo-European languages likely spread onto the steppe with this Iran N/CHG ancestry.
Going by the evidence presented above, the most likely place from which this Iran N/CHG type ancestry spread across Eurasia is South Central Asia. So this is a question worth probing: did Indo-Europeans spread across Eurasia from SC Asia with the massive expansion of this Iran N/CHG ancestry?