Why Scythians, Sakas, and Kushanas, are NOT the source of “steppe” ancestry

This is a common question/assertion in the comments pretty much every other week: why couldn’t the documented incursions of nomadic people in the first millennium A.D. be responsible for the steppe ancestry? There is actually a good explanation in The formation of human populations in South and Central Asia, so I’ll quote it:

By the Late Bronze Age, ESHG-related admixture became ubiquitous, as documented by our time transect from Kazakhstan and ancient DNA data from the Iron Age and from later periods in Turan and the Central Steppe, including Scythians, Sarmatians, Kushans, and Huns (29, 52). Thus, these first millennium BCE to first millennium CE archaeological cultures with documented cultural and political impacts on South Asia cannot be important sources for the Steppe pastoralist–related ancestry widespread in South Asia today (because present-day South Asians have too little East Asian–related ancestry to be consistent with deriving from these groups), providing an example of how genetic data can rule out scenarios that are plausible on the basis of the archaeological and historical evidence alone (13) (fig. S52). Instead, our analysis shows that the only plausible source for the Steppe ancestry is Steppe Middle to Late Bronze Age groups, who not only fit as a source for South Asia but who we also document as having spread into Turan and mixed with BMAC-related individuals at sites in Kazakhstan in this period. Taken together, these results identify a narrow time window (first half of the second millennium BCE) when the Steppe ancestry that is widespread today in South Asia must have arrived.

There is now a large database of ancient DNA from Scythians and related groups, thanks to the preservation conditions on the Eurasian steppe. Most of their ancestry derives from the same broad group as the Andronovo horizon, of which the Sintashta were part. But unlike the earlier steppe populations, these groups are highly variable in ancestry and usually carry substantial minority East Asian components. The Indian groups with a lot of steppe ancestry, such as Jatts and Northern Brahmins, lack this East Asian component.

There are two objections. The weaker one is that the authors didn’t have the statistical power to detect the admixture. I haven’t run simulations, but I’m sure they have. If you have Jatts who are perhaps more than 30% steppe, and that ancestry came from these later groups, the authors would have detected trace East Asian ancestry (as you can find in many Muslim individuals from Pakistan).
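
As a rough illustration of the power argument: the East Asian ancestry a group should carry is its steppe fraction times the East Asian share in the source population. The sketch below uses the 30% figure from above. The 10% and 20% East Asian shares for the Iron Age source are purely hypothetical placeholders (all that is established here is "substantial minority"), and the function name is my own.

```python
def expected_east_asian(steppe_fraction, source_east_asian_share):
    """East Asian ancestry inherited if all steppe ancestry came from a mixed source."""
    return steppe_fraction * source_east_asian_share


jatt_steppe = 0.30  # "perhaps more than 30% steppe", from the text

# Hypothetical East Asian shares in the Iron Age source, for illustration only.
for share in (0.10, 0.20):
    frac = expected_east_asian(jatt_steppe, share)
    print(f"source {share:.0%} East Asian -> expect {frac:.1%} in Jatts")
```

Under these made-up shares the expectation comes out at a few percent, which is the kind of trace signal the paragraph above argues would have been picked up.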

The stronger objection is that there is unsampled structure on the steppe, and groups without East Asian admixture that are direct descendants of the Sintashta without dilution. This is not entirely unreasonable or implausible, though at this point I’d say this is unlikely for two reasons:

  1. Central Eurasia is pretty well sampled due to interest and conditions
  2. The steppe ancestry in South Asia is pretty widespread. Hard to imagine it percolating so far in 1,500 years

Also, the statistical tests I’ve done show Bengalis got East Asian admixture about 1,500 years ago, making up 10-20% of their ancestry. The steppe percentage in Bengalis is 10-15%, but I never get any hits for it using older, less sensitive admixture-detection methods. That means the steppe admixture has to be much older than 500 AD.
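
The dating logic here rests on admixture linkage disequilibrium decaying roughly as exp(-n·d), where n is the number of generations since mixing and d is genetic distance in Morgans. This is the model behind LD-based tools such as ALDER. A minimal sketch, assuming 28 years per generation and illustrative dates of 1,500 and 3,500 years ago (the second is a stand-in for a Bronze Age arrival, not a fitted value), shows how much more of the signal survives for the younger event:

```python
import math

YEARS_PER_GENERATION = 28  # assumption; published analyses use values around 25-30


def remaining_ld(years_ago, distance_cm):
    """Fraction of the initial admixture LD left at a given genetic distance."""
    generations = years_ago / YEARS_PER_GENERATION
    return math.exp(-generations * distance_cm / 100.0)  # cM -> Morgans


for label, years in (("East Asian, ~1,500 years ago", 1500),
                     ("steppe, ~3,500 years ago (illustrative)", 3500)):
    at_2cm = remaining_ld(years, 2.0)
    print(f"{label}: {at_2cm:.2f} of initial LD left at 2 cM")
```

With these assumptions roughly a third of the initial LD is left at 2 cM for the younger event and under a tenth for the older one, so a recent pulse is much easier to detect than an old one of similar size.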

48 Replies to “Why Scythians, Sakas, and Kushanas, are NOT the source of “steppe” ancestry”

  1. Do you think that the Jats got their steppe ancestry from the original Indo-European migration or something later (before East Asians started mixing with Central Asians)?

    1. Jats have direct DNA matches with Scythian burials in Ukraine, Russia, Kazakhstan, and the Altai, unlike most other South Asians, and have the most Steppe DNA. I have taken a DNA test and have South Korean, Chinese and Japanese ancestry, and my surname is Kang; the Kang originate from Afghanistan and most likely descend from the Saka/White Huns of Central Asia.

        1. I’m not wrong and everything I’ve written is known already for a long time and proven genetically. I’m a Kang Jat and we’re not of Indian origin and trace our origins to Ghazni Afghanistan like many from NW India.

  2. not sure. the second is not a crazy idea but i found no evidence when i looked

    perhaps i should look harder.

    if the admixture is late enough it should show an LD decay signal and i don’t see one. but they are definitely more steppe i think than even UP brahmins or pathans

  3. Indian literary sources have the following peoples identified in their geographical habitats to India’s northwest and north (in a rough chronological order)

    Before 1000 BCE
    Mleccha – unknown?
    Uttara Kuru – Tarim Basin/Xinjiang – Modern “Tocharians”
    Madra – In the region of modern Afghanistan, below the Chaksu (Oxus) region
    Uttara Madra – Regions beyond Madra
    Bahlika – Bactria
    Huna – from the Chaksu river (Oxus) – also known as Huns
    Parsika – Persia

    After 500 BCE
    Yona – unknown?
    Yavana – also attested to Greeks
    Sakas – Distinct from Persians (Parsikas) or Scythians
    Kambojas – also a sub-clan of Sakas

    They were distinct from “us” – that is very clear. One of the texts that is observant of this difference – Pandu in Mahabharatha wants to marry Madri, sister of Shalya – the king of Madra. Shalya asks for dowry, which is the custom there. Pandu is taken aback at this reversing of custom. But he agrees eventually and pays the dowry.

    There is definitely unsampled structure in the regions beyond Helmand and Oxus rivers. There is distinct nomenclature for many groups in attested literature.

  4. Great article, but Scythians, Hunas, Sakas who entered NWest of Ind Subcontinent are supposed to be Indo-Iranian group not someone with East Asian ancestry. I don’t think their genetic make up was any different to Indo Aryan, it is just like they came at some later age.
    Bengalis don’t have any Scythian relation, as Saka etc had not reached to that region.

    1. “Great article, but Scythians, Hunas, Sakas who entered NWest of Ind Subcontinent are supposed to be Indo-Iranian group not someone with East Asian ancestry. I don’t think their genetic make up was any different to Indo Aryan, it is just like they came at some later age.”

      you are engaging in circular reasoning. indo-iranian ancestry does not mean no east asian ancestry.

      the people sampled WERE sakas, scythians, hunas. these people had a long and early existence on the steppe before they arrived in south asia. all of them have substantial east asian (siberian) admixture, probably due to mixing with proto-turkic groups in the altai (where r1a is at high frequency).

      again, there has been a lot of analysis of the steppe iranian groups. all of them become east asian mixed during the iron age, probably as a preface to later turkic migrations west. it could be that somehow the ones who went to south asia centuries later were different and isolated somehow. but that’s not parsimonious (though possible).

  5. “Central Eurasia is pretty well sampled due to interest and conditions
    The steppe ancestry in South Asia is pretty widespread. Hard to imagine it percolating so far in 1,500 years”
    Nobody is saying all steppe ancestry is from Saka, Huna, Kushanas. But there is a possibility in groups of NWest who show high steppe ancestry to be a result of Sakas, Hunas, Kushanas, Parthians etc. The Scythians or Sakas, Hunas who entered India are supposed to be different to Central Asian ones. These were basically groups who lived in East Iran areas and later got pushed towards Subcontinent. I don’t think they would be carrying any East Asian ancestry.

  6. “By the Late Bronze Age, ESHG-related admixture became ubiquitous”

    So Turks were moving westward long before the disintegration of Xiongnu empire (around 90 BC)?

    It seems proto-Turkic split from proto-Turco-Mongol branch of Transeurasian language around 2800 BCE. These Turks must have moved westward in subsequent period.

    “Robbets(2017) dated the split of proto-Turkic from earliest proto-Turko-Mongol language to 2800 BC”
    The Oxford Guide to the Transeurasian Languages 2020 pp-755

    Turks and their ancestors seem to have initially been an agriculture-based population located around Northeast China and associated with some Neolithic culture. This is based on recent linguistic studies.
    Sino-Tibetans were a mountain-wandering group situated in Nepal, NE India and southern Tibet. Turks were the original inhabitants of Neolithic North China.

  7. In the same paper, Narasimhan et al. found Kangju who had low to negligible East Asian ancestry. They were also known as White Huns. In Indian texts, they are mentioned to have invaded India. And there we have it — the people who could have contributed Steppe ancestry in India.

    1. They were also known as White Huns. In Indian texts, they are mentioned to have invaded India. And there we have it — the people who could have contributed Steppe ancestry in India.

      the dates are so late. they’d show up in ALDER. never do. what’s up with that?

      (again, my east asian at 500 AD shows up quite easily)

  8. @tpot and rs
    The extra steppe couldn’t be because of Kangju either. I have gone over this in the past. The non kangju ror component has a lot more steppe than non kangju brahmin component. There could be low level admixture of this type in many castes but it certainly does NOT explain the extra steppe in rors.

  9. @Razib, @DaThang,

    Are there additional samples and studies about current Indian population castes after Moorjani et al. (2013)? I found Mallick et al. (2016) and Narasimhan et al. (2019) referencing more population groups (214, about twice as many as Moorjani). But one thing I found from Razib’s recent questions on caste endogamy is that there are about 110 different SC and OBC castes within Andhra Pradesh, while Moorjani et al. pick 7 to be representative of Andhra Pradesh: 2 SC, 1 ST, 2 OBC, 1 UC-sudra, and 1 UC-non-sudra.

    Given sub-definitions within Velama and diversity of Naidu label, I just feel like it is too little to really understand what is going on. It is also strange that Madiga (SC in South) gets ANI (can be used as proxy for steppe) around 1500BC according to Moorjani et al. (leaving about 300-500 years from NW steppe entry to all the way to Deccan dalits), but we are wondering if 1500 years is enough from Sakas et al. to be widespread within the subcontinent?

    1. FYI, communities like Jats are not found in any eastern region beyond West UP; they are found only from the Northwest to West UP. This creates enough doubt about their being new entrants to Indian society.
      @Razib @Violet

  10. “gets ANI (can be used as proxy for steppe) around 1500BC according to Moorjani et al. (leaving about 300-500 years from NW steppe entry to all the way to Deccan dalits), but we are wondering if 1500 years is enough from Sakas et al. to be widespread within the subcontinent”

    the moorjani paper is out of date. don’t use those numbers. the madiga shouldn’t have much steppe at all, all their ANI should be IVC

    also, no reason why they have to be where they are now 3,500 years ago.

    by 500 AD a lot of the groups were in place since we got records

    1. I understand Moorjani is out of date but there isn’t as clearly noted comparison in latest versions. I searched for Narasimhan et al but it isn’t as easily split to get direct dates or split percentages.
      In any case, how much Steppe in Madiga isn’t the question, it is when the Steppe did get into Madiga. It appears it happened quite early.

      If we are going to say that South Indian SCs with high AASI moved from NW India to Deccan, between 1500BC – 500AD, then why not call the whole AASI intrusive from out of subcontinent too (e.g., from SE Asia via seafaring, not too far-fetched if Andamanese are related to Amazon tribes). I thought the idea was that AASI stay-put and Steppe came into wherever AASI were around.

      1. “In any case, how much Steppe in Madiga isn’t the question, it is when the Steppe did get into Madiga. It appears it happened quite early.”

        you need to engage in some reading comprehension. the ANI != steppe. i made that clear.

        the groups like madiga get their ANI from the IVC. do i make that clear?

        “If we are going to say that South Indian SCs with high AASI moved from NW India to Deccan, between 1500BC – 500AD, then why not call the whole AASI intrusive from out of subcontinent too (e.g., from SE Asia via seafaring, not too far-fetched if Andamanese are related to Amazon tribes). I thought the idea was that AASI stay-put and Steppe came into wherever AASI were around.”

        most of the west eurasian ancestry in south asia is not from the steppe

        1. See my comment below. I wasn’t saying ANI == Steppe, but from your own graphic, ANI = Steppe + IVC. Sorry to be testing your patience.

        2. Problem with Aryan Theory is Lack of origin. There is no archaeological evidence which suggests/indicates a large presence of Indian culture outside the Subcontinent. You can’t create an outside culture origin theory if you can’t find its outside origin. In such a case the theory does not even stand very basic scrutiny. … Only basis of such theory would be our racist mindset developed over time which thinks that ‘everything came from white People’.
          Obviously mixing predates Saka & Kushans, but … what Saka, Kushan history tells us is that a lot of central asian tribes mixed in india but they accepted local superior culture & did not bring anything substantial influence of their native place/culture on India.

          1. It seems you are a bit confused. You say that ‘Aryan Theory is Lack of origin’ but immediately you say that they came from Central Asia, that they (Saka and Kushans) are ancestors of some people in India while their ancestors are ‘Aryans’. It means that those ‘Aryans’ have origin (in CAsia)? But, they did not bring anything and “accepted local superior culture & did not bring anything substantial influence of their native place/culture on India”. What about language (Sanskrit) and epics/scriptures written in this language? What do you know about mythology outside of India (e.g. in today’s Europe) and are you sure that is nothing there similar to Hinduism?

          2. You are confusing central Asian mixing with cultural aspect . Central Asian mixing is always been case for centuries central Asian tribes came to India . Word “Aryan” is used to denote pioneer of Hinduism , there is no hard evidence of any kind that suggest Indian culture origin is outside . Quite opposite ancient Indian considered people beyond Afghanistan as “uncivilized barbarians” , Why would they do that ? if their ancestors came from central Asia ?. Language , Knowledge , Math , numbers spread like Chain reaction ..more possibility of small scale trade & migrations from India to outside – opposite . But No tribe even in western India with highly mixed from central Asian gene pool trace themselves “culturally” to Central Asian or any other part of world ..

          3. There is something new in your comment. Before recent synergyisation into IT we had AMT (Aryans came from somewhere to SAsia) and OIT (people left today’s India and populated Europe, Russia, North America, etc). Now, we have both, AMT+OIT. Brilliant. No one could remember to propose this. All happy. It was not so obvious (to former OIT) that anyone could come to India. How ancient Indian knew people beyond Afghanistan and how they knew that they were barbarians? Who executed this chain reaction? Any evidence (genetic, any) of outbound migration? It seems that we all settled on genetic input to SA but the assertion is that ‘barbarians’ came without any culture and holding only dicks in their hands. The simplest resolution is to explain the language. Where Sanskrit originated and how influenced almost all Euro languages, who created Rg Veda? Re: Rg Veda, did you the get vine message in Razib’s part#1 (In vino veritas!)? If you did not, go back to the drawing board, check the cabling, clean your data or change your program compiler.

            For you AK-47, there is one example of cultural synergy you were probably talking – Aryan/Gypsy (Serbian) orchestra performing a chain reaction concert in Poland (in 97, not in 47) in Roma language and chanting your name (Sereno will love it).

            https://www.youtube.com/watch?v=UqOL7LOR6ko

          4. I think the areas of Europe , Central Asia Where high influence of Sanskrit exists today Probably People in that area had Indian ancestors . Likely Some Indian migration happened & They established themselves in those pockets , later they mixed with rest of Europeans or central Asians . Just like Indian Numbers took over Roman numbers due to its sophistication . These People had immense linguistic impact on rest of Europeans , Rest of Europeans took Sanskritic from these people , mixed with Primitive languages of Europe & that’s birth of Hybrid Indo-Europeans languages .

          5. You are a real discovery I would say the best hidden BP secret. I hope you liked the above A&G music with chanting your surname. You are giving us insights in the former OIT (RIP) theory which usually only criticized AMT but kept their cards close to their chests and did not give much away except stories about elephants and mouses.

            It is very interesting your discovery about Sanskrit pockets in Europe and elsewhere before diffusing to the rest of the world. I was wondering how, for e.g., Serbian language (and mythology) is so similar to Sanskrit but no other surrounding languages (Hungarian, German, Romanian, Italian, Albanian, Greek, French, etc.). It means that one (maybe the only) Euro pocket was in today’s Serbia? What is more interesting, this pocket probably consisted of descendants of CAsian barbarians (because they have the same genetics), who came to India, took the knowledge, culture and maths and went back to the pocket to later spread all this around the Europe. Razib would probably call them (anti)reflux wave. Their influence was probably so strong that locals have forgotten their language which they spoke for almost 10000 years (at least since Ice Age) before Sanskrit came to Europe (assuming that it was 4000 years ago). It would be interesting to estimate how much time (proto)Sanskrit needed to develop and where this happened? You may tell us in one of your future comments. On Razib’s behalf – please do comment more often.

          6. There is no lack of clarity at all.

            The migration trail is as follows:
            Fatyanovo Culture (Russian Forests) > Sintashta Culture (Uralic Steppe) > Andronovo Culture (Kazakh Steppe) > Vakhsh Culture (Tajik Highlands) > Painted Grey Ware Culture (North West India).
            Got it?

          7. Ya , i did watch the video . It was great , Feels good sharing roots of culture in other part of world. I think ur “reflux wave” or Pocket Theory is quite interesting . it does make sense considering Indian north west high on central Asian gene , First influx in India mixing with local culture then this mix arrive into Europe/central Asia … Serbia , Romania region Probably the Pocket , Their Dominance over other Europeans probably Sanskratized the whole region , I think language transformation is quite fast . Time period would have been quite short , considering how fast today languages are changing …. It was a good talk & great insight , thanks .

          8. No worries, glad you like the people cheering you. In fact, there was not Sanskritized the whole region, only Serbs but I think because Serbs were very smart while all around were so dumb. Not really dumb, they actually did not exist at all at that time, Serbs were very lonely in their pocket and all other came thousands of years later. Stay cool!

  11. I think that the Vedic era ancestors of Brahmins and Kshatriyas would have been similar to Jats and Rors. This is after the earliest south Asian admixture with Indo-Aryans. The ancestors of Brahmins and Jats + Rors would have belonged to different societies. The Brahmin Vedic society later mixed with more local elements, but since ancestors of Jats and Rors weren’t a part of Vedic society, they didn’t get the same kind of mixture. All of this would have culminated by 2000 years ago at most.
    ^ Could this hypothesis be checked/ruled out using modern samples Razib?

    1. In that case we should find archaeological evidence of ‘Brahmin’ Hindu origin in Central Asia. Do we find something like the Ajanta Caves somewhere in Central Asia? We don’t. Lack of origin: basic scrutiny destroys the Aryan theory.

      What Saka & Kushan tell us that central Asian Tribe accepted local Superior culture … they kind of abandoned their native culture , which might happened with ‘Aryans’ as well & They were nothing but most likely ancestors of Scythian , saka .

      1. The Rig Vedic People never mention building caves or temples etc.
        Even today, the Mandap for Atiratra Somayaga as used by the pure Vedic Nambudiri Brahmins is STILL a temporary enclosure, to be given to the Lord Agni at the end.
        But regarding Vedic materials, we indeed have found a lot of chariots, Ashvamedha Yagya sacrifices, fire pits etc. in the Central Asian Steppe.

    2. Pure joke. Most of these North West groups are located where constant migration of Parthians, Sakas, Hunas, and Kushanas happened. What you are suggesting is almost impossible. On the other hand, there is a high chance of these groups mixing with incoming North West populations.

  12. I don’t know how to paste tables or infer qpAdm values, but Narasimhan provides these, and Steppe looks about the same to me between SCs and Reddy/Kapu UCs in Deccan. The last three values are AHG, InPe and Central-Steppe-MLBA, in that order in each row. Note that dates for mixing for other groups are aligned with ANI mixture dates provided by Moorjani et al. but not provided for Madiga.

    Group            N   Generations  Date (95% CI)               (unlabeled)  AHG    InPe   Central-Steppe-MLBA
    Brahmin_Vaidik   27  106 ± 7      1015 BCE (1394-636 BCE)     0.003        0.306  0.540  0.154
    Reddy_Telangana  8   104 ± 12     974 BCE (1655-294 BCE)      0.002        0.365  0.560  0.075
    Pattapu_Kapu     5   103 ± 17     947 BCE (1917 BCE – 45 CE)  0.003        0.415  0.539  0.046
    Madiga           3   ..           ..                          0.005        0.478  0.466  0.057
    Mala             13  117 ± 9      1323 BCE (1817-829 BCE)     0.070        0.503  0.433  0.064

  13. “Note that dates for mixing for other groups are aligned with ANI mixture dates provided by Moorjani et al. but not provided for Madiga.”

    no power based on sample size

    1. Mala samples are 13, and from our last discussion on sample size, I thought you mentioned about 5 samples are plenty to infer this? Perhaps you mean proportions and not dates?

      Just to be clear, Mala are the other SCs with same caste hierarchy as Madiga. Therefore, I am treating them about the same for inference.

  14. “Mala samples are 13, and from our last discussion on sample size, I thought you mentioned about 5 samples are plenty to infer this? Perhaps you mean proportions and not dates?”

    yeah, i think you should treat them same.

    two points

    1) i believe these are from SNP array (100,000 markers not 10 million). that makes a difference
    2) for ALDER based estimates you need to compute LD stat and that needs a non-trivial number of individuals. proportions are measured against a reference panel. so 1 is fine. but LD is measured WITHIN the *population*

    1. #2 above, noted!
      I was thinking the same given their 95% confidence intervals. It is much tighter when sample size is above 20 individuals.
      But one interesting thing to note is how all the Andhra populations show the admixture event with mean of around 1000BCE, irrespective of caste status. Even with small sample sizes, as a whole, it doesn’t seem to be a bad estimate that Steppe arrived to Andhra region by 1000BCE. Particularly we know Lambadi (known nomads from Rajasthan area) have a mean later date than local groups, and not counting them.

      Whatever happened between 2000BCE-1000BCE, oh man, it was interesting!
      I don’t know what’s up with Vysyas though, with lowest steppe qpAdm values out of this group and earliest date (with lower bound pushing close 2000BCE) and known strict endogamy.

      Group            N   Date (95% CI)
      Brahmin_Vaidik   27  1015 BCE (1394-636 BCE)
      Panta_Kapu       18  1453 BCE (2140-765 BCE)
      Reddy_Telangana  8   974 BCE (1655-294 BCE)
      Lambadi          5   652 BCE (1911 BCE – 943 CE)
      Vysya            43  2310 BCE (2682-1937 BCE)
      Naidu            8   1093 BCE (2097-88 BCE)
      Bestha           5   1414 BCE (2438-391 BCE)
      Pattapu_Kapu     5   947 BCE (1917 BCE – 45 CE)
      Yerukali         7   909 BCE (1780-38 BCE)
      Mala             13  1323 BCE (1817-829 BCE)

  15. Maybe Jats are refugees from Central Asia fleeing a Turkic or Proto-Turkic westward expansion?

    Explains why they have no East Asian and their position in the traditional caste hierarchy. India has been a landing spot for all sorts of refugees since time immemorial.

    They worked as labourers and like their ancient ancestors eventually took over the place. Jai shree Indrapreet!

  16. Just a speculation: Looking at the naming “Jats and Rors” Saka/Sarmatian (non-vedic group) living in Afghanistan.
    I have a language (distinct) where we call one set of our fathers “Matti/Mati/Mattido.”
    Vedic people call “Pati/Poti” for fathers.
    Just for information/reading (don’t share with common folk): if you guys have any coin collections, please contact your archeologist

    1. Aren’t Saka and Scythians synonyms? Sarmatians were Serbs. ‘Mati’ in Serbian is ‘mother’. Guys with (bit)coin collections, please contact me.

  17. “Taken together, these results identify a narrow time window (first half of the second millennium BCE) when the Steppe ancestry that is widespread today in South Asia must have arrived.”

    What does evidence say in terms of extent of steppe ancestry in Turan and Swat during this period? Was that enough to change trajectory of cultural and genetic make up we see in modern India?

    I understand that steppe ancestry in Swat up to historic times was very disappointing. Maybe future samples will change this. But for now, wouldn’t it be possible that the initial steppe ancestry was limited, and the extent of its current spread is the result of growth after that limited steppe ancestry integrated with the IVC? They were relatively more successful as India was populated over the next 2,000 years. That would also be in line with cultural continuity from the IVC.

  18. I’m continuously baffled by how “pure” these steppe groups remained despite how far in time and space they drifted from the original source.

    Early CWC was in the Baltic; they expanded to the rest of upper Europe afterwards and developed a relatively consistent genetic profile of 70% Yamnaya and 30% EEF across that territory. Fatyanovo-Balanovo in Russia was essentially identical to Sintashta, which was living just outside the border of Europe (but still very far from the Baltics).

    Most of the MLBA Kazakhstan sites (Andronovo?) appear essentially identical to Sintashta as well. Even the far-flung Krasnoyarsk MLBA in central Siberia appears overwhelmingly Sintashta/CWC-like, with just a tad extra West Siberian hunter-gatherer ancestry.

    Then you’ve got Dashty Kozy in Kazakhstan which was 85-90% MLBA steppe. On the northwestern corner of Eurasia, the Scottish and Irish are 50% Yamna, similar to Norwegians and Russians, despite being far removed from the location of the original CWC.

  19. Wow, I’m really impressed with how carelessly the authors excluded the possibility of Scythian admixture based on a generalization of “East Asian admixture” among a relative handful of samples. They’ve overlooked the obvious objection that it’s likely that NOT ALL Scythians/Saka had East Asian admixture (which was relatively small anyway), and which has mostly been uncovered on the eastern fringes of the Steppe. Since they were characteristically nomadic and covered such a large span of land, it’s quite possible that we are really just looking at late Scythian influence towards South Asia.
