Some admixture coefficients for South Asian Genotype Project members

I ran qpAdm on a large number of the South Asian Genotype Project members. The codes for the individuals should be self-explanatory. The Indus Periphery samples are from the Reich dataset. The Steppe source is all the Sintashta samples from the recent publication, with outliers removed. AHG stands for Andamanese hunter-gatherers, sampled from the Andamans.

Some of the populations are not good fits on the India cline. Adding Dai as an East Asian source improves the fit for the Bengali Kayastha, but it worsens the fit for most of the others.

Please note that these are individuals. There is going to be variance within populations.

| Individuals | Indus_Periphery | Steppe | AHG |
|---|---|---|---|
| AP_vellala_1 | 0.583 | 0.065 | 0.352 |
| Beng_Brah_1 | 0.574 | 0.231 | 0.196 |
| Beng_Kayastha_1 | 0.438 | 0.12 | 0.442 |
| Bihar_Babhan_1 | 0.442 | 0.352 | 0.206 |
| Bihar_Sayyid_1 | 0.47 | 0.28 | 0.249 |
| Chhatt_satnami | 0.453 | 0.178 | 0.369 |
| Guj_Bohra_Patel_1 | 0.767 | 0.214 | 0.018 |
| Guj_lohanna_1 | 0.776 | 0.199 | 0.024 |
| Guj_Patel | 0.71 | 0.105 | 0.185 |
| Guj_Tapodhan_Brah | 0.755 | 0.112 | 0.133 |
| Guj_Vania_1 | 0.534 | 0.227 | 0.239 |
| Guju_Brah_1 | 0.569 | 0.293 | 0.138 |
| Guju_Jain_Brah_1 | 0.545 | 0.291 | 0.164 |
| Guju_Solanki | 0.651 | 0.07 | 0.279 |
| High_caste_nair_1 | 0.784 | 0.052 | 0.164 |
| Indian_GreatAndaman_100BP | 0.1 | 0.007 | 0.893 |
| Jammu_Dogra_Brah_1 | 0.597 | 0.18 | 0.223 |
| Kann_AP_Brah_1 | 0.612 | 0.173 | 0.215 |
| Kann_Brah_1 | 0.506 | 0.113 | 0.381 |
| Kann_Kodava_1 | 0.742 | 0.03 | 0.228 |
| Kash_Suniareh_1 | 0.595 | 0.314 | 0.091 |
| Kashi_butt_1 | 0.571 | 0.275 | 0.154 |
| Kashi_syed_1 | 0.541 | 0.262 | 0.197 |
| Ker_Knanaya_1 | 0.694 | 0.152 | 0.153 |
| Ker_Nasrani_1 | 0.582 | 0.082 | 0.336 |
| Ker_nasrani_2 | 0.648 | 0.075 | 0.278 |
| Ker_Tam_Brah_1 | 0.714 | 0.126 | 0.16 |
| Ker_Varma_1 | 0.741 | 0.098 | 0.161 |
| Kurumba | 0.352 | 0.036 | 0.612 |
| Maha_Kayastha_1 | 0.48 | 0.25 | 0.27 |
| Marathi_Brah_1 | 0.612 | 0.149 | 0.239 |
| Marathi_SKP_1 | 0.458 | 0.245 | 0.297 |
| Marathi_Urdu_Mus_1 | 0.459 | 0.247 | 0.295 |
| Marwari_1 | 0.566 | 0.208 | 0.226 |
| Nepali_Brah_1 | 0.517 | 0.252 | 0.23 |
| Padmashali_1 | 0.541 | 0.099 | 0.36 |
| Pak_Arora_1 | 0.68 | 0.287 | 0.033 |
| Papuan | 0.183 | | 0.93 |
| Pathan | 0.67 | 0.254 | 0.076 |
| Pathan_Yousafzai_1 | 0.725 | 0.277 | -0.002 |
| Punjab_Airan_1 | 0.878 | 0.143 | -0.021 |
| Punjab_Jatt_2 | 0.586 | 0.338 | 0.076 |
| Punjab_Jatt_6 | 0.64 | 0.3 | 0.06 |
| Punjab_Jatt_7 | 0.572 | 0.279 | 0.148 |
| Punjab_Ramgarhia_1 | 0.657 | 0.202 | 0.142 |
| Punjab_Syed_1 | 0.669 | 0.249 | 0.082 |
| Punjabi_Jatt_5 | 0.674 | 0.275 | 0.051 |
| Rajas_Rajput_1 | 0.635 | 0.249 | 0.116 |
| Rajas_Syed_1 | 0.597 | 0.181 | 0.222 |
| Saraswat_Brah_1 | 0.547 | 0.213 | 0.239 |
| Sindhi_lohanna_1 | 0.719 | 0.3 | -0.019 |
| Tam_gounder_1 | 0.606 | 0.056 | 0.338 |
| Tam_Iyer_1 | 0.691 | 0.159 | 0.15 |
| Tam_Iyer_3 | 0.698 | 0.177 | 0.124 |
| Tam_Mudaliar_1 | 0.565 | 0.028 | 0.407 |
| Tam_Naidu_1 | 0.624 | 0.063 | 0.312 |
| Tel_Niyogi_Brah_1 | 0.528 | 0.192 | 0.28 |
| Tel_Reddy_1 | 0.591 | 0.112 | 0.296 |
| Telegu_Raju_1 | 0.683 | 0.004 | 0.313 |
| UP_Awadh_Mus_1 | 0.765 | 0.094 | 0.14 |
| UP_Kayastha | 0.334 | 0.236 | 0.43 |
| UP_mohajjir_1 | 0.519 | 0.243 | 0.238 |
| UP_mohajjir_2 | 0.717 | 0.24 | 0.043 |
| UP_mohajjir_3 | 0.423 | 0.368 | 0.209 |
| UP_Mus_Weaver_1 | 0.585 | 0.17 | 0.244 |
| W_Beng_Brah_1 | 0.511 | 0.224 | 0.265 |
| W_Beng_Kayastha_1 | 0.437 | 0.164 | 0.399 |
| W_E_Beng_Brah_1 | 0.609 | 0.171 | 0.22 |
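
For anyone who wants to sanity-check the numbers, here is a minimal Python sketch using a handful of rows copied from the results above. It checks that each row's three coefficients sum to roughly 1, flags negative coefficients, and gives a rough idea of the spread between individuals sharing a label. The rounding tolerance, the helper names, and the grouping by name prefix are assumptions of this sketch, not part of the qpAdm output.

```python
from statistics import mean, stdev

# A few rows copied from the results: (individual, Indus_Periphery, Steppe, AHG).
ROWS = [
    ("AP_vellala_1", 0.583, 0.065, 0.352),
    ("Pathan_Yousafzai_1", 0.725, 0.277, -0.002),
    ("Punjab_Airan_1", 0.878, 0.143, -0.021),
    ("Sindhi_lohanna_1", 0.719, 0.3, -0.019),
    ("Punjab_Jatt_2", 0.586, 0.338, 0.076),
    ("Punjab_Jatt_6", 0.64, 0.3, 0.06),
    ("Punjab_Jatt_7", 0.572, 0.279, 0.148),
]

TOLERANCE = 0.002  # assumed allowance for three-decimal rounding


def sums_to_one(row, tol=TOLERANCE):
    """Return True if the three coefficients add up to 1 within tol."""
    _, inpe, steppe, ahg = row
    return abs(inpe + steppe + ahg - 1.0) <= tol


def negative_components(row):
    """Return the names of any components with a negative coefficient."""
    labels = ("Indus_Periphery", "Steppe", "AHG")
    return [label for label, value in zip(labels, row[1:]) if value < 0]


def steppe_spread(rows, prefix):
    """Mean and sample standard deviation of Steppe for names starting with prefix.

    Needs at least two matching rows, otherwise stdev() raises StatisticsError.
    """
    values = [steppe for name, _, steppe, _ in rows if name.startswith(prefix)]
    return mean(values), stdev(values)


if __name__ == "__main__":
    for row in ROWS:
        name = row[0]
        if not sums_to_one(row):
            print(f"{name}: coefficients do not sum to 1")
        for component in negative_components(row):
            print(f"{name}: negative {component} coefficient")
    avg, sd = steppe_spread(ROWS, "Punjab_Jatt")
    print(f"Punjab_Jatt Steppe: mean={avg:.3f}, sd={sd:.3f}")
```

A negative coefficient flagged here does not by itself say how bad the fit is; it only marks the rows worth a second look.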

29 Replies to “Some admixture coefficients for South Asian Genotype Project members”

  1. Great! Though the Punjab Arain results seem incorrect with the negative AHG and superhigh InPe.

    Could you run your Bengali ones separately with Dai?

      1. Did you hear of Tao figuring out a simpler way to compute eigenvalues? As someone in medicine, I haven’t used them since undergrad linear algebra and diff eq, but the news sounded cool.

        1. Yes, actually Tao contributed to an existing (physics!) paper. That news was the best thing I read in 2019.

          Though the result was a full specification of eigenvectors from eigenvalues alone (without needing to know the whole matrix).

      2. Could it be something much simpler? It looks to me like only the first two coefficients above were obtained by projecting onto the IP and S components, and the last one was obtained from a sum rule that they should total one… so statistical uncertainties in the first two could conspire in some cases to make the AHG coefficient come out negative. Hence, a lower bound on the statistical error in the numbers above can be estimated from the worst negative offender in the AHG column, which is about 2%. TL;DR: I wouldn’t put too much weight on the numbers above past the second decimal place, and all the negative entries in the AHG column are to be read as 0.

        Razib — do you know if the P in the Marathi SKP stands for Pathare or Panchkalshi?

  2. Great! Though the Punjab Arain results seem incorrect with the negative AHG and superhigh InPe.

    yeah. some of the pakistan groups are getting off-cline.

    yeah i will rerun the bengalis…

  3. Wtf, AHG is literally the African-looking people now living in the Andamans… now I understand why some biracial people look so South Asian.

    BTW which modern population represents Iran HG? Modern Iranians minus the steppe?

    And what do negative percentages indicate?

    1. All mixed-race people produce some ambiguous-looking offspring. I’ve seen some Latinas who can easily pass as native Bangladeshi; some Horner African and Sudanese women can also have a pseudo-South Asian look. Many Gulf Arabs also look pseudo-South Asian. Look at the Bangladesh vs Yemen under-16 football match:
      https://youtu.be/01XfPWIbld4
      Yemenis generally have some significant sub-Saharan African admixture, I think.
      Iranians don’t score higher steppe than most South Asians IIRC. Modern Balochis score highest Iran_HG?

    2. Highest amount of Iran related ancestry is found in Balochis AFAIK. Within Iran it peaks among Balochis, Bandaris and Mazandaranis.

  4. wasn’t the indus periphery in the model like 1/4th AHG, given that was the average AHG of the three individuals used to model it?

  5. The Chatt_Satnami individual (a Chamar-like group) scores lots of steppe, just like the Chamar. It’s odd that this individual had 0% Lithuanian in SAGP, while non-Brahmin Bengalis who consistently score some Lithuanian get less steppe. Perhaps the inflation of extra steppe % in Chamar-like groups is due to artifactual reasons too?
    Is the InPe component 1/4 AHG? That could be the reason why certain groups/individuals score high InPe and low AHG.
    As for Bengalis, I guess the InPe is mostly capturing their Iran_HG, because they’re getting high AHG (which is also capturing their Dai in this case).

  6. Hi Razib,

    I sent you my data last summer (around august, I believe). Any possibility you will be able to add it to your analysis?

  7. lmfao your results for arains and jats is going to make pakdefense forum go absolutely gaga for their long lost brother from the east: Sher Rasgoolah Khan

  8. Ah Pakdefense, those hallowed halls. I was summarily ejected from that forum some time ago for being less than charitable towards certain Islamist views. And by Islamist, I mean actual Islamist, not the term used by right-wingers for any Muslim they disagree with.

  9. Intrigued by the result on Iyers, as this is the first such breakdown I’ve seen. The curious thing is the combination of lower AHG admixture relative to most groups including other South Indian Brahmin groups as well as lower Steppe admixture relative to North Indian Brahmin groups. Also, Iyers have an internal structure, with Vadama Iyers maintaining an oral history of migration post 1000 AD from Gujarat & UP, other Iyer groups having regular intermarriage with the local Deccan Nobility, and some supposedly drawn from local populations. While these internal divisions were documented by anthropologists, with different subsets refusing to intermarry and even break bread together, anthropologists have a tendency to present relatively recent political accommodations as indicative of some primeval structure. So I’ve always been interested to see if each group actually had that history show in the genetics. One complicating factor is that intermarriage between different groups of Iyers & Iyengars increased over the past 3 generations, especially among the cosmopolitan types likely to show up in small samples. Still, interesting to see the results and to speculate.

    1. >lower AHG
      There are different kinds of indus periphery and they have different amounts of AHG so IDK how much of the different kinds of indus periphery they have. The InPe composition in Kalash would have a different ratio of Shahr BA1 vs Shahr BA2 than the InPe in Iyers.

      Furthermore I suspect that the AHG in Rakhigarhi is somehow underestimated in Shinde’s paper, since on Narasimhan’s PCA Rakhigarhi is closer to AASI type sources than what one would expect with a mere 27% AASI input.
      I would expect the AASI ancestry in the models (when completely separated from Iran ancestry) to go up in the IVC samples, InPe samples and in modern south Asians as well when a proper group AASI sample is published.

  10. Looking through the project members spreadsheet, it’s odd how West_East_Bengal_Brahmin_1 scores more Lithuanian than Bihari Babhan and the other Bengal Brahmins but scores much lower steppe on your qpAdm. Is this a reflection of a failure in the old model or some mislabeling in the qpAdm run?

    1. Lithuanian??? They did not exist at the time of the Aryan invasion/migration. When was the contact between them and people in today’s India?

  11. Not to single you out, Kann_Kodava_1, but it is fascinating how high your Indus Periphery is versus how low your steppe ancestry is, and your AASI is on the lower end as well. Kodavas were always thought of as different from the surrounding Kannada populations, and it turns out they are the least steppe-influenced and the most Indus-influenced, if your sample is representative of other Kodavas. Razib, I wonder if they would make a good model of the IVC inhabitants, or at least the first IVC immigrants into south India?
