Why physical appearance is an imperfect individual proxy for ancestry

Kalash children

Pictured above are some Kalash children. You notice in the foreground and center a child who could easily pass as European and draw no notice on the streets of Gdansk, Poland. But look at the child right behind her: I would guess she’d draw no notice on the streets of New Delhi!

Though the Kalash are noted for their fair features, most of them look more West Asian than anything else, and from what I can tell as many have a “northwest Indian” phenotype as a “European” one. Genetically we know that they are good proxies for “Ancestral North Indians” (ANI). About 30% of their ancestry can be modeled as deriving from steppe peoples such as the Sintashta, that is, the Indo-Aryans. The other ~70% of their ancestry is similar to that of the Indus Valley Civilization (IVC) people, which itself can be decomposed as mostly ancient Southwest Eurasian-adjacent ancestry (i.e., derived after the Last Glacial Maximum from the ancestors of Zagros farmers) and a minority of ancestry that is more like that of Andaman Islanders and pre-Neolithic Southeast Asians (“Ancient Ancestral South Indians,” or AASI).

Another thing to note about the Kalash is that they are genetically very homogeneous. This is because they live in an isolated region, and their non-Muslim religion means that they have not intermarried with nearby Muslim people. What does this imply? It means that the Indian-looking girl is exactly the same ancestrally as the European-looking girl. Both have the same proportion of AASI and Indo-Aryan ancestry. That being said, the Indian-looking girl exhibits features more like those of the AASI than the European-looking girl does. Why?

The simple reason is that the genes which vary and encode salient physical features are a much smaller subset of the total genome. Because so few loci are involved, the ancestry they happen to reflect is subject to much higher variance from individual to individual than genome-wide ancestry is (lower N in the denominator).
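To make the "lower N in the denominator" point concrete, here is a minimal simulation sketch. It assumes a toy model that is not in the post itself: every person has the same true ancestry fraction p from one source population, and each allele copy at each locus is drawn independently with probability p (real genomes have linkage, so this only illustrates the scaling). The function names and the locus counts are hypothetical choices for illustration.

```python
import random
import statistics


def simulate_ancestry_fractions(p, n_loci, n_people, rng):
    """Return, per person, the fraction of 2 * n_loci allele copies from the source.

    Assumption (toy model): every allele copy independently comes from the
    source population with probability p.
    """
    copies = 2 * n_loci  # diploid: two allele copies per locus
    fractions = []
    for _ in range(n_people):
        from_source = sum(1 for _ in range(copies) if rng.random() < p)
        fractions.append(from_source / copies)
    return fractions


def expected_sd(p, n_loci):
    """Binomial standard deviation of the fraction: sqrt(p * (1 - p) / (2 * n_loci))."""
    return (p * (1 - p) / (2 * n_loci)) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    p = 0.30  # same genome-wide ancestry fraction for everyone
    for n_loci in (5, 50, 5000):
        fractions = simulate_ancestry_fractions(p, n_loci, 200, rng)
        print(
            f"{n_loci:>5} loci: simulated SD = {statistics.pstdev(fractions):.4f}, "
            f"binomial SD = {expected_sd(p, n_loci):.4f}"
        )
```

Under these assumptions, people with identical genome-wide ancestry still differ a lot at a handful of trait loci, while the spread shrinks toward zero as the number of loci grows.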

Here’s a concrete example. Compare inferring eye color to inferring your total ancestry. Modern SNP-array ancestry inference relies on 100,000 to 1 million genomic positions. It is a pretty good proxy for the 10 to 100 million SNPs, out of your 3 billion base pairs, that define your variable ancestry. For eye color, there are a few dozen genes at most, and more honestly a handful that really impact variation. For Europeans, 75% of the variation in blue vs. non-blue eye color is due to variation around one genetic region, the HERC2-OCA2 locus. This means that just because someone has blue eyes, you can’t be sure that they have much European ancestry at all!

In the 1000 Genomes South Asian populations the SNPs for “blue eyes” are at 2 to 10% frequency, depending on the population. Since the expression is recessive (you need two copies of the “blue eye” variant), assuming just this SNP and random mating you’d expect roughly 0.04% to 1% manifestation of the characteristic in Indian-origin populations. The people with blue eyes have no more or less European ancestry than anyone else in their family.
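The arithmetic behind that range can be sketched in a few lines. This assumes Hardy-Weinberg proportions (random mating) and a single fully recessive variant, which is a simplification; the function name is made up for illustration.

```python
def recessive_trait_frequency(allele_freq):
    """Expected share of people carrying two copies of a variant.

    Assumption: Hardy-Weinberg proportions, so the homozygote frequency
    is the allele frequency squared.
    """
    return allele_freq ** 2


if __name__ == "__main__":
    for q in (0.02, 0.10):  # the 2% and 10% allele frequencies quoted above
        share = recessive_trait_frequency(q)
        print(f"allele frequency {q:.0%} -> expected trait frequency {share:.2%}")
```

A 2% allele frequency squares to 0.04% and a 10% frequency to 1%, in line with the range given above.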

Where does this leave us? You should understand from this that within a given family or ethnic group there is going to be a range of appearances, and a range is normal within many groups without exotic ancestry. Most Bengalis have 5-20% East Asian ancestry (closer to 5 in West Bengal, closer to 20 in Comilla and Chittagong). This means most of their ancestry is South Asian, and most Bengalis look just like other Indian-origin people. But a substantial minority look somewhat East Asian, to varying degrees. This is exactly what you expect when you have a minority quantum of ancestry.

Finally, many of the commenters here made a lot of assumptions about vloggers talking about their ancestry and were quite rude. I wish you wouldn’t do that. As a matter of fact, many of the inferences may actually be correct, but you don’t know for sure, and you don’t know the whole story. I’m pretty liberal on the comments of this weblog, but if you exhibit a serial pattern of rudeness I’m going to start randomly deleting your comments (if you complain about this I will immediately ban your IP).


18 Replies to “Why physical appearance is an imperfect individual proxy for ancestry”

  1. Can you ever please give a list containing genetic makeups of all ethnicities, religions, races and tribes of the world? And also the genetic gaps between the groups?

  2. This is very interesting to me. Thanks RK. I’d love to see an analysis on the Hunza people. The Hunzas look very similar to the Kalash people, and they live very close to one another. However, the Hunza should *NOT* have any Steppe Ancestry, since they don’t speak an IE language.

    1. “However, the Hunza should *NOT* have any Steppe Ancestry, since they don’t speak an IE language.”

      there is plenty of steppe ancestry in non-indo-european ppl. it is present in south india, among the basques.

        1. @Brown_Pundit_Man
          Why do you think that Steppe Ancestry is present in the Hunza, Basque (i.e. Eskaru), and Dravidian

          Because extensive genetic testing shows Steppe ancestry in all those groups. The puzzle regarding Basque in particular now is why they don’t have an IE language, as they’re not really all that different genetically from the French, Germans and Brits.

          1. in 2010 even using simple admixture it was obv french basques didn’t have a ‘caucasus-like’ element that french had. it’s not insignificant. basques clearly have more WHG and more EEF % wise than neighbors. that being said, they clearly had a steppe influx. how? why?

            they’re mostly R1b. that’s from the steppe. and guess what, basques were often said to be matrilineal. so that might explain it.

            (there is as you know 5% steppe in non-brahmins in south india and there’s R1a)

  3. “Can you ever please give a list containing genetic makeups of all ethnicities, religions, races and tribes of the world? And also the genetic gaps between the groups?”

    there are plenty in the scientific literature. don’t you have any data analysis skills yourself?

  4. This is one good point.

    As a non-biology person, another thing I didn’t understand before reading some stuff here is that it is possible to model populations in different ways.

    So for example Bangladeshis are East Asian shifted relative to people from Patna.

    So maybe you can model them Patna + East Asian.

    But maybe you can also model them as Pakistani Punjabi + Indian north east + South Indian

    And of course you can model them as being 100% Bangladeshi Ancestry.

    100% Bangladeshi model isn’t that interesting for most South Asians. I mean you don’t need a DNA test to tell you that most likely.

    But knowing that Bangladeshis are East Asian shifted relative to people from Bihar, combined with understanding how looks are imperfectly correlated with genetics (i.e., even fraternal twins can look quite different), is interesting. It also clears up confusion about why different South Asian groups look different on average while still having significant overlap.

  5. “But maybe you can also model them as Pakistani Punjabi + Indian north east + South Indian”

    i have told indian ppl that east bengalis can be modeled as:

    15% burmese. 15% punjabi. and 70% reddy.

    1. “I don’t know. The Indian-looking girl just looks West Asian. There’s Iranians with that look.”
      Doesn’t look Iranian to me, maybe a Balochi look? That look is present among regular South Asians as well.
      “Two of these Kurdish girls look more Indian than the child above.”
      https://i.dailymail.co.uk/i/pix/2014/07/29/article-2709336-2019018E00000578-978_634x578.jpg
      I assume the two girls on the right side are the ones who look Indian? Looking closely at their traits, I can observe that the leftmost girl and the second girl on the right side mostly have Neolithic farmer traits; the other two have more robust Palaeolithic traits. The second girl on the left has a striking malar bone and malar fat, a feature I see in South Asians of the robust variant.

  6. “Indian looking” is an accurate term on the whole except on its extremes. These are looks that are neither Iranian nor Burmese but a whole range of complexions and features in between and beyond. Nothing typical, yet a type of look that encompasses both Punjab and Kerala because it is possible to have looks emanating from one that are typical of the other.
    RK’s article should be widely read in Kashmir where people believe they have some exotic Central Asian ancestry unconnected to South Asia.

    1. The Indian look is primarily a product of IVC+AASI. I often wonder why a significant share of West Asia’s population exhibits an Indian look. My guess is Basal Eurasian+ANE+some sort of Paleolithic connection is behind that.

  7. Would it be more correct to say that physical appearance informs that a certain ancestry is present (at non trace levels) but not the amount?

    And it must work for some cases, ie northern Europeans having more Steppe ancestry have similar phenotypes while southern Europeans with more EEF ancestry have somewhat different ones.

    Might be wrong of course, in theory genetic ancestry and admixture stop being closely associated after a while under some circumstances.


  8. “You notice in the foreground and center a child who could easily pass as European and draw no notice on the streets of Gdansk, Poland.”

    ..and will the same child be conspicuous on the streets of warsaw?

    do you think using fancier town names makes you look smarter?

    sorry, couldn’t resist a bit of trolling after a long time. 🙂

