Population structure in West Bengal and Bangladesh

By Razib Khan 15 Comments

The Genomes Asia 100K has put their Indian paper out. It’s OK, and mostly focuses on the fact that Indians are enriched for inbreeding vis-a-vis other world populations. There are several layers to this. In some cases, as among South Indian Hindus and Muslims, there is cousin-marriage. But, in other cases, for example, Scheduled Castes and Scheduled Tribes, there seem to be extreme bottleneck effects due to delimited marriage networks. Finally, even among large population groups, such as Iyers, there seems to be some elevation of runs of homozygosity due to endogamy.
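For readers unfamiliar with runs of homozygosity, here is a toy sketch of the idea. This is not the preprint's method. The 0/1/2 genotype coding, the function names, and the `min_len` threshold are all assumptions made for illustration, and the two genotype lists are made-up data. Real callers work on physical length along the chromosome and tolerate the occasional heterozygous or missing call.

```python
def runs_of_homozygosity(genotypes, min_len=3):
    """Return (start, end) index pairs (end exclusive) for homozygous runs.

    genotypes: per-SNP alternate-allele counts (0, 1 or 2); 1 = heterozygous.
    min_len: hypothetical minimum number of consecutive homozygous SNPs.
    """
    runs = []
    start = None
    for i, g in enumerate(genotypes):
        if g != 1:
            # Homozygous site: open a run if none is in progress.
            if start is None:
                start = i
        else:
            # Heterozygous site: close the current run, keep it if long enough.
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(genotypes) - start >= min_len:
        runs.append((start, len(genotypes)))
    return runs


def froh(genotypes, min_len=3):
    """Fraction of SNPs that fall inside a run of homozygosity."""
    if not genotypes:
        return 0.0
    covered = sum(end - start
                  for start, end in runs_of_homozygosity(genotypes, min_len))
    return covered / len(genotypes)


# Made-up genotypes: frequent heterozygous sites vs. long homozygous stretches.
outbred = [0, 1, 2, 1, 0, 0, 1, 2, 1, 0, 1, 2]
endogamous = [0, 0, 2, 2, 0, 1, 2, 2, 2, 0, 0, 2]

print(froh(outbred), froh(endogamous))
```

The second list has only one heterozygous site, so nearly all of it sits inside long runs. That is the pattern you expect when both parental haplotypes trace back to a recent shared ancestor, whether through cousin marriage or a small endogamous marriage network.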

But that’s really not what I’m interested in. This preprint has a lot of Bengalis from Birbhum district in West Bengal of various castes. The UMAP (an advance over PCA in some ways) figures aren’t super informative, but you can see that their pooled sample recapitulates the Indian subcontinent. In fact, West Bengalis on the whole are to the “west” of the Bangladesh samples. Totally unsurprisingly.

The main reason I’m putting this post up is the UMAP plot below. It’s hard to read (they will clean it up for final publication), and I don’t know all the castes (I’m assuming “Nabasudra” is a typo). But some things jump out:

1) Bengali Brahmins are distinct.

2) Kayastha are generic West Bengalis.

3) Some of the West Bengal samples are in the Bangladesh (collected from Dhaka) distribution. These are probably descendants of Bangal migrants from the east.

4) Some groups are very distinct. That’s partly due to strong endogamy, and in the case of Santhals high East Asian ancestry (they’re Munda). Other groups are less distinct. The “Namasudra” seem to be two groups. One overlaps with the main Bengali cluster (slight bias toward Bangladeshis), while a second group is shifted toward Scheduled Castes.

I assume readers can make heads or tails of this better than I can, as I don’t know much about caste in West Bengal (and yes, the figure is very badly labeled/colored; this is a preprint).

Addendum: No comments about Jatts please. I will delete them.


15 Replies to “Population structure in West Bengal and Bangladesh”

  1. Is the extent of inbreeding/IBD score here consistent with the 2017 supplement from the gene discovery paper or are some details different?

  2. @Razib Why aren’t there studies with Maharashtrian samples? I usually see Bengalis, Punjabi, Gujarati, and South Indian. Is there a reason?

  3. I’m not sure I understand the UMAP clustering very well – is this different to a standard PCA of two variables?

    Why are Dalits / Chamars / PJL on the opposite end of the spectrum to Bengalis and also South Indians?

    Specifically with Bengalis:
    – interesting to see a greater cline towards Santhals for scheduled castes and some non-upper caste W Bengalis compared to Bangladeshis – as suspected
    – given the presence of more ‘structure’ in these well sampled West Bengali samples, one would assume that this may exist to a greater degree in Bangladeshi samples. It is tiring to only have BEB academic samples to compare with – I’m sure it’s hiding more variance / structure
    – also, I don’t find the K4 admixture chart very informative

  4. their admixture chart is literally useless. the issue is their caste groups are too endogamous.

    – given the presence of more ‘structure’ in these well sampled West Bengali samples, one would assume that this may exist to a greater degree in Bangladeshi samples. It is tiring to only have BEB academic samples to compare with – I’m sure it’s hiding more variance / structure

    why is this a problem? dhaka gets ppl from all over. it probably undersamples chittagong but we know where they’ll land (all chittagong ppl are more e asian than even me!!!).

    I’m not sure I understand the UMAP clustering very well – is this different to a standard PCA of two variables?

    yeah it explores more dimensions so allows for better separation. you can’t take the distances literally though. the endogamous groups jump out

    they should have used a pca + umap

    1. I just want to understand Bangladeshi genetics better. Very fortunate to have such a large BEB academic sample, and the insights gained from it, but there’s been a distinct lack of further academic samples since.

      Given the existence of some structure in W Bengal at least in this paper, I’d want to explore if a similar pattern exists across the border, more than BEB suggests.

      Whether it’s regional variation, TB vs AA, class differences, religious etc.

      Whilst Dhaka may be a hub for Bdeshis from all over, we don’t really know anything about the sampling. And if it truly is representative of the diversity of Bangladesh, teasing out the differences in that group would have been insightful.

      I have a connection on 23andme with a Barua from Chittagong. Scores 24.5% East Asian on the most up-to-date ancestry composition. Clearly an outlier but interesting Bengali Barua sample. J2b2

  5. g chaubey has a project going on in bdesh. huge sample size.

    I have a connection on 23andme with a Barua from Chittagong. Scores 24.5% East Asian on the most up-to-date ancestry composition. Clearly an outlier but interesting Bengali Barua sample. J2b2

    chittagong bengalis always super high %.

    your point is well taken insofar as my parents are from comilla just to the southeast of dhaka, but they are very much on the edge (toward east asians) of the BEB distribution

  6. Razib, is it possible, in the context of socio-linguistics, that West Bengal groups are relatively recent speakers of Bengali? It would not surprise me if pre-1700s Bengal was closer culturally and ethnically to Bihar, but the foundation of Calcutta as a major trading hub on the edge of Bengali-speaking territory created a new commercial nucleus and administrative centre that quickly spread the Bengali language into otherwise Maithili- and Bhojpuri-speaking areas.

Comments are closed.