One of the more interesting and definite aspects of David Reich’s Who We Are and How We Got Here is its treatment of caste. In short, it looks like most Indian jatis have been genetically endogamous for ~2,000 years, and varna groups exhibit some consistent genetic differences.
This is relevant because it makes the social constructionist view rather untenable. The genetic distinctiveness of jati groups is very hard to deny; it jumps out of the data. The assertions about varna are fuzzier. But on the whole, Brahmins across South Asia have the most ancestry from ancient “steppe” groups, while Dalits across South Asia have the least. Kshatriyas are closer to Brahmins. Vaisyas have lower fractions of “steppe”. And so on. These varna generalizations aren’t as clear and distinct as jati endogamy. Sudras from Punjab may have as much or more “steppe” than South Indian Brahmins. But the coarse patterns are striking.
As a geneticist, and as an irreligious atheist, a lot of the conversations about “caste” are irrelevant to me. They’re semantic.
First, you can tell me that true Hinduism doesn’t have caste, that it was “invented” by Westerners. Hinduism may not have had caste, but the genetic data are clear that South Asians were endogamous for 2,000 years to an extreme degree. Additionally, the classical caste hierarchy seems to correlate with particular ancestry fractions.
Second, you can say Islam, Sikhism, Jainism, and Buddhism don’t have caste. That they picked it up from Hinduism. Or Indian culture. That’s true. But I think Islam, Sikhism, Jainism, and Buddhism are all made up, just like Hinduism. I don’t care if made up ideologies don’t have caste in their made up religious system. I am curious about the revealed patterns genetically.
I have a pretty big data set of South Asians. Some of them are from the 1000 Genomes. Here is where the 1000 Genomes South Asians were collected:
Gujarati Indians from Houston, Texas
Punjabi from Lahore, Pakistan
Bengali from Dhaka, Bangladesh
Sri Lankan Tamil from the UK
Indian Telugu from the UK
Some of the groups showed a lot of genetic variation, so I split them based on how much “Ancestral North Indian” (ANI) they had. So Gujurati_ANI_1 has more ANI than Gujurati_ANI_2 and so forth.
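To make the splitting concrete, here is a minimal sketch of one way to do it. It is not the actual procedure behind these labels: the function name, the per-sample ANI fractions, and the cutoffs are all hypothetical, and it assumes you already have an ANI estimate for each individual from some admixture analysis.

```python
from typing import Dict, List


def split_by_ani(
    population: str,
    ani_fraction: Dict[str, float],
    cutoffs: List[float],
) -> Dict[str, List[str]]:
    """Assign samples to labeled subgroups; subgroup 1 has the most ANI.

    `ani_fraction` maps a sample ID to its estimated ANI fraction.
    `cutoffs` are hypothetical thresholds; k cutoffs give up to k + 1 groups.
    """
    ordered_cutoffs = sorted(cutoffs, reverse=True)
    groups: Dict[str, List[str]] = {}
    for sample, fraction in ani_fraction.items():
        # Count how many cutoffs this sample falls below; 0 means the top group.
        below = sum(1 for cutoff in ordered_cutoffs if fraction < cutoff)
        label = f"{population}_ANI_{below + 1}"
        groups.setdefault(label, []).append(sample)
    return groups


# Made-up fractions, purely for illustration.
example = {"s1": 0.70, "s2": 0.45, "s3": 0.62, "s4": 0.30}
result = split_by_ani("Gujurati", example, cutoffs=[0.6, 0.4])
```

Fixed cutoffs, unlike equal-sized bins, allow a subgroup to end up with only a handful of samples.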
Here is a tree showing pairwise genetic distances between the groups:

The positioning of some groups near each other is an artifact. Dai from China is an outgroup, as are Iranians and Baloch. So all are pushed near each other.
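For readers who want a feel for what a “pairwise genetic distance” between two groups can look like, below is a rough sketch using Hudson’s Fst estimator computed from allele frequencies. This is an assumption on my part for illustration; the tree above may well have been built from a different statistic. The function name and inputs are hypothetical: `p1` and `p2` are per-SNP sample allele frequencies in the two groups, and `n1` and `n2` are the numbers of sampled allele copies.

```python
from typing import Sequence


def hudson_fst(
    p1: Sequence[float],
    p2: Sequence[float],
    n1: int,
    n2: int,
) -> float:
    """Hudson's Fst between two groups, as a ratio of sums across SNPs."""
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (
            (a - b) ** 2
            - a * (1 - a) / (n1 - 1)
            - b * (1 - b) / (n2 - 1)
        )
        # Probability that one allele drawn from each group differs.
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0.0:
        # No SNP is variable across the two groups; treat as no distance.
        return 0.0
    return numerator / denominator


# Two groups fixed for opposite alleles at every SNP are maximally distant.
fixed_diff = hudson_fst([1.0, 0.0], [0.0, 1.0], n1=10, n2=10)
```

Computing this for every pair of groups gives a distance matrix, which a tree-building method such as neighbor joining can then turn into a tree. Estimates for very similar groups can come out slightly negative because of the sample-size correction.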
Treemix with 3 migrations shows similar patterns:

Now let’s do a PCA:

Click the link above and you’ll see Bangladeshis are all shifted toward Dai. The Iranians are at the bottom, but nearest to them are the Baloch. Then the Pathan. Then Punjabi_ANI_1.
Let’s zoom in on the South Asian groups.

Do you notice something about the Bangladeshis? They don’t have much South Asian ancestry variation. Their variation is all due to East Asian ancestry, which seems to have a west to east gradient (I’m way on the right edge, and my family is from the eastern edge of eastern Bengal).
This is not the case for Punjabis.

As you can see, the Punjabis sampled in Lahore range from almost Pathan to almost South Indian. This totally shocked me. This is a huge range of variation.
Compare to Gujus:

I’m pretty sure there are only a few Gujurati_ANI_4 because the sampling occurred in Houston, TX (Indian gov. stingy about genetic testing/sampling, so usually done in Diaspora). Notice that Gujus, mostly Hindu, have the same genetic variation as Punjabis!
Now let’s compare Dalits to Brahmins.

To my surprise, Chamars from UP are quite like South Indian Dalits (there is some “steppe” admixture you can detect in Chamars, so they’re not identical). South Indian Brahmins have some local admixture.
What’s my point here?
In some of the comments, there was talk about how Bengali Muslims have their own form of caste. This seems plausible, though I wouldn’t know personally. I wasn’t raised in Bangladesh. But these data make it clear that there’s almost no caste-like structure in Bangladesh genetically. The variation is almost all due to the mixture with East Asian-like people, and that’s almost certainly due to geography (West Bengalis have this, but to a lesser degree, and people from eastern Bangladesh have more than people from western Bangladesh).
In contrast, when you look at the 1000 Genomes Telugu or Tamil sample, structure jumps out. First, there are a small number of Brahmins. But there is also a large group which is clearly scheduled caste. Looking at Gujuratis, they are very diverse. The Patels probably anchor the Sudra/middle-class component dominant in the region.
Punjab looks like the Indian groups, not Bangladesh. I have no idea about Pakistani caste or class dynamics, but the genetics makes it look like the social structure of Hindus. In contrast, Bangladesh looks like a non-South Asian population, with most of the variation being due to geography and proximity to a very different group (Tibeto-Burmans).
I don’t think Bengalis are more punctilious Muslims than Punjabis. I think the social landscape of Bengal emerged out of a frontier expansion which destabilized the default Indian caste structures that undergirded most societies. In Pakistan, Islamicization didn’t perturb the underlying Indianness of Punjabis.