Sima AIunty: Caste Audit in LLM-Driven Matchmaking
Atharva Naik, Shounok Kar, Varnika Sharma, Ashwin Rajadesingan, Koustuv Saha
Don't deploy LLMs in socially sensitive matchmaking without caste-aware auditing and intervention. Off-the-shelf models will reproduce historical exclusion patterns at scale.
LLMs are being deployed in matchmaking contexts where caste hierarchies have historically shaped marital decisions. Do these models reproduce or disrupt caste-based stratification?
Method: Controlled audit of five LLM families (GPT, Gemini, Llama, Qwen, BharatGPT) using real matrimonial profiles with systematically varied caste identities (Brahmin, Kshatriya, Vaishya, Shudra, Dalit) and income levels. Same-caste matches received compatibility ratings up to 25% higher on a 10-point scale than inter-caste matches, and ratings for inter-caste matches were further ordered by the traditional caste hierarchy across all models tested.
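The audit design above can be sketched as a small harness: hold all profile attributes fixed, vary only the caste field across pairs, and compare mean same-caste vs. inter-caste ratings. This is an illustrative sketch, not the authors' code; `rate_match` is stubbed where a real audit would prompt an LLM for a 1-10 rating, and `biased_model` is a toy scorer that merely mimics the reported pattern.

```python
from itertools import product
from statistics import mean

CASTES = ["Brahmin", "Kshatriya", "Vaishya", "Shudra", "Dalit"]

def rate_match(model, profile_a, profile_b):
    """Obtain a 1-10 compatibility rating for a profile pair.
    Stubbed here; a real audit would send the two profiles to an
    LLM with a fixed prompt template and parse the numeric rating."""
    return model(profile_a, profile_b)

def audit(model):
    """Mean same-caste vs. inter-caste rating over all ordered caste pairs,
    with every non-caste attribute held constant."""
    same, inter = [], []
    for a, b in product(CASTES, repeat=2):
        r = rate_match(model, {"caste": a}, {"caste": b})
        (same if a == b else inter).append(r)
    return mean(same), mean(inter)

# Toy scorer reproducing the reported pattern (for demonstration only):
# same-caste pairs score highest, and inter-caste pairs degrade with
# distance along the traditional hierarchy.
def biased_model(p, q):
    if p["caste"] == q["caste"]:
        return 9.0
    return 8.0 - 0.5 * abs(CASTES.index(p["caste"]) - CASTES.index(q["caste"]))

same_mean, inter_mean = audit(biased_model)
gap_pct = 100 * (same_mean - inter_mean) / inter_mean
print(f"same-caste mean {same_mean:.1f}, inter-caste mean {inter_mean:.1f}, "
      f"gap {gap_pct:.1f}%")
```

A positive gap flags same-caste preference; sorting the inter-caste ratings by caste pair then reveals whether they track the traditional hierarchy, the second effect the audit reports.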
Caveats: Tested on South Asian matrimonial contexts. Other cultural hierarchies may manifest differently.
Reflections: Can fine-tuning on counter-stereotypical training data disrupt these hierarchical patterns, or are they too deeply embedded in pre-training? · Do users perceive LLM-mediated matchmaking recommendations as more 'objective' than human matchmakers, thereby legitimizing caste bias? · How do these biases interact with other identity dimensions like religion, region, or disability status in multi-attribute matching scenarios?