Your Model Is Unfair, Are You Even Aware? Inverse Relationship Between Comprehension and Trust in Explainability Visualizations of Biased ML Models
Zhanna Kaufman, Madeline Endres, Cindy Xiong Bearfield, Yuriy Brun
Audit your ML dashboards for false confidence. If stakeholders trust your model but can't explain its bias metrics back to you, your visualization is backfiring. Test comprehension before deployment, not just usability.
Users trust biased ML models more when explainability visualizations are harder to understand. The better people actually comprehend the visualization, and the bias it reveals, the less they trust the system.
Method: Researchers tested five visualization types (bar charts, confusion matrices, decision trees, feature importance plots, and SHAP plots) across 180 participants. The inverse relationship held across all formats: participants who scored higher on comprehension tests showed significantly lower trust scores. SHAP plots, despite being the most detailed, produced the highest trust among users who couldn't actually interpret them correctly.
Caveats: The study focused on classification tasks with demographic bias; it doesn't address regression models or other bias types such as sampling bias.
Reflections: Does the inverse relationship persist when users have domain expertise in ML fairness? · Can hybrid visualizations balance comprehension and trust without sacrificing either? · How does repeated exposure to biased model outputs affect the comprehension-trust relationship over time?