Towards Human-AI Complementarity in Matching Tasks
Adrian Arnaiz-Rodriguez, Nina Corvelo Benz, Suhas Thejaswi, Nuria Oliver, Manuel Gomez-Rodriguez
Stop deploying matching algorithms as black-box recommendations. Build interfaces that expose algorithmic uncertainty and route edge cases to human judgment. Best suited to high-stakes domains such as foster care placement or organ allocation, where context matters.
Algorithmic matching systems in healthcare and social services often fail to improve human decisions: humans using AI can perform worse than either the human or the algorithm alone.
Method: The researchers built confidence-aware interfaces that show when the algorithm is uncertain. The system uses a complementarity score, a measure of whether human-AI collaboration beats the best solo performance, and surfaces the cases where human judgment adds value. In their healthcare matching experiments, the interface flags low-confidence predictions and lets humans override them with domain knowledge the algorithm lacks.
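The routing logic described above can be sketched in a few lines. This is an illustrative reading of the method, not the paper's exact formulation: `complementarity_score` and `route`, along with the 0.8 threshold, are hypothetical names and values chosen for the example.

```python
import numpy as np

def complementarity_score(human_correct, ai_correct, team_correct):
    """How much the human-AI team beats the better solo decision-maker.
    Inputs are boolean arrays over the same set of cases.
    (Illustrative definition; the paper's score may differ.)"""
    solo_best = max(human_correct.mean(), ai_correct.mean())
    return team_correct.mean() - solo_best

def route(confidences, threshold=0.8):
    """Route each case: defer to the AI when it is confident,
    otherwise surface the case for human judgment."""
    return np.where(confidences >= threshold, "ai", "human")

# Toy example: the mid-confidence case is routed to the human.
conf = np.array([0.95, 0.60, 0.85])
print(route(conf))
```

A positive complementarity score on held-out cases is the signal that the collaboration, under this routing policy, is worth deploying at all.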
Caveats: Requires ground-truth data to calibrate confidence scores. Won't work in domains where you can't measure prediction certainty.
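The calibration caveat can be made concrete with a standard check: compare stated confidence against observed accuracy on labeled data. Below is a minimal expected-calibration-error sketch, assuming you have ground-truth outcomes; it illustrates the caveat, not the authors' procedure.

```python
import numpy as np

def calibration_gap(confidences, correct, n_bins=10):
    """Expected calibration error: the size-weighted average of
    |mean confidence - accuracy| over equal-width confidence bins.
    Large gaps mean the scores used for routing can't be trusted."""
    bins = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    gap = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap += mask.mean() * abs(confidences[mask].mean() - correct[mask].mean())
    return gap
```

Without labeled outcomes to feed a check like this, the confidence display has no grounding, which is exactly the limitation the caveat names.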
Reflections: How do you design confidence displays that don't overwhelm users with uncertainty information? · What's the optimal threshold for routing decisions to humans vs. algorithms? · Can complementarity scores generalize across different matching domains?