What's Producible May Not Be Reachable: Measuring the Steerability of Generative Models
Keyon Vafa, Sarah Bentley, Jon Kleinberg, Sendhil Mullainathan
Stop evaluating generative tools solely on galleries of high-quality outputs. Measure whether your users can actually reach their intended outputs through your interface controls. If your model scores well on FID (where lower is better) but users can't steer it, you've built a slot machine, not a tool.
Generative models produce impressive outputs in demos, but users with specific goals can't reliably steer them to produce what they actually need.
Method: The paper introduces a mathematical decomposition separating producibility (what a model can generate) from steerability (whether users can actually reach desired outputs through the available controls). It formalizes steerability as the probability that a user can navigate the control space to achieve their goal, independent of the model's raw generation capabilities. This reframes evaluation from 'what can this model do?' to 'can I make it do what I need?'
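To make the distinction concrete, here is a minimal toy sketch in Python. It is not the authors' benchmark or formalism: the two one-dimensional "models", the simulated user (`feedback_user`), the uniform goal distribution, the tolerance, and the attempt budget are all hypothetical choices made for illustration. Producibility is approximated by a grid search over the control space, and steerability by the fraction of goals a simulated user reaches within a fixed number of attempts.

```python
import random

# Hypothetical toy "models": each maps a control in [0, 1] to an output in [0, 1).
def smooth_model(control, rng):
    # Output tracks the control directly, so feedback is informative.
    return control

def scrambled_model(control, rng):
    # Covers the same output range, but nearby controls give unrelated outputs.
    return (control * 9973.0) % 1.0

# Hypothetical simulated user: nudges the control by the last observed error.
def feedback_user(goal, history, rng):
    if not history:
        return 0.5  # first attempt: start in the middle of the control range
    last_control, last_output = history[-1]
    proposal = last_control + (goal - last_output)
    return min(1.0, max(0.0, proposal))  # keep the control inside [0, 1]

def producibility(model, goals, tol, grid=2001):
    """Fraction of goals that SOME control on a dense grid can produce."""
    rng = random.Random(0)
    outputs = [model(k / (grid - 1), rng) for k in range(grid)]
    hits = sum(any(abs(o - g) <= tol for o in outputs) for g in goals)
    return hits / len(goals)

def steerability(model, goals, user, tol, budget=10, seed=0):
    """Fraction of goals the simulated user reaches within `budget` attempts."""
    rng = random.Random(seed)
    successes = 0
    for goal in goals:
        history = []  # (control, output) pairs the user has seen so far
        for _ in range(budget):
            control = user(goal, history, rng)
            output = model(control, rng)
            history.append((control, output))
            if abs(output - goal) <= tol:
                successes += 1
                break
    return successes / len(goals)

# Assumed goal distribution: uniform over the output range.
goal_rng = random.Random(42)
GOALS = [goal_rng.random() for _ in range(200)]
TOL = 0.02  # assumed tolerance for "close enough to the goal"

if __name__ == "__main__":
    for name, model in [("smooth", smooth_model), ("scrambled", scrambled_model)]:
        p = producibility(model, GOALS, TOL)
        s = steerability(model, GOALS, feedback_user, TOL)
        print(f"{name}: producibility={p:.2f} steerability={s:.2f}")
```

In this toy setup both models can produce every goal, but only the smooth one lets the simulated user reach every goal within the budget. A real evaluation would replace the simulated user with actual people and the scalar distance with a task-appropriate similarity measure.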
Caveats: The framework is model-agnostic but requires defining a goal space and a control interface, which is non-trivial for open-ended creative tasks.
Reflections: How do different interface paradigms (sliders vs. text prompts vs. examples) affect steerability for the same underlying model? · Can we predict steerability from model architecture before deployment? · What's the minimum control dimensionality needed for acceptable steerability in different creative domains?