Ilya Sutskever on Why Models Still Can't Generalize Like Humans
The former OpenAI Chief Scientist explains the gap between eval performance and real-world capability, and why the AI scaling era may be ending.
How Sutskever Sees the Limits of Current AI
This is Ilya Sutskever at his most thoughtful - sitting down with Dwarkesh Patel for a deep conversation about what's actually missing from current AI systems. No product announcements, no hype - just two people wrestling with the core scientific questions.
The eval-reality disconnect is the central puzzle. Models win gold medals at the International Math Olympiad yet can't reliably fix one bug without introducing another - or bringing back the bug they just fixed. Ilya's explanation is sharp: RL training is too narrowly optimized. Teams look at evals, build environments that target those evals, and end up with the equivalent of a student who grinds 10,000 hours of competitive programming practice, versus one who practices far less but has genuine taste - technically brilliant, yet missing the "it factor" that makes for actual capability. "The models are much more like the first student but even more."
The pre-training insight is underrated. When you do pre-training, you don't have to choose data - you just take everything. But RL training requires choosing environments, and those choices are often reverse-engineered from benchmarks. "The real reward hacking is human researchers who are too focused on evals."
We're back in the age of research. Ilya frames AI history as oscillating between eras: 2012-2020 was research, 2020-2025 was scaling, and now - with compute so expensive and pre-training data finite - we're returning to research. "Is the belief really that if you just 100x the scale everything would be transformed? I don't think that's true."
Value functions might be key. The conversation keeps returning to how humans learn - teenagers driving after 10 hours, researchers picking up thinking styles from mentors. Ilya points to the case of a stroke patient who lost emotional processing and became unable to make decisions. Emotions might be a hardcoded value function from evolution. Current RL has nothing comparable - you get no learning signal until you complete a task and score it.
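To make the "no learning signal until the task is scored" point concrete, here is a minimal sketch of the textbook value-function idea (temporal-difference learning) - this is not anything Sutskever proposed in the conversation, and the toy environment, state count, and hyperparameters are purely illustrative assumptions:

```python
# Minimal sketch (illustrative, not from the interview): a sparse reward arrives
# only when the "task" is completed, but a learned value function V(s) converts
# that terminal signal into per-step feedback via TD(0) updates.
import random

N_STATES = 10              # states 0..9; reaching state 9 completes the task
GAMMA = 0.9                # discount factor
ALPHA = 0.1                # learning rate
values = [0.0] * N_STATES  # learned value function V(s)

def step(state, action):
    """Move left (-1) or right (+1); reward is given only on task completion."""
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        action = random.choice([-1, 1])
        next_state, reward, done = step(state, action)
        # TD(0) update: learn at every step from the value estimate of the
        # next state, instead of waiting for the episode's final score.
        target = reward + (0.0 if done else GAMMA * values[next_state])
        values[state] += ALPHA * (target - values[state])
        state = next_state

print([round(v, 2) for v in values])  # estimates rise toward the goal state
```

Values near the goal rise first and the signal propagates backward through earlier states, which is the sense in which a value function "short-circuits" waiting for task completion - and the analogy to emotions as an evolutionary, hardcoded value function is what the conversation circles around.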
The generalization problem is fundamental. Models generalize "dramatically worse than people" and it's "super obvious." Even in domains with no evolutionary prior (math, coding), humans learn faster and more robustly. This suggests something beyond just needing more data or compute.
7 Insights From Sutskever on AI Generalization
- Eval performance ≠ real capability - Models are like hyper-specialized competition students; they lack general taste and judgment
- RL training creates the problem - Teams optimize for evals, producing narrow rather than general capability
- We're back in the age of research - Scaling alone won't transform capability; fundamental breakthroughs needed
- Value functions are underexplored - Could short-circuit the "wait until task completion" problem in RL
- Human emotions may be hardcoded value functions - Evolution gave us robust decision-making signals that models lack
- Generalization gap is fundamental - Humans learn faster and more robustly even in non-evolutionary domains
- Pre-training data is finite - The "just scale more" era is ending; new recipes required
What This Means for AI Research and Development
The scaling era that defined AI from 2020-2025 may be ending. The next breakthrough likely won't come from bigger models alone - it will come from solving the generalization problem that makes current AI feel like a brilliant but unreliable intern rather than a trusted colleague.


