World Models
wurld MOD-els
Definition
World models are AI systems that learn to simulate and predict how the physical world works - including spatial dynamics, intuitive physics, and cause-effect relationships that can't be learned from text alone.
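A minimal sketch of the core idea, assuming a simplified setup where states and actions are fixed-size vectors: the model is trained to predict the next state given the current state and an action. The `WorldModel` class, dimensions, and training snippet below are illustrative, not any particular system's implementation.

```python
# Minimal sketch of a learned world model: given the current state and an
# action, predict the next state. All names and sizes here are illustrative.
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.dynamics = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Predict the next state from the current state and action.
        return self.dynamics(torch.cat([state, action], dim=-1))

# Training on logged transitions (state, action, next_state) amounts to
# regressing predicted next states onto observed ones.
model = WorldModel(state_dim=8, action_dim=2)
state, action, next_state = torch.randn(32, 8), torch.randn(32, 2), torch.randn(32, 8)
loss = nn.functional.mse_loss(model(state, action), next_state)
loss.backward()
```

Real systems typically learn this predictor in a latent space from pixels rather than on raw state vectors, but the predictive objective is the same.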
Why It Matters
Current language models learn from text, which captures a lot about the world but misses embodied knowledge - how objects fall, how forces interact, how space works. World models aim to fill this gap.
Key Concepts
Beyond Language
"Language is richer than we thought, but spatial dynamics, intuitive physics, and sensorimotor experience can't be captured in text." — Demis Hassabis
Genie + SIMA
Google DeepMind's approach: drop AI agents (SIMA) into AI-generated worlds (Genie) and let them interact, creating an effectively unlimited supply of training environments.
"The two AIs are kind of interacting in the minds of each other."
Physics Accuracy
Generated videos may look realistic, but they aren't physically accurate enough for robotics. True world models need to predict physical outcomes correctly, not just plausibly.
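One way to make "physically accurate" concrete, sketched under the assumption that a learned predictor can be compared against a known physical law: roll both forward and measure how quickly the prediction error grows. `predict_next` is a hypothetical learned model with a deliberately wrong gravity constant standing in for the systematic errors video models make.

```python
# Sketch of checking physical accuracy against free fall.
G = 9.81  # m/s^2

def true_step(y, v, dt=0.05):
    # Ground-truth free-fall update (explicit Euler).
    return y + v * dt, v - G * dt

def predict_next(y, v, dt=0.05):
    # Hypothetical learned model with a slightly wrong gravity constant.
    return y + v * dt, v - 9.3 * dt

y_true = y_pred = 30.0
v_true = v_pred = 0.0
for _ in range(40):  # 2 seconds of rollout
    y_true, v_true = true_step(y_true, y_true * 0 + v_true)
    y_pred, v_pred = predict_next(y_pred, v_pred)

print(f"height error after 2s of rollout: {abs(y_true - y_pred):.3f} m")
```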
Applications
- Robotics: Agents need intuitive physics to navigate real environments
- Planning: Understanding cause and effect enables better long-term reasoning (see the sketch after this list)
- Simulation: Training in simulated worlds before deploying in reality
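As an example of the planning point above, here is a random-shooting sketch, assuming a hypothetical learned `model_step` and task `reward`: candidate action sequences are imagined inside the world model, scored, and only the best first action is executed.

```python
# Sketch of model-based planning with a learned world model (random shooting).
import random

def reward(state):
    # Hypothetical task reward: prefer states close to a goal position of 1.0.
    return -abs(state - 1.0)

def model_step(state, action):
    # Hypothetical learned dynamics: the action nudges the state.
    return state + 0.1 * action

def plan(state, horizon=5, candidates=64):
    best_score, best_plan = float("-inf"), None
    for _ in range(candidates):
        actions = [random.uniform(-1, 1) for _ in range(horizon)]
        s, score = state, 0.0
        for a in actions:                 # imagine the rollout inside the model
            s = model_step(s, a)
            score += reward(s)
        if score > best_score:
            best_score, best_plan = score, actions
    return best_plan[0]                   # execute only the first action

first_action = plan(state=0.0)
```

Executing only the first action and replanning at every step is the standard model-predictive-control pattern; the world model does all the "trial and error" in imagination.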
Current Limitations
- Video generation looks realistic but doesn't reliably obey physics
- Models lack a grounded understanding of spatial relationships
- Online learning (continuing to learn after deployment) is still missing
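For the last point, a rough sketch of what online learning could look like, assuming a simple linear dynamics predictor: the model keeps taking small gradient steps on every new transition it observes after deployment instead of freezing its weights. All names and the toy environment below are illustrative.

```python
# Sketch of online (post-deployment) updates to a world model.
import numpy as np

class OnlineDynamics:
    """Linear next-state predictor updated by streaming gradient steps."""

    def __init__(self, state_dim: int, action_dim: int, lr: float = 0.01):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state, action):
        return self.W @ np.concatenate([state, action])

    def update(self, state, action, next_state):
        # One SGD step on squared prediction error for this single transition.
        x = np.concatenate([state, action])
        error = self.predict(state, action) - next_state
        self.W -= self.lr * np.outer(error, x)
        return float(np.dot(error, error))

model = OnlineDynamics(state_dim=4, action_dim=2)
for _ in range(1000):  # stream of transitions observed after deployment
    s, a = np.random.randn(4), np.random.randn(2)
    s_next = s + 0.1 * np.concatenate([a, a])  # toy environment dynamics
    model.update(s, a, s_next)
```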
Related Terms
- Jagged Intelligence - The problem world models may help solve
- Embodied AI - AI systems that interact with the physical world

