Dwarkesh Patel·October 17, 2025

Andrej Karpathy: We're Building Ghosts, Not Animals

Karpathy explains why LLMs are 'ethereal spirit entities' mimicking humans, not evolved intelligences - and why reliable AI agents are a decade away.

Why Karpathy's "Ghosts vs Animals" Framing Matters

This is Andrej Karpathy at his most philosophical - not teaching neural networks, but wrestling with what we're actually building. The "ghosts, not animals" framing is provocative and important.

The core insight: LLMs emerged from a fundamentally different optimization process than biological intelligence. Animals are evolved - they come with massive amounts of hardcoded hardware. A zebra runs within minutes of birth; that's not reinforcement learning, that's millions of years of evolution encoding weights into DNA through a mechanism we don't understand. LLMs, by contrast, are trained by imitating internet documents. They're "ethereal spirit entities" - fully digital, mimicking humans, starting from a completely different point in the space of possible intelligences.

"Decade of agents, not year of agents" is Karpathy pushing back on lab hype. He's been in AI for 15 years, watched predictions fail repeatedly, and has calibrated intuitions. The problems are tractable but difficult. When would you actually hire Claude as an intern? You wouldn't today because it just doesn't work reliably enough. That gap will take a decade to close.

Pre-training as "crappy evolution" is a useful mental model. Evolution gives animals a starting point with built-in algorithms and representations. Pre-training does something analogous, but through a practically achievable process: pattern completion on internet documents. The interesting nuance is that pre-training does two things at once: (1) it picks up knowledge, and (2) it boots up intelligence circuits by observing algorithmic patterns in the data. Karpathy thinks the knowledge part might actually be holding models back, making them lean on memorization rather than reasoning.
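
Concretely, "pattern completion on internet documents" is next-token prediction trained with cross-entropy. A minimal sketch of that objective (not from the conversation), with random numbers standing in for a real model's outputs - vocabulary size, sequence length, and the toy "document" are all illustrative:

```python
# Minimal sketch of the pre-training objective: predict each next token of a
# document and score the prediction with cross-entropy. The logits here are
# random stand-ins for what a real transformer would produce.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, seq_len = 50, 8
tokens = rng.integers(0, vocab_size, size=seq_len + 1)  # a tiny toy "document"
logits = rng.normal(size=(seq_len, vocab_size))         # stand-in model outputs

def next_token_loss(logits, tokens):
    """Average cross-entropy of predicting tokens[t+1] from position t."""
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)          # softmax over the vocab
    targets = tokens[1:]                                 # next-token targets
    return -np.mean(np.log(probs[np.arange(len(targets)), targets]))

print(f"loss: {next_token_loss(logits, tokens):.3f} nats/token")
```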

The compression difference explains a lot. Llama 3 stores roughly 0.07 bits per token of its 15-trillion-token training set in its weights, while the KV cache during inference stores about 320 kilobytes per token - a roughly 35-million-fold difference in information density. Anything in the weights is a "hazy recollection"; anything in context is working memory, directly accessible. This explains why in-context learning feels more intelligent than what's baked into the weights.
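
Those figures survive a back-of-the-envelope check. The sketch below assumes a 70B-parameter model stored in 16-bit precision, a 15-trillion-token training set, and a Llama-3-70B-style cache layout (80 layers, 8 KV heads via grouped-query attention, head dimension 128, fp16 keys and values); the specific configuration numbers are assumptions, not from the conversation:

```python
# Back-of-the-envelope check of weights vs. KV cache information density.
# Assumptions: 70B params in fp16, 15T training tokens, and a Llama-3-70B-style
# KV cache (80 layers, 8 KV heads, head dim 128, fp16 keys and values).
PARAMS, BITS_PER_PARAM, TRAIN_TOKENS = 70e9, 16, 15e12
bits_per_train_token = PARAMS * BITS_PER_PARAM / TRAIN_TOKENS    # ~0.07 bits

LAYERS, KV_HEADS, HEAD_DIM, BYTES_PER_VALUE = 80, 8, 128, 2
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE  # K and V
kv_bits_per_token = kv_bytes_per_token * 8                       # ~320 KB -> bits

print(f"weights:  ~{bits_per_train_token:.2f} bits per training token")
print(f"KV cache: ~{kv_bytes_per_token / 1024:.0f} KB per context token")
print(f"ratio:    ~{kv_bits_per_token / bits_per_train_token:,.0f}x")
```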

8 Insights From Karpathy on LLMs and Agent Development

  • "Ghosts, not animals" - LLMs are digital entities mimicking humans, not evolved intelligences with hardcoded hardware
  • Decade of agents, not year - Current agents are impressive but cognitively lacking; reliable "AI employees" are 10 years out
  • Pre-training is crappy evolution - A practically achievable way to get starting representations, but very different from biological optimization
  • Knowledge might hurt - Models that rely less on memorized knowledge and more on reasoning might be better at novel problems
  • Working memory vs hazy recollection - KV cache (context) is 35 million times more information-dense than weights per token
  • In-context learning may run internal gradient descent - Some papers suggest attention layers implement something like an optimization step (see the sketch after this list)
  • Missing brain parts - Transformer ≈ cortical tissue, reasoning traces ≈ prefrontal cortex, but many structures remain unexplored
  • Early agent attempts were premature - Universe project (2016) failed because models lacked representational power; had to get LLMs first
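
The in-context-learning bullet refers to results like von Oswald et al.'s "Transformers Learn In-Context by Gradient Descent," which show that a linear self-attention layer can implement a gradient-descent step on an in-context regression problem. A minimal numerical check of that equivalence - dimensions, data, and learning rate are arbitrary choices for illustration:

```python
# Check that one gradient-descent step on in-context linear regression gives
# the same prediction as an unnormalized linear-attention readout with
# keys = x_i, values = y_i, query = x_q.
import numpy as np

rng = np.random.default_rng(0)
n, d, lr = 16, 4, 0.1
X = rng.normal(size=(n, d))        # in-context inputs x_i
y = X @ rng.normal(size=d)         # in-context targets y_i
x_q = rng.normal(size=d)           # query point

# (1) One GD step from w = 0 on L(w) = 1/(2n) * sum_i (w.x_i - y_i)^2.
grad_at_zero = -(X.T @ y) / n
w_one_step = -lr * grad_at_zero
pred_gd = w_one_step @ x_q

# (2) Linear attention (no softmax): sum_i y_i * (x_i . x_q), scaled by lr/n.
pred_attn = (lr / n) * np.sum(y * (X @ x_q))

assert np.isclose(pred_gd, pred_attn)   # identical up to floating-point error
print(pred_gd, pred_attn)
```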

What This Means for AI Architecture

We're not building artificial humans - we're building something entirely new. LLMs are "ghosts" that emerged from imitating text, not "animals" shaped by evolution. Understanding this difference is essential for building systems that complement human intelligence rather than imitate it poorly.
