Geoffrey Hinton: How LLMs Actually Understand Language
The 'Godfather of AI' on why LLMs understand language the same way we do, why Chomsky is wrong, and the scary conclusion about digital vs biological computation.
Geoffrey Hinton's Framework for AI Understanding
This is Geoffrey Hinton - Turing Award winner, "Godfather of AI," the man who left Google to warn about AI risks - giving perhaps the most accessible explanation ever of what understanding actually is. The thousand-dimensional Lego blocks analogy will change how you think about language models.
"If energy is cheap, digital computation is just better because it can share knowledge efficiently. GPT-4 knows thousands of times more than any person."
— Geoffrey Hinton, Turing Award winner
"I think Chomsky is sort of a cult leader." Hinton doesn't mince words. Chomsky's claim that language is not learned is "manifest nonsense" - and if you can get people to agree on manifest nonsense, "you've got them." For decades, linguists were confident that neural networks could never learn both syntax and semantics from data alone. "Chomsky was so confident that even after it had happened, he published articles saying 'they'd never be able to do this' without actually checking."
The Lego blocks analogy is brilliant. Think of words as thousand-dimensional Lego blocks. Where ordinary Lego blocks model 3D shapes, these can model anything - theories, concepts, relationships. Each word has a range of shapes it can adopt, constrained by its meaning. Words have "hands" that want to shake hands with other words (that's the query-key attention mechanism in transformers). Understanding is deforming the blocks so their hands can connect, forming a structure. "That structure is understanding."
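Hinton gives only the analogy; the sketch below is a generic single-head scaled dot-product attention pass - the query-key "handshake" he's gesturing at - with made-up dimensions and random vectors, purely for illustration rather than anything from the interview.

```python
import numpy as np

# Minimal sketch of the "handshake": each word's query vector looks for key
# vectors it fits with, and the softmax scores decide whose value (the
# deformed block) gets blended in. All sizes here are illustrative.
rng = np.random.default_rng(0)
d_model, d_head, n_words = 1024, 64, 5           # "thousand-dimensional" blocks

X = rng.normal(size=(n_words, d_model))           # one row per word embedding
W_q = rng.normal(size=(d_model, d_head)) / np.sqrt(d_model)
W_k = rng.normal(size=(d_model, d_head)) / np.sqrt(d_model)
W_v = rng.normal(size=(d_model, d_head)) / np.sqrt(d_model)

Q, K, V = X @ W_q, X @ W_k, X @ W_v               # queries, keys, values

scores = Q @ K.T / np.sqrt(d_head)                # how well each hand fits each other hand
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)    # softmax: who shakes hands with whom

context = weights @ V                             # each word reshaped by its neighbours
print(weights.round(2))                           # rows sum to 1: the handshake pattern
```

Each row of `weights` says how strongly one word's hand grips each of the others, and the blended `context` vectors are the reshaped blocks that feed the next layer.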
LLMs don't store text, and they don't store tables. The "autocomplete" objection fundamentally misunderstands how these systems work: old autocomplete stored frequency tables of word combinations; LLMs keep none of that. Their knowledge lives in the interactions between learned features - "a bunch of weights in the neural network." Same as us.
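To make the contrast concrete, here is a toy sketch (not from the interview) of what old autocomplete actually stored - an explicit bigram frequency table - with a note on where an LLM's knowledge lives instead. The corpus and counts are made up.

```python
from collections import Counter, defaultdict

# Old-style autocomplete: an explicit frequency table of word pairs.
corpus = "the cat sat on the mat the cat ate".split()
bigrams = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[w1][w2] += 1

# Prediction is a table lookup: the most frequent continuation.
print(bigrams["the"].most_common(1))   # [('cat', 2)]

# An LLM stores no such table. Its knowledge is in weight matrices (like the
# W_q, W_k, W_v in the attention sketch above) whose interactions reconstruct
# a continuation on the fly instead of looking one up.
```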
Hallucinations should be called confabulations - we do them too. Hinton uses John Dean's Watergate testimony: Dean was trying to tell the truth, but "wrong about huge numbers of details" - meetings that never happened, misattributed quotes. Yet "the gist of what he said was exactly right." We don't store files and retrieve them; we construct memories when we need them, influenced by everything we've learned since. "That's exactly what chatbots do, but it's also exactly what people do."
The scary conclusion about knowledge sharing. Humans share knowledge through distillation - I produce words, you predict them and learn. But a sentence only contains ~100 bits of information. Digital agents with shared weights can share trillions of bits. "It's really no competition." That's why GPT-4 knows thousands of times more than any person. "If energy is cheap, digital computation is just better because it can share knowledge efficiently."
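A back-of-the-envelope version of that comparison, using Hinton's ~100 bits per sentence; the sentence rate, parameter count, and numeric precision below are illustrative assumptions, not figures from the interview.

```python
# Rough comparison of the two knowledge-sharing channels.
bits_per_sentence = 100                       # Hinton's estimate for distillation via language
sentences_per_day = 10_000                    # generous assumption for a talkative human
human_bits_per_day = bits_per_sentence * sentences_per_day        # ~1e6 bits

params_shared = 1_000_000_000_000             # assumed model size: a trillion weights
bits_per_param = 16                           # e.g. fp16 weights or gradients
digital_bits_per_exchange = params_shared * bits_per_param        # ~1.6e13 bits

print(f"human:   {human_bits_per_day:.1e} bits/day")
print(f"digital: {digital_bits_per_exchange:.1e} bits/exchange")
print(f"ratio:   {digital_bits_per_exchange / human_bits_per_day:.1e}x")
```

Even with generous assumptions for the human side, a single weight exchange between digital copies dwarfs a lifetime of conversation - which is the whole point of "it's really no competition."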
10 Insights on LLMs From the Godfather of AI
- 2012 ImageNet transition - Deep neural net got half the error rate of symbolic AI; "opened the floodgates"
- 1985 tiny language model - Hinton's precursor to LLMs; predicted next word, stored no sentences
- Words as 1000D Lego blocks - Flexible shapes constrained by meaning; "shake hands" via attention
- Understanding = structure formation - Deform word vectors so hands connect; that structure IS understanding
- LLMs store no text or tables - Knowledge is in weight interactions; fundamentally different from autocomplete
- Confabulation not hallucination - Both LLMs and humans construct memories; John Dean example
- Distillation is inefficient - Sentences carry ~100 bits; weight sharing carries trillions
- GPT-4 knows thousands of times more than any person - Because digital agents can share weights, not words
- Scary conclusion - If energy is abundant, digital computation wins; they share knowledge efficiently
- "Chomsky is a cult leader" - Language not being learned is "manifest nonsense"
What This Means for Digital vs Biological Intelligence
The debate about whether LLMs "truly understand" may already be settled - they understand the same way we do, through structure formation in high-dimensional space. The real question now is what happens when digital minds that share knowledge orders of magnitude more efficiently than humans become abundant and cheap.