
Scaling Laws

Pronunciation

SKAY-ling lawz

Definition

Scaling laws describe the empirical relationship between model performance and three key variables: model size (parameters), dataset size, and compute budget. The famous insight: performance improves predictably as you scale these factors.
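
A common concrete form (in the spirit of the Kaplan et al. 2020 and Chinchilla-style fits) models loss as a sum of power-law terms in parameter count and training tokens. The sketch below is illustrative only: the constants are placeholder values echoing the published Chinchilla fit, not figures taken from this entry.

    # Illustrative power-law scaling law: estimated loss falls predictably
    # as parameter count (N) and training tokens (D) grow.
    # Constants are placeholders for exposition, not authoritative fits.
    def scaling_loss(params: float, tokens: float,
                     E: float = 1.69, A: float = 406.4, B: float = 410.7,
                     alpha: float = 0.34, beta: float = 0.28) -> float:
        """Estimated loss: E + A / N**alpha + B / D**beta."""
        return E + A / params**alpha + B / tokens**beta

    # Scaling either axis lowers the estimated loss by a predictable amount.
    print(scaling_loss(7e9, 1.4e12))    # ~7B parameters, ~1.4T tokens
    print(scaling_loss(70e9, 1.4e12))   # 10x the parameters, same data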

The Scaling Era (2020-2025)

From GPT-3 to GPT-4, the dominant strategy was simple: make everything bigger.

  • More parameters
  • More training data
  • More compute

This worked remarkably well, leading to dramatic capability improvements with each generation.

Signs of Diminishing Returns

Leading researchers are now questioning whether scaling alone can keep delivering these gains:

"Is the belief really that if you just 100x the scale everything would be transformed? I don't think that's true." — Ilya Sutskever

"There's a lot of room between exponential and asymptotic." — Demis Hassabis

The New Formula

Demis Hassabis describes DeepMind's approach:

"We operate on 50% scaling, 50% innovation. Both are required for AGI."

What's Changing

  1. Pre-training data is finite - we're running out of high-quality text
  2. Returns aren't exponential - improvements are incremental, not revolutionary (see the sketch after this list)
  3. Research matters again - breakthroughs require innovation, not just resources
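
This is also why "100x the scale" no longer sounds transformative: under a power-law fit, each additional 10x of parameters buys a smaller absolute drop in loss. A rough sketch, reusing the illustrative scaling_loss function from the definition above:

    # Diminishing returns: each extra 10x of parameters (data held fixed)
    # removes a smaller slice of loss under the illustrative power law above.
    for n in [1e9, 1e10, 1e11, 1e12]:
        before = scaling_loss(n, 1.4e12)
        after = scaling_loss(10 * n, 1.4e12)
        print(f"{n:.0e} -> {10 * n:.0e} params: "
              f"loss {before:.3f} -> {after:.3f} (gain {before - after:.3f})")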

The Eras of AI

Ilya Sutskever's framing:

  • 2012-2020: Research era (deep learning breakthroughs)
  • 2020-2025: Scaling era (bigger is better)
  • 2025+: Return to research (new paradigms needed)

Mentioned In

"Is the belief really that if you just 100x the scale everything would be transformed? I don't think that's true."
Ilya Sutskever at 00:15:00

"There's a lot of room between exponential and asymptotic. We operate on 50% scaling, 50% innovation."
Demis Hassabis at 00:28:00
