Reasoning Models

Also known as: reasoning model, thinking model, inference-time compute, test-time compute

technical advanced

What are Reasoning Models?

Reasoning models are a class of large language models designed to spend more computation at inference time on complex problems. Rather than committing to an answer immediately, these models first generate an extended chain of thought: exploring approaches, checking their work, backtracking when they hit dead ends, and iterating toward a solution. OpenAI’s o1 and o3, Anthropic’s Claude with extended thinking, Google’s Gemini with thinking, and DeepSeek-R1 are prominent examples. They represent a shift in where compute is invested: traditional scaling focuses on pre-training (bigger models, more data), while reasoning models also scale inference-time compute.

How They Differ from Standard Models

Standard LLMs generate responses token by token, with roughly constant computation per token, and usually begin answering right away. Reasoning models use the same per-token computation but generate many additional reasoning tokens before answering, so total computation varies with problem difficulty: seconds on simple questions, minutes on hard ones. They are typically trained with reinforcement learning that rewards chains of thought leading to correct answers. The internal reasoning may be visible to users (as with Claude’s extended thinking) or hidden. On complex mathematics, science, and coding tasks, reasoning models often substantially outperform standard models of equivalent size, sometimes matching the performance of much larger conventional models.
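The idea of rewarding chains of thought that lead to correct answers can be illustrated with a toy outcome-based reward function. This is only a minimal sketch, not any vendor's actual training recipe: the "Answer:" marker convention, the function names, and the example traces are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_final_answer(trace: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent.

    The 'Answer:' convention is hypothetical; real systems define their own format.
    """
    matches = re.findall(r"Answer:\s*(.+)", trace)
    return matches[-1].strip() if matches else None


def outcome_reward(trace: str, reference: str) -> float:
    """Score a reasoning trace by its final answer only: 1.0 if correct, else 0.0."""
    answer = extract_final_answer(trace)
    return 1.0 if answer is not None and answer == reference.strip() else 0.0


# Three made-up traces for the question "What is 17 * 3 + 4?"
traces = [
    "17 * 3 = 51, plus 4 is 55.\nAnswer: 55",
    "17 * 3 = 41... wait, that is wrong. 17 * 3 = 51, plus 4 is 55.\nAnswer: 55",
    "17 * 3 = 41, plus 4 is 45.\nAnswer: 45",
]
rewards = [outcome_reward(t, "55") for t in traces]
```

Note that the second trace, which catches and corrects its own mistake, earns the same reward as the first. A reward based only on the outcome does not penalize backtracking, which is one plausible reason such behavior can emerge during training.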

Why Reasoning Models Matter

Reasoning models have demonstrated that scaling inference-time compute can yield gains comparable to scaling pre-training compute on some tasks, opening a second dimension of improvement for AI capabilities. For practitioners, this means harder problems become tractable: multi-step coding challenges, graduate-level science questions, and complex planning tasks that stump standard models. The tradeoff is cost and latency: reasoning models generate more tokens and take longer to respond. Choosing between a standard model and a reasoning model is therefore a practical architectural decision. Use standard models for straightforward tasks where speed and cost matter, and reasoning models for complex tasks where accuracy is worth the additional compute.
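The selection rule above can be sketched as a simple router plus a rough cost estimate. Everything here is hypothetical: the `Task` fields, the 30-second latency threshold, the token counts, and the price are placeholders rather than figures from any real provider.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    needs_multistep_reasoning: bool  # hypothetical flag set by the caller
    latency_budget_s: float          # how long the caller can wait


def choose_model(task: Task, reasoning_latency_s: float = 30.0) -> str:
    """Pick 'reasoning' only when the task is hard AND the caller can afford the wait.

    The 30-second default is an assumed typical latency, not a measured value.
    """
    if task.needs_multistep_reasoning and task.latency_budget_s >= reasoning_latency_s:
        return "reasoning"
    return "standard"


def estimated_cost(answer_tokens: int, thinking_tokens: int,
                   price_per_1k_tokens: float) -> float:
    """Rough output cost, assuming reasoning tokens are billed like answer tokens."""
    return (answer_tokens + thinking_tokens) / 1000 * price_per_1k_tokens


quick = Task("Summarize this paragraph", False, 2.0)
hard = Task("Find the race condition in this scheduler", True, 120.0)
choices = (choose_model(quick), choose_model(hard))

# Same 500-token answer, with and without 4,500 assumed reasoning tokens.
standard_cost = estimated_cost(500, 0, price_per_1k_tokens=0.01)
reasoning_cost = estimated_cost(500, 4500, price_per_1k_tokens=0.01)
```

In this made-up scenario the reasoning call produces ten times as many tokens as the standard call for the same visible answer, which is the kind of gap that makes routing worth considering. In practice the difficulty flag would come from a heuristic, a classifier, or the caller.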