GDP val
/ˌdʒiː diː ˈpiː væl/
What is GDP val?
GDP val (written "GDPval" by OpenAI) is an AI evaluation benchmark introduced by OpenAI in September 2025 to measure how well AI models perform on economically valuable, real-world knowledge work tasks. The name derives from Gross Domestic Product (GDP), as the benchmark draws its tasks from occupations in the industries that contribute most to economic output.
Unlike traditional AI benchmarks that test abstract reasoning or standardized test performance (where frontier models already score at or above human level, leaving little room to show progress), GDP val focuses on practical professional deliverables.
Key Characteristics
Real Work Products: Tasks produce actual deliverables like legal briefs, engineering blueprints, customer support conversations, nursing plans, slides, spreadsheets, and multimedia.
Expert Evaluation: Experienced professionals from the relevant occupations compare AI outputs against human-produced work in blind review, without knowing which is which.
Comprehensive Scope: The full dataset includes 1,300+ specialized tasks across 44 occupations.
Context-Rich Tasks: Unlike simple prompts, GDP val tasks include reference files and context, mimicking real work scenarios.
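The blind comparisons described above reduce to a simple tally once each expert verdict is recorded. The sketch below is illustrative only and is not OpenAI's grading pipeline: the verdict labels ("model", "human", "tie"), the function name win_rates, and the sample data are all hypothetical, and how ties are counted is an assumption made here for clarity.

```python
from collections import Counter
from typing import Iterable

# Hypothetical labels for one blind pairwise comparison:
# the grader preferred the model's deliverable, the human's, or neither.
VERDICTS = {"model", "human", "tie"}


def win_rates(verdicts: Iterable[str]) -> dict[str, float]:
    """Summarize blind pairwise verdicts as fractions of all comparisons."""
    counts = Counter(verdicts)
    unknown = set(counts) - VERDICTS
    if unknown:
        raise ValueError(f"unexpected verdict labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no verdicts to score")
    return {
        # Strict wins: the grader preferred the model's output.
        "win": counts["model"] / total,
        # Wins plus ties: the model's output was judged at least as good.
        "win_or_tie": (counts["model"] + counts["tie"]) / total,
    }


# Made-up verdicts for ten comparisons (not real benchmark data).
sample = ["model", "human", "tie", "model", "human",
          "model", "tie", "human", "model", "human"]
rates = win_rates(sample)
print(f"win: {rates['win']:.0%}, win or tie: {rates['win_or_tie']:.0%}")
```

Keeping the two rates separate matters when reading a headline score: the same set of verdicts gives a noticeably higher figure when ties are counted in the model's favor.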
Why GDP val Matters
GDP val represents a shift in how AI progress is measured. Traditional IQ-style benchmarks have become saturated—frontier models already match or exceed top human performance on standardized tests. GDP val instead measures:
- Economic Impact: Direct connection to tasks that drive GDP
- Professional Competition: Head-to-head comparison with industry experts
- Practical Value: Real deliverables, not abstract problem-solving
As Wharton professor Ethan Mollick noted, GPT-5.2's 71% GDP val score means expert graders judged the model's work to beat or tie that of human experts 71% of the time on tasks requiring 4-8 hours of work.
Historical Context
OpenAI introduced GDP val in September 2025 and, notably, published results showing that Claude outperformed OpenAI's own best model at launch, a rare display of transparency about competitive positioning.
By December 2025, GPT-5.2 achieved 71% on GDP val, up from 39% for GPT-5.1, released just one month earlier. The 32-point gain demonstrates rapid progress in knowledge work capability.
Related Reading
- Ethan Mollick - Frequently analyzes GDP val implications
- Enterprise AI - The business applications GDP val measures
- Knowledge Work Disruption - The trend GDP val quantifies
Mentioned In

Paul Ritzer at 00:14:00
"GDP val basically measures how good AI is at real-world knowledge work tasks, spanning legal briefs, engineering blueprints, customer support, and nursing plans."
