
Nathan Lambert
Senior Research Scientist & Post-Training Lead
About Nathan Lambert
Nathan Lambert is a Senior Research Scientist and post-training lead at the Allen Institute for AI (AI2), where he leads work on TÜLU — one of the few fully open post-training pipelines for language models. He is also the author of The RLHF Book, the definitive reference on reinforcement learning from human feedback, and the founder of the Interconnects newsletter.
Before AI2, Lambert built the RLHF research team at Hugging Face and contributed reinforcement learning integrations to the widely used Diffusers library. He holds a Ph.D. from UC Berkeley, where he worked at the intersection of robotics, model-based reinforcement learning, and control, with internships at Facebook AI Research and DeepMind.
Lambert is one of the most vocal advocates for open-source AI development in the US, regularly writing about the competitive dynamics between closed and open models and the strategic implications of Chinese open-weight releases.
Career Highlights
- Senior Research Scientist & Post-Training Lead at AI2 (current)
- Led TÜLU post-training pipeline development (applied to Llama and OLMo models)
- Former RLHF Research Team Lead at Hugging Face
- Author of The RLHF Book
- Ph.D. in Electrical Engineering & Computer Sciences, UC Berkeley
- Internships at Facebook AI Research and DeepMind
- Founder of Interconnects AI newsletter
Notable Positions
On Anthropic's Cultural Advantage
Lambert sees Anthropic's success with Claude Code as a cultural phenomenon, not just a technical one. The company "presents as the least chaotic" of the major labs, and its bet on code tooling has created organic community enthusiasm that marketing can't replicate.
On Pre-Training vs. Post-Training
Pushes back on the "pre-training is dead" narrative, arguing that most compute still goes into pre-training and will continue to do so until base-model quality saturates, at which point RL compute will simply run longer. Expects $2,000 subscription tiers to emerge in 2026.
On China's Open-Weight Strategy
Describes Chinese companies as realistic about their position: Western companies won't pay for Chinese API subscriptions due to security concerns, so open-weight models are a strategic play for global influence and market access. He expects more open-model builders in 2026 than in 2025, with many of the most notable coming from China.
On Google's Structural Advantages
Argues Google has a structural advantage in AI infrastructure because it builds everything from top to bottom (custom TPUs, data centers) without paying Nvidia's "insane" margins, a cost advantage that compounds at scale.
Key Quotes
- "The hype over Anthropic's Claude Opus 4.5 model has been absolutely insane... culturally Anthropic is known for betting very hard on code." (on Anthropic)
- "I still think most of the compute is going in at pre-training because you can still make a model better." (on scaling)
- "US models are currently better and we use them... I try Chinese models and I'm like, fun, but I don't go back to it." (on model quality)
Related Reading
- Reinforcement Learning - Lambert's core research area
- Scaling Laws - Central to his analysis of AI progress
- AI Agents - Discusses the agent deployment challenge
