AI Research

◆ Upcoming Conferences

NeurIPS

Dec 10–15, 2025

Vancouver, BC

Submissions closed

ICML

Jul 13–19, 2025

Vienna, Austria

◆ Upcoming

ICLR

May 7–11, 2026

Singapore

Submissions open

ACL

Aug 3–8, 2025

Vienna, Austria

Submissions closed

◆ Landmark Papers

◆ ◇ ◆

PAPERS THAT
Changed Everything

Research that permanently shifted how the field thinks about AI, explained clearly without losing the math that makes it important.

◆ arXiv 2025

Chain-of-Thought Reasoning Emerges from Scale: New Evidence from Sparse Models

Park, S. et al. · Google DeepMind

New evidence that chain-of-thought reasoning in large models is not merely pattern matching — it exhibits genuine compositional structure that persists under adversarial probing. Tested across sparse MoE architectures at 130B+ parameters.

Key takeaway: reasoning ability is qualitatively different from retrieval and cannot be fully explained by training data memorization.

◆ NeurIPS 2024

Constitutional AI: Harmlessness from AI Feedback — Revisited at 100B Scale

Bai, Y. et al. · Anthropic

The original Constitutional AI paper showed that models could learn to critique and revise their own outputs using human-written principles. This follow-up extends the method to models 40× larger and reveals important failure modes that don't appear at smaller scales.

Key takeaway: Constitutional AI does not scale linearly — larger models require increasingly specific principle hierarchies to maintain alignment.

◆ ICML 2023

Scaling Laws for Reward Model Overoptimization in RLHF

Gao, L. et al. · OpenAI

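Demonstrates empirically that as a policy optimizes against a learned reward model, the proxy reward keeps rising while the true ("gold") reward eventually falls. This is reward model overoptimization, closely related to reward hacking. The paper establishes quantitative relationships between overoptimization and the KL divergence from the initial policy, and discusses KL-based regularization strategies.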

Key takeaway: every RLHF training run has an optimal stopping point that is predictable from the reward model's initial quality.
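The stopping-point claim can be illustrated with a small sketch. It assumes the gold reward under RL follows R(d) = d(α − β log d), where d is the square root of the KL divergence between the optimized policy and the initial policy. That functional form is the one generally attributed to this paper; the coefficient values below are hypothetical and chosen only for illustration.

```python
import math


def gold_reward_rl(d: float, alpha: float, beta: float) -> float:
    """Assumed gold-reward curve R(d) = d * (alpha - beta * log d), with d = sqrt(KL)."""
    if d <= 0:
        return 0.0
    return d * (alpha - beta * math.log(d))


def optimal_kl_distance(alpha: float, beta: float) -> float:
    """Maximizer of the assumed curve.

    Setting dR/dd = alpha - beta * log(d) - beta = 0 gives d* = exp(alpha / beta - 1).
    """
    return math.exp(alpha / beta - 1.0)


# Hypothetical coefficients; in practice they would be fitted per reward model.
alpha, beta = 1.0, 0.5
d_star = optimal_kl_distance(alpha, beta)
peak_reward = gold_reward_rl(d_star, alpha, beta)
```

Under this assumed curve, training past `d_star` lowers gold reward, so the fitted coefficients alone determine where to stop.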

◆ Recent Papers Worth Reading

◆ ◇ ◆

THIS MONTH'S
Best Papers

◆ arXiv · June 2025

Mechanistic Interpretability of Multi-Head Attention in Frontier Models: A Circuit Analysis

Elhage, N. et al. · Anthropic

Uses activation patching and attention head ablation to identify circuits responsible for factual recall, indirect object identification, and basic arithmetic in Claude 3-class models. Finds 12 circuit classes that generalize across model scales.

Takeaway: interpretability is advancing faster than expected — specific behaviors can now be traced to identifiable model components.
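Activation patching, as named in the summary above, can be sketched on a toy network. The two-layer model, its random weights, and the helper names here are all hypothetical; this shows the general idea (copy one activation from a clean run into a corrupted run and measure how much of the output gap is recovered), not the paper's actual setup.

```python
import math
import random

random.seed(0)
N_IN, N_HID = 4, 3
# Hypothetical toy weights: input -> hidden (tanh), hidden -> scalar output.
W1 = [[random.gauss(0, 1) for _ in range(N_HID)] for _ in range(N_IN)]
W2 = [random.gauss(0, 1) for _ in range(N_HID)]


def hidden_acts(x):
    """Hidden-layer activations for input vector x."""
    return [math.tanh(sum(x[i] * W1[i][j] for i in range(N_IN))) for j in range(N_HID)]


def readout(hidden):
    """Linear readout from hidden activations to a scalar output."""
    return sum(h * w for h, w in zip(hidden, W2))


clean_x = [random.gauss(0, 1) for _ in range(N_IN)]
corrupt_x = [random.gauss(0, 1) for _ in range(N_IN)]
clean_h, corrupt_h = hidden_acts(clean_x), hidden_acts(corrupt_x)
clean_out, corrupt_out = readout(clean_h), readout(corrupt_h)


def patching_effect(unit):
    """Fraction of the clean-vs-corrupt output gap recovered by patching one unit."""
    patched_h = list(corrupt_h)
    patched_h[unit] = clean_h[unit]  # copy the clean activation into the corrupted run
    return (readout(patched_h) - corrupt_out) / (clean_out - corrupt_out)


effects = [patching_effect(u) for u in range(N_HID)]
```

Because the readout in this toy model is linear, the per-unit effects sum to 1. In a real transformer the effects are not additive, which is part of why circuit analysis is hard.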

◆ arXiv · June 2025

Llama 3 Technical Report: Architecture, Training, and Evaluation at 405B

Meta AI Research Team

The full technical report for Llama 3's flagship 405B-parameter model. Covers training data curation at 15T tokens, hardware infrastructure (up to 16K H100s), instruction tuning methodology, safety measures, and comprehensive benchmark comparisons.

Takeaway: open-source models have caught up with frontier commercial models on most academic benchmarks; the remaining gap is in deployment scale and RLHF quality.

◆ Google DeepMind · May 2025

AlphaFold 3 Extends to RNA, DNA, and Small Molecules: A Unified Structural Biology Model

Abramson, J. et al. · Google DeepMind

AlphaFold 3 extends the original protein structure prediction breakthrough to a much broader range of biomolecules, including DNA, RNA, and small-molecule ligands. It generates structures with a diffusion-based module in place of AlphaFold 2's structure module and relies less heavily on MSA processing, achieving state-of-the-art accuracy across protein-DNA, protein-RNA, and protein-ligand interactions within a single model.

Takeaway: this effectively creates a single AI model for all of structural biology — with implications for drug discovery that won't be fully understood for years.

◆ NeurIPS 2023

Direct Preference Optimization: Your Language Model Is Secretly a Reward Model

Rafailov, R. et al. · Stanford

Proposes DPO as a simpler alternative to RLHF — instead of training a separate reward model and running PPO, DPO directly optimizes language model outputs using preference data. Achieves comparable alignment results at significantly lower computational cost.

Takeaway: DPO has already become the default alignment method for open-source models. Understanding it is now a prerequisite for LLM practitioners.
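A minimal sketch of the DPO objective for a single preference pair may make the summary concrete. It assumes you already have summed log-probabilities of the chosen and rejected responses under the policy and under a frozen reference model; the function name, argument names, and the default `beta` are illustrative choices, not taken from any particular library.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair: -log(sigmoid(margin))."""
    # Log-ratios of policy to reference act as implicit rewards.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# When the policy equals the reference, the margin is 0 and the loss is log(2).
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)
# Raising the chosen response's log-probability relative to the reference lowers the loss.
improved = dpo_loss(-4.0, -7.0, -5.0, -7.0)
```

No separate reward model appears anywhere: the policy-to-reference log-ratio plays that role, which is the point of the paper's subtitle.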

◆ OpenAI · April 2025

Learning to Reason with LLMs: The OpenAI o3 Technical Report

OpenAI Research Team

The technical document behind OpenAI's o3 model — a system that performs extended chain-of-thought reasoning before answering. Covers the training methodology combining supervised learning with reinforcement learning on verifiable problems, and the compute scaling behavior of extended reasoning.

Takeaway: scaling inference compute (not just training compute) is a new and potentially more efficient axis for capability improvements.

◆ ICLR 2025

Speculative Decoding Does Not Require a Draft Model: Token-Level Tree Attention

Chen, C. et al. · UC Berkeley

Introduces a draft-model-free approach to speculative decoding that uses tree-structured attention to generate multiple candidate token continuations simultaneously. Achieves 3–5× inference speedup without the overhead of maintaining and running a separate draft model.

Takeaway: inference efficiency has become as important as training efficiency — this technique is likely to appear in production LLM deployments within months.
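The tree-structured attention idea can be sketched as a mask-building step. This is the generic construction used to verify several candidate continuations in one forward pass, where each candidate token attends only to itself and its ancestors in the tree; it is an assumption that the paper's method resembles this, and the `parents` encoding is a hypothetical convenience.

```python
def tree_attention_mask(parents):
    """Build a boolean attention mask over candidate tokens arranged in a tree.

    parents[i] is the index of node i's parent, or -1 for a root candidate.
    Nodes must be listed so that every parent appears before its children.
    mask[i][j] is True when token i may attend to token j.
    """
    n = len(parents)
    mask = [[False] * n for _ in range(n)]
    for i, p in enumerate(parents):
        mask[i][i] = True  # every token attends to itself
        if p >= 0:
            # Inherit everything the parent can see (its ancestors and itself).
            for j in range(n):
                if mask[p][j]:
                    mask[i][j] = True
    return mask


# Candidate tree: token 0 has children 1 and 2; token 1 has child 3.
parents = [-1, 0, 0, 1]
mask = tree_attention_mask(parents)
```

Here token 3 sees tokens 0, 1, and 3 but not its sibling branch (token 2), so both branches can be scored in a single batched pass without interfering with each other.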

◆ Lab Intelligence

◆ ◇ ◆

TRACK THE
Labs

OpenAI

◆ 3 papers this month

Current focus: inference-time compute scaling, GPT-5 architecture research, and Sora video generation improvements.

Anthropic

◆ 5 papers this month

Current focus: mechanistic interpretability, Constitutional AI refinements, and multi-agent safety frameworks.

Google DeepMind

◆ 8 papers this month

Current focus: Gemini architecture, robotics, AlphaFold extensions, and multimodal reasoning.

Meta AI

◆ 6 papers this month

Current focus: Llama 4 architecture, FAIR research in video understanding, and open-source tooling.
