Stanford AI Lab Papers and Talks at NeurIPS 2025

December 1, 2025

The Neural Information Processing Systems conference (NeurIPS) 2025 is being hosted in San Diego from December 2nd to 7th. We’re excited to share all the work from SAIL that’s being presented, and you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!

List of Accepted Papers

MCP Explorer: Interactive Learning Experience

Authors: Jiayu He, Sherry Ruan, James Landay
Contact: sruan@cs.stanford.edu
Workshop: NeurIPS Educational Content for the AI Education Resource Showcase (Oral Presentation)
Links: Website
Keywords: model context protocol (mcp), ai assistants, interactive learning, responsible ai, tool use

Preference Learning with Response Time

Authors: Ayush Sawarni, Sahasrajit Sarmasarkar, Vasilis Syrgkanis
Contact: ayushsaw@stanford.edu
Workshop: Main Conference
Links: Video
Keywords: rlhf, preference learning, orthogonal statistics

Procurement Auctions with Predictions: Improved Frugality for Facility Location

Authors: Eric Balkanski, Nicholas DeFilippis, Vasilis Gkatzelis, Xizhi Tan
Contact: xizhi@stanford.edu
Workshop: Main Conference
Keywords: frugality, mechanism design, procurement auction

Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models

Authors: Wentse Chen, Jiayu Chen, Fahim Tajwar, Hao Zhu, Xintong Duan, Ruslan Salakhutdinov, Jeff Schneider
Contact: zhuhao@stanford.edu
Workshop: Main Conference
Keywords: reinforcement learning, large language models, self-evolving agents

A Practical Guide for Incorporating Symmetry in Diffusion Policy

Authors: Dian Wang, Boce Hu, Shuran Song, Robin Walters, Robert Platt
Contact: dianwang@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: robotic manipulation, equivariance, diffusion model

Agentic Bridge Framework: Closing the Gap Between Agentic Capability and Performance Benchmarks

Authors: Yun Du, Rubens Lacouture, Qizheng Zhang, Genghan Zhang, Tian Zhao, Kunle Olukotun
Contact: yundu27@stanford.edu
Workshop: Workshop
Links: Paper | Website
Keywords: agents, llms, benchmarking, gaia benchmark, ml systems, multi-agent systems, system optimizations, agentic workflows, agentic ai, trace-level telemetry

Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents

Authors: Qizheng Zhang, Michael Wornow, Kunle Olukotun
Contact: qizhengz@stanford.edu
Workshop: Main Conference
Links: Paper | Video
Keywords: caching, memory, serving, llm agents

CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Authors: Anjiang Wei, Tarun Suresh, Jiannan Cao, Naveen Kannan, Yuheng Wu, Kai Yan, Thiago S. F. X. Teixeira, Ke Wang, Alex Aiken
Contact: anjiang@cs.stanford.edu
Workshop: Workshop
Links: Paper | Website
Keywords: agent, large language model, reasoning, code, program synthesis

Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation

Authors: Bailey Trang, Parham Saremi, Alan Wang, Fangrui Huang, Zahra TehraniNasab, Amar Kumar, Tal Arbel, Fei-Fei Li, Ehsan Adeli
Contact: eadeli@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: generative models, diffusion model, diversity, gflownet

DynaGuide: Steering Diffusion Polices with Active Dynamic Guidance

Authors: Maximilian Du, Shuran Song
Contact: maxjdu@stanford.edu
Workshop: Main Conference
Links: Paper | Video | Website
Keywords: robots, steering behaviors, imitation learning

Exploring Diffusion Transformer Designs via Grafting

Authors: Keshigeyan Chandrasegaran, Michael Poli, Daniel Y. Fu, Dongjun Kim, Lea M. Hadzic, Manling Li, Agrim Gupta, Stefano Massaroli, Azalia Mirhoseini, Juan Carlos Niebles, Stefano Ermon, Li Fei-Fei
Contact: keshik@stanford.edu
Workshop: Main Conference
Award nominations: Oral
Links: Paper | Blog Post | Website
Keywords: diffusion transformers, model grafting, architectural editing, hybrid model architectures

Fantastic Bugs and Where to Find Them in AI Benchmarks

Authors: Sang Truong, Yuheng Tu, Michael Hardy, Anka Reuel, Zeyu Tang, Jirayu Burapacheep, Jonathan Perera, Chibuike Uwakwe, Ben Domingue, Nick Haber, Sanmi Koyejo
Contact: sttruong@cs.stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: benchmark, evaluation, measurement theory

From Programs to Poses: Factored Real-World Scene Generation via Learned Program Libraries

Authors: Joy Hsu, Emily Jin, Jiajun Wu, Niloy J. Mitra
Contact: joycj@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: factorization, library learning, real-world scene generation

HouseLayout3D Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild

Authors: Valentin Bieri, Marie-Julie Rakotosaona, Keisuke Tateno, Francis Engelmann, Leonidas Guibas
Contact: engelmann@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Video | Website
Keywords: 3d scene understanding, 3d scene generation, cadification

In-Context Learning Strategies Emerge Rationally

Authors: Daniel Wurgaft, Ekdeep Singh Lubana, Core Francisco Park, Hidenori Tanaka, Gautam Reddy, Noah D. Goodman
Contact: wurgaft@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: in-context learning, loss-complexity tradeoff, bayesian modeling, algorithmic complexity

Joint Design of Protein Surface and Structure Using a Diffusion Bridge Model

Authors: Guanlue Li, Xufeng Zhao, Fang Wu, Sören Laue
Contact: fangwu97@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: protein design, diffusion model

LLM-Guided Autoscheduling for Large-Scale Sparse Machine Learning

Authors: Rubens Lacouture, Genghan Zhang, Konstantin Hossfeld, Tian Zhao, Kunle Olukotun
Contact: rubensl@stanford.edu
Workshop: Workshop
Links: Paper
Keywords: sparse machine learning, compiler optimization, autoscheduling

Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution

Authors: Zhanyi Sun, Shuran Song
Contact: zhanyis@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: out-of-distribution generalization, imitation learning, robotic manipulation

On the Entropy Calibration of Language Models

Authors: Steven Cao, Gregory Valiant, Percy Liang
Contact: shcao@stanford.edu
Workshop: Main Conference
Links: Paper | Video
Keywords: language models, language generation, calibration, entropy, error accumulation, scaling laws, language model theory, rl theory

SATBench: Benchmarking LLMs’ Logical Reasoning via Automated Puzzle Generation from SAT Formulas

Authors: Anjiang Wei, Yuheng Wu, Yingjia Wan, Tarun Suresh, Huanmi Tan, Zhanke Zhou, Sanmi Koyejo, Ke Wang, Alex Aiken
Contact: anjiang@cs.stanford.edu
Workshop: Workshop
Links: Paper | Website
Keywords: reasoning, sat solving, benchmark

SWE-smith: Scaling Data for Software Engineering Agents

Authors: John Yang, Kilian Lieret, Carlos E. Jimenez, Alexander Wettig, Kabir Khandpur, Yanzhe Zhang, Binyuan Hui, Ofir Press, Ludwig Schmidt, Diyi Yang
Contact: johnby@stanford.edu
Workshop: Main Conference
Award nominations: Spotlight
Links: Paper | Blog Post | Video | Website
Keywords: software engineering, language models, swe-bench, swe-agent

Authors: Xianzhe Fan, Xuhui Zhou, Chuanyang Jin, Kolby Nottingham, Hao Zhu, Maarten Sap
Contact: zhuhao@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: theory of mind, embodied ai, vision-language models

VIPScene: Video Perception Models for 3D Scene Synthesis

Authors: Rui Huang, Guangyao Zhai, Zuria Bauer, Marc Pollefeys, Federico Tombari, Leonidas Guibas, Gao Huang, Francis Engelmann
Contact: engelmann@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Video | Website
Keywords: 3d scene generation, video models

Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time

Authors: Daniel D. Richman, Jessica Karaguesian, Carl-Mikael Suomivuori, Ron O. Dror
Contact: ddrichma@stanford.edu, jkara@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: protein structure, diffusion

We look forward to seeing you at NeurIPS 2025!
