The Neural Information Processing Systems conference (NeurIPS) 2025 is being hosted in San Diego from December 2nd to 7th. We’re excited to share all the work from SAIL that’s being presented, and you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!

List of Accepted Papers

MCP Explorer: Interactive Learning Experience

Authors: Jiayu He, Sherry Ruan, James Landay
Contact: sruan@cs.stanford.edu
Workshop: NeurIPS Educational Content for the AI Education Resource Showcase (Oral Presentation)
Links: Website
Keywords: model context protocol (mcp), ai assistants, interactive learning, responsible ai, tool use


Preference Learning with Response Time

Authors: Ayush Sawarni, Sahasrajit Sarmasarkar, Vasilis Syrgkanis
Contact: ayushsaw@stanford.edu
Workshop: Main Conference
Links: Video
Keywords: rlhf, preference learning, orthogonal statistics


Procurement Auctions with Predictions: Improved Frugality for Facility Location

Authors: Eric Balkanski, Nicholas DeFilippis, Vasilis Gkatzelis, Xizhi Tan
Contact: xizhi@stanford.edu
Workshop: Main Conference
Keywords: frugality, mechanism design, procurement auction


Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models

Authors: Wentse Chen, Jiayu Chen, Fahim Tajwar, Hao Zhu, Xintong Duan, Ruslan Salakhutdinov, Jeff Schneider
Contact: zhuhao@stanford.edu
Workshop: Main Conference
Keywords: reinforcement learning, large language models, self-evolving agents


A Practical Guide for Incorporating Symmetry in Diffusion Policy

Authors: Dian Wang, Boce Hu, Shuran Song, Robin Walters, Robert Platt
Contact: dianwang@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: robotic manipulation, equivariance, diffusion model


Agentic Bridge Framework: Closing the Gap Between Agentic Capability and Performance Benchmarks

Authors: Yun Du, Rubens Lacouture, Qizheng Zhang, Genghan Zhang, Tian Zhao, Kunle Olukotun
Contact: yundu27@stanford.edu
Workshop: Workshop
Links: Paper | Website
Keywords: agents, llms, benchmarking, gaia benchmark, ml systems, multi-agent systems, system optimizations, agentic workflows, agentic ai, trace-level telemetry


Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents

Authors: Qizheng Zhang, Michael Wornow, Kunle Olukotun
Contact: qizhengz@stanford.edu
Workshop: Main Conference
Links: Paper | Video
Keywords: caching, memory, serving, llm agents


CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Authors: Anjiang Wei, Tarun Suresh, Jiannan Cao, Naveen Kannan, Yuheng Wu, Kai Yan, Thiago S. F. X. Teixeira, Ke Wang, Alex Aiken
Contact: anjiang@cs.stanford.edu
Workshop: Workshop
Links: Paper | Website
Keywords: agent, large language model, reasoning, code, program synthesis


Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation

Authors: Bailey Trang, Parham Saremi, Alan Wang, Fangrui Huang, Zahra TehraniNasab, Amar Kumar, Tal Arbel, Fei-Fei Li, Ehsan Adeli
Contact: eadeli@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: generative models, diffusion model, diversity, gflownet


DynaGuide: Steering Diffusion Policies with Active Dynamic Guidance

Authors: Maximilian Du, Shuran Song
Contact: maxjdu@stanford.edu
Workshop: Main Conference
Links: Paper | Video | Website
Keywords: robots, steering behaviors, imitation learning


Exploring Diffusion Transformer Designs via Grafting

Authors: Keshigeyan Chandrasegaran, Michael Poli, Daniel Y. Fu, Dongjun Kim, Lea M. Hadzic, Manling Li, Agrim Gupta, Stefano Massaroli, Azalia Mirhoseini, Juan Carlos Niebles, Stefano Ermon, Li Fei-Fei
Contact: keshik@stanford.edu
Workshop: Main Conference
Award nominations: Oral
Links: Paper | Blog Post | Website
Keywords: diffusion transformers, model grafting, architectural editing, hybrid model architectures


Fantastic Bugs and Where to Find Them in AI Benchmarks

Authors: Sang Truong, Yuheng Tu, Michael Hardy, Anka Reuel, Zeyu Tang, Jirayu Burapacheep, Jonathan Perera, Chibuike Uwakwe, Ben Domingue, Nick Haber, Sanmi Koyejo
Contact: sttruong@cs.stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: benchmark, evaluation, measurement theory


From Programs to Poses: Factored Real-World Scene Generation via Learned Program Libraries

Authors: Joy Hsu, Emily Jin, Jiajun Wu, Niloy J. Mitra
Contact: joycj@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: factorization, library learning, real-world scene generation


HouseLayout3D Benchmark and Training-Free Baseline for 3D Layout Estimation in the Wild

Authors: Valentin Bieri, Marie-Julie Rakotosaona, Keisuke Tateno, Francis Engelmann, Leonidas Guibas
Contact: engelmann@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Video | Website
Keywords: 3d scene understanding, 3d scene generation, cadification


In-Context Learning Strategies Emerge Rationally

Authors: Daniel Wurgaft, Ekdeep Singh Lubana, Core Francisco Park, Hidenori Tanaka, Gautam Reddy, Noah D. Goodman
Contact: wurgaft@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: in-context learning, loss-complexity tradeoff, bayesian modeling, algorithmic complexity


Joint Design of Protein Surface and Structure Using a Diffusion Bridge Model

Authors: Guanlue Li, Xufeng Zhao, Fang Wu, Sören Laue
Contact: fangwu97@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: protein design, diffusion model


LLM-Guided Autoscheduling for Large-Scale Sparse Machine Learning

Authors: Rubens Lacouture, Genghan Zhang, Konstantin Hossfeld, Tian Zhao, Kunle Olukotun
Contact: rubensl@stanford.edu
Workshop: Workshop
Links: Paper
Keywords: sparse machine learning, compiler optimization, autoscheduling


Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution

Authors: Zhanyi Sun, Shuran Song
Contact: zhanyis@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: out-of-distribution generalization, imitation learning, robotic manipulation


On the Entropy Calibration of Language Models

Authors: Steven Cao, Gregory Valiant, Percy Liang
Contact: shcao@stanford.edu
Workshop: Main Conference
Links: Paper | Video
Keywords: language models, language generation, calibration, entropy, error accumulation, scaling laws, language model theory, rl theory


SATBench: Benchmarking LLMs’ Logical Reasoning via Automated Puzzle Generation from SAT Formulas

Authors: Anjiang Wei, Yuheng Wu, Yingjia Wan, Tarun Suresh, Huanmi Tan, Zhanke Zhou, Sanmi Koyejo, Ke Wang, Alex Aiken
Contact: anjiang@cs.stanford.edu
Workshop: Workshop
Links: Paper | Website
Keywords: reasoning, sat solving, benchmark


SWE-smith: Scaling Data for Software Engineering Agents

Authors: John Yang, Kilian Lieret, Carlos E. Jimenez, Alexander Wettig, Kabir Khandpur, Yanzhe Zhang, Binyuan Hui, Ofir Press, Ludwig Schmidt, Diyi Yang
Contact: johnby@stanford.edu
Workshop: Main Conference
Award nominations: Spotlight
Links: Paper | Blog Post | Video | Website
Keywords: software engineering, language models, swe-bench, swe-agent


SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions

Authors: Xianzhe Fan, Xuhui Zhou, Chuanyang Jin, Kolby Nottingham, Hao Zhu, Maarten Sap
Contact: zhuhao@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: theory of mind, embodied ai, vision-language models


VIPScene: Video Perception Models for 3D Scene Synthesis

Authors: Rui Huang, Guangyao Zhai, Zuria Bauer, Marc Pollefeys, Federico Tombari, Leonidas Guibas, Gao Huang, Francis Engelmann
Contact: engelmann@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Video | Website
Keywords: 3d scene generation, video models


Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time

Authors: Daniel D. Richman, Jessica Karaguesian, Carl-Mikael Suomivuori, Ron O. Dror
Contact: ddrichma@stanford.edu, jkara@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: protein structure, diffusion


We look forward to seeing you at NeurIPS 2025!