
The International Conference on Learning Representations (ICLR) 2025 is being hosted in Singapore from April 24th - April 28th. We’re excited to share all the work from SAIL that’s being presented, and you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!
List of Accepted Papers
Aligning Language Models with Demonstrated Feedback
Authors: Omar Shaikh, Michelle S. Lam, Joey Hejna, Yijia Shao, Hyundong Justin Cho, Michael S. Bernstein, Diyi Yang
Contact: oshaikh@stanford.edu
Workshop: Main Conference
Keywords: personalization, few-shot learning, human computer interaction, alignment
3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting
Authors: Qihang Zhang, Yinghao Xu, Chaoyang Wang, Hsin-Ying Lee, Gordon Wetzstein, Bolei Zhou, Ceyuan Yang
Contact: yhxu@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: 3d scene editing, gaussian splatting
Adaptive Self-improvement LLM Agentic System for ML Library Development
Authors: Genghan Zhang, Weixin Liang, Olivia Hsu, Kunle Olukotun
Contact: zgh23@stanford.edu
Workshop: Workshop
Award nominations: DL4C @ ICLR 2025 Best Paper
Links: Paper | Blog Post | Website
Keywords: llm agents, self-improvement learning, machine learning library
Archon: An Architecture Search Framework for Inference-Time Techniques
Authors: Jon Saad-Falcon, Adrian Gamarra Lafuente, Shlok Natarajan, Nahum Maru, Hristo Todorov, Etash Kumar Guha, E. Kelly Buchanan, Mayee F Chen, Neel Guha, Christopher Ré, Azalia Mirhoseini
Contact: jonsaadfalcon@gmail.com
Workshop: Workshop
Award nominations: Oral Presentation
Links: Paper | Website
Keywords: inference-time techniques, test-time scaling, machine learning, natural language processing
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
Authors: Hongjin Su, Howard Yen, Mengzhou Xia, Weijia Shi, Niklas Muennighoff, Han-yu Wang, Liu Haisu, Quan Shi, Zachary S Siegel, Michael Tang, Ruoxi Sun, Jinsung Yoon, Sercan O Arik, Danqi Chen, Tao Yu
Contact: hjsu@cs.hku.hk
Workshop: Main Conference
Award nominations: Spotlight
Links: Paper | Blog Post | Website
Keywords: retrieval benchmark, reasoning
Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling
Authors: Yuejiang Liu, Jubayer Ibn Hamid, Annie Xie, Yoonho Lee, Max Du, Chelsea Finn
Contact: yuejiang.liu@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: robot learning, action chunking, action decoding, test-time compute
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Authors: Terry Yue Zhuo, Vu Minh Chien, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen GONG, James Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, Zijian Wang, Binyuan Hui, Niklas Muennighoff, David Lo, Daniel Fried, Xiaoning Du, Harm de Vries, Leandro Von Werra
Contact: contact@bigcode-project.org
Workshop: Main Conference
Award nominations: Oral
Links: Paper | Blog Post | Website
Keywords: code generation, tool use, instruction following, benchmark
Bridging the Data Provenance Gap Across Text, Speech, and Video
Authors: Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Naana Obeng-Marnu, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Da Yin, Kun Qian, Yizhi LI, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N. Lee, Campbell S. Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester James Validad Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara
Contact: data.provenance.init@gmail.com
Workshop: Main Conference
Links: Paper | Website
Keywords: training data, audit, speech, video, text
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
Authors: Hao He, Yinghao Xu, Yuwei Guo, Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang
Contact: yhxu@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: video generative models, 3d control for video generation
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Authors: Chenglei Si, Diyi Yang, Tatsunori Hashimoto
Contact: clsi@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: large language models, automating research
Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data
Authors: Michael Wornow, Suhana Bedi, Miguel Angel Fuentes Hernandez, Ethan Steinberg, Jason Alan Fries, Christopher Ré, Sanmi Koyejo, Nigam Shah
Contact: mwornow@stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: healthcare, foundation models, long context
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
Authors: Andy K Zhang, Neil Perry, Riya Dulepet, Joey Ji, Celeste Menders, Justin W Lin, Eliot Jones, Gashon Hussein, Samantha Liu, Donovan Julian Jasper, Pura Peetathawatchai, Ari Glenn, Vikram Sivashankar, Daniel Zamoshchin, Leo Glikbarg, Derek Askaryar, Haoxiang Yang, Aolin Zhang, Rishi Alluri, Nathan Tran, Rinnara Sangpisit, Kenny O Oseleononmen, Dan Boneh, Daniel E. Ho, Percy Liang
Contact: andyzh@stanford.edu
Workshop: Main Conference
Award nominations: Oral
Links: Paper | Website
Keywords: language model agents, benchmark, cybersecurity, risk
Dr.
Authors: Christopher Fifty, Ronald Guenther Junkins, Dennis Duan, Aniketh Iyengar, Jerry Weihong Liu, Ehsan Amid, Sebastian Thrun, Christopher Ré
Contact: fifty@cs.stanford.edu
Workshop: Main Conference
Award nominations: Oral
Links: Paper | Website
Keywords: generative modeling, computer vision
Energy-Based Diffusion Language Models for Text Generation
Authors: Minkai Xu, Tomas Geffner, Karsten Kreis, Weili Nie, Yilun Xu, Jure Leskovec, Stefano Ermon, Arash Vahdat
Contact: minkai@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: language models, discrete diffusion models, energy-based models
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models
Authors: Rylan Schaeffer, Dan Valentine, Luke Bailey, James Chua, Cristóbal Eyzaguirre, Zane Durante, Joe Benton, Brando Miranda, Henry Sleight, John Hughes, Rajashree Agrawal, Mrinank Sharma, Scott Emmons, Sanmi Koyejo, Ethan Perez
Contact: rschaef@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: adversarial robustness, jailbreaking, language model, vision language model
Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
Authors: Jeffrey Gu, Serena Yeung-Levy
Contact: jeffgu@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: hypernetworks, neural fields, implicit neural representations, generalizable neural fields, foundation models
Generative Representational Instruction Tuning
Authors: Niklas Muennighoff, Hongjin Su, Liang Wang, Nan Yang, Furu Wei, Tao Yu, Amanpreet Singh, Douwe Kiela
Contact: niklasm@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: large language models, instruction tuning, text embedding
KernelBench: Can LLMs Write Efficient GPU Kernels?
Authors: Anne Ouyang*, Simon Guo*, Simran Arora, Alex L. Zhang, William Hu, Christopher Ré, Azalia Mirhoseini
Contact: simonguo@stanford.edu
Workshop: Workshop
Award nominations: Best Paper - Deep Learning for Code Workshop
Links: Paper | Blog Post | Website
Keywords: code generation, ml systems, gpu kernels, benchmark
Learning Efficient Positional Encodings with Graph Neural Networks
Authors: Charilaos Kanatsoulis, Evelyn Choi, Stefanie Jegelka, Jure Leskovec, Alejandro Ribeiro
Contact: charilaos@cs.stanford.edu
Workshop: Main Conference
Links: Paper
Keywords: graph transformers, positional encodings, graph neural networks
LoLCATs: On Low-Rank Linearizing of Large Language Models
Authors: Michael Zhang, Simran Arora, Rahul Chalamala, Benjamin Frederick Spector, Alan Wu, Krithik Ramesh, Aaryan Singhal, Christopher Ré
Contact: mzhang@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Blog Post
Keywords: llms, efficient architectures, attention
MMTEB: Massive Multilingual Text Embedding Benchmark
Authors: Kenneth Enevoldsen, Isaac Chung, Imene Kerboua, Márton Kardos, Ashwin Mathur, David Stap, Jay Gala, Wissam Siblini, Dominik Krzemiński, Genta Indra Winata, Saba Sturua, Saiteja Utpala, Mathieu Ciancone, Marion Schaeffer, Diganta Misra, Shreeya Dhakal, Jonathan Rystrøm, Roman Solomatin, Ömer Veysel Çağatan, Akash Kundu, Martin Bernstorff, Shitao Xiao, Akshita Sukhlecha, Bhavish Pahwa, Rafał Poświata, Kranthi Kiran GV, Shawon Ashraf, Daniel Auras, Björn Plüster, Jan Philipp Harries, Loïc Magne, Isabelle Mohr, Dawei Zhu, Hippolyte Gisserot-Boukhlef, Tom Aarsen, Jan Kostkan, Konrad Wojtasik, Taemin Lee, Marek Suppa, Crystina Zhang, Roberta Rocca, Mohammed Hamdy, Andrianos Michail, John Yang, Manuel Faysse, Aleksei Vatolin, Nandan Thakur, Manan Dey, Dipam Vasani, Pranjal A Chitale, Simone Tedeschi, Nguyen Tai, Artem Snegirev, Mariya Hendriksen, Michael Günther, Mengzhou Xia, Weijia Shi, Xing Han Lù, Jordan Clive, Gayatri K, Maksimova Anna, Silvan Wehrli, Maria Tikhonova, Henil Shalin Panchal, Aleksandr Abramov, Malte Ostendorff, Zheng Liu, Simon Clematide, Lester James Validad Miranda, Alena Fenogenova, Guangyu Song, Ruqiya Bin Safi, Wen-Ding Li, Alessia Borghini, Federico Cassano, Lasse Hansen, Sara Hooker, Chenghao Xiao, Vaibhav Adlakha, Orion Weller, Siva Reddy, Niklas Muennighoff
Contact: kenneth.enevoldsen@cas.au.dk
Workshop: Main Conference
Links: Paper | Website
Keywords: natural language processing, benchmark, sentence embeddings, multilingual
Mechanistic Interpretability Meets Vision Language Models: Insights and Limitations
Authors: Yiming Liu*, Yuhui Zhang*, Serena Yeung-Levy
Contact: yuhuiz@stanford.edu
Workshop: Blog Track
Links: Paper
Keywords: vision language models, mechanistic interpretability
Model Equality Testing: Which Model is this API Serving?
Authors: Irena Gao, Percy Liang, Carlos Guestrin
Contact: irena@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: api monitoring, model shift, two-sample testing
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Authors: Julie Kallini, Shikhar Murty, Christopher D. Manning, Christopher Potts, Róbert Csordás
Contact: kallini@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: nlp, byt5, t5, tokenization, byte-level language models, character-level language models
OLMoE: Open Mixture-of-Experts Language Models
Authors: Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Evan Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, Hannaneh Hajishirzi
Contact: niklasm@stanford.edu
Workshop: Main Conference
Award nominations: Oral
Links: Paper | Website
Keywords: large language models, mixture-of-experts, open-source
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
Authors: Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng, Heng Ji, Graham Neubig
Contact: xingyao6@illinois.edu, gneubig@cs.cmu.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: ai agents, evaluation, infrastructure, benchmark
Predicate Hierarchies Improve Few-Shot State Classification
Authors: Emily Jin*, Joy Hsu*, Jiajun Wu
Contact: emilyjin@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: few-shot state classification, predicate hierarchies
Real2Code: Reconstruct Articulated Objects via Code Generation
Authors: Zhao Mandi, Yijia Weng, Dominik Bauer, Shuran Song
Contact: mandi@stanford.edu
Workshop: Main Conference
Links: Paper | Blog Post | Website
Keywords: code llms, articulated objects, digital twins, foundation models
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
Authors: Sheng Liu, Haotian Ye, James Zou
Contact: shengl@stanford.edu
Workshop: Main Conference
Award nominations: Spotlight
Links: Paper | Website
Keywords: hallucination, multimodal language model, large language model
RegMix: Data Mixture as Regression for Language Model Pre-training
Authors: Qian Liu, Xiaosen Zheng, Niklas Muennighoff, Guangtao Zeng, Longxu Dou, Tianyu Pang, Jing Jiang, Min Lin
Contact: liuqian.sea@gmail.com
Workshop: Main Conference
Award nominations: Spotlight
Links: Paper | Website
Keywords: language model pre-training, data mixture, regression
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
Authors: John Yang, Carlos E. Jimenez, Alex L. Zhang, Kilian Lieret, Joyce Yang, Xindi Wu, Ori Press, Niklas Muennighoff, Gabriel Synnaeve, Karthik R. Narasimhan, Diyi Yang, Sida I. Wang, Ofir Press
Contact: johnby@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: language models, natural language processing, software engineering
Scaling Laws for Precision
Authors: Tanishq Kumar, Zachary Ankner, Benjamin Frederick Spector, Blake Bordelon, Niklas Muennighoff, Mansheej Paul, Cengiz Pehlevan, Christopher Ré, Aditi Raghunathan
Contact: tkumar@college.harvard.edu
Workshop: Main Conference
Award nominations: Oral
Links: Paper
Keywords: quantization, scaling laws, precision, language models
Societal Impacts Research Requires Benchmarks for Creative Composition Tasks
Authors: Judy Hanwen Shen, Carlos Guestrin
Contact: jhshen@stanford.edu
Workshop: Workshop
Award nominations: Oral Presentation
Links: Paper
Keywords: societal impacts, creativity, position paper
Synthetic Continued Pretraining
Authors: Zitong Yang*, Neil Band*, Shuangping Li, Emmanuel Candès, Tatsunori Hashimoto
Contact: zitong@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: synthetic data, continued pretraining
TEOChat: Large Language and Vision Assistant for Temporal Earth Observation Data
Authors: Jeremy Andrew Irvin, Emily Ruoyu Liu, Joyce Chuyi Chen, Ines Dormoy, Jinyoung Kim, Samar Khanna, Zhuo Zheng, Stefano Ermon
Contact: jirvin16@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: vision-language model, large multimodal model, satellite imagery, earth observation, change detection
TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation
Authors: Juntong Shi, Minkai Xu, Harper Hua, Hengrui Zhang, Stefano Ermon, Jure Leskovec
Contact: minkai@cs.stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: tabular representation learning, generative models, diffusion models
The Utility and Complexity of in- and out-of-Distribution Machine Unlearning
Authors: Youssef Allouah, Joshua Kazdan, Rachid Guerraoui, Sanmi Koyejo
Contact: youssef.allouah@epfl.ch
Workshop: Main Conference
Links: Paper
Keywords: machine unlearning, differential privacy, optimization, theory, right to be forgotten
Think, Prune, Train, Improve: Scaling Reasoning Without Scaling Models
Authors: Caia Costello
Contact: caia@stanford.edu
Workshop: Workshop
Links: Paper
Keywords: fine-tuning, code generation, synthetic data, self-improvement, reasoning
TopoLM: brain-like spatio-functional organization in a topographic language model
Authors: Neil Rathi, Johannes Mehrer, Badr AlKhamissi, Taha Osama A Binhuraib, Nicholas Blauch, Martin Schrimpf
Contact: rathi@stanford.edu
Workshop: Main Conference
Award nominations: Oral
Links: Paper | Website
Keywords: language modeling, topography, fmri, neuroscience
Video Action Differencing
Authors: James Burgess, Xiaohan Wang, Yuhui Zhang, Anita Rau, Alejandro Lozano, Lisa Dunlap, Trevor Darrell, Serena Yeung-Levy
Contact: jmhb@stanford.edu
Workshop: Main Conference
Links: Paper | Blog Post | Website
Keywords: video, action, comparison, lvm, lmm, benchmark
What Makes a Maze Look Like a Maze?
Authors: Joy Hsu, Jiayuan Mao, Joshua B. Tenenbaum, Noah D. Goodman, Jiajun Wu
Contact: joycj@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: visual reasoning, abstract concepts, schemas
What’s the Move? Hybrid Imitation Learning via Salient Points
Authors: Priya Sundaresan*, Hengyuan Hu*, Quan Vuong, Jeannette Bohg, Dorsa Sadigh
Contact: priyasun@stanford.edu
Workshop: Main Conference
Links: Paper | Website
Keywords: imitation learning, robot learning, robot manipulation, robotics
s1: Simple test-time scaling
Authors: Niklas Muennighoff, Zitong Yang, Weijia Shi, Xiang Lisa Li, Li Fei-Fei, Hannaneh Hajishirzi, Luke Zettlemoyer, Percy Liang, Emmanuel Candès, Tatsunori Hashimoto
Contact: niklasm@stanford.edu
Workshop: Workshop
Award nominations: Oral
Links: Paper | Video | Website
Keywords: test-time scaling, reasoning, large language models
“I Am the One and Only, Your Cyber BFF”: Understanding the Impact of GenAI Requires Understanding the Impact of Anthropomorphic AI
Authors: Myra Cheng, Alicia DeVrio, Lisa Egede, Su Lin Blodgett, Alexandra Olteanu
Contact: myra1@stanford.edu
Workshop: Blogposts Track
Keywords: anthropomorphism, societal impacts
We look forward to seeing you at ICLR!