The Computer Vision and Pattern Recognition Conference (CVPR) 2023 is being hosted in Vancouver, Canada on June 18th - 22nd. We’re excited to share all the work from SAIL that’s being presented, and you’ll find links to papers, videos, and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!

List of Accepted Papers

Finetune like you pretrain: Improved finetuning of zero-shot vision models

Authors: Sachin Goyal, Ananya Kumar, Sankalp Garg, Zico Kolter, Aditi Raghunathan
Contact: sachingo@andrew.cmu.edu
Keywords: robust fine-tuning, multimodal, robustness, clip, transfer learning


Multi-Object Manipulation via Object-Centric Neural Scattering Functions

Authors: Stephen Tian*, Yancheng Cai*, Hong-Xing Yu, Sergey Zakharov, Katherine Liu, Adrien Gaidon, Yunzhu Li, Jiajun Wu
Contact: tians@stanford.edu
Links: Video | Website
Keywords: dynamics models, neural rendering, robotic manipulation


Accidental Light Probes

Authors: Hong-Xing Yu, Samir Agarwala, Charles Herrmann, Richard Szeliski, Noah Snavely, Jiajun Wu, Deqing Sun
Contact: koven@cs.stanford.edu
Links: Paper | Video | Website
Keywords: inverse rendering, lighting estimation


CARTO: Category and Joint Agnostic Reconstruction of ARTiculated Objects

Authors: Nick Heppert, Muhammad Zubair Irshad, Sergey Zakharov, Katherine Liu, Rares Andrei Ambrus, Jeannette Bohg, Abhinav Valada, Thomas Kollar
Contact: heppert@cs.uni-freiburg.de
Links: Paper | Video | Website
Keywords: single-shot 3d reconstruction, articulated objects


CIRCLE: Capture in Rich Contextual Environments

Authors: João Pedro Araújo, Jiaman Li, Karthik Vetrivel, Rishi Agarwal, Jiajun Wu, Deepak Gopinath, Alexander William Clegg, C. Karen Liu
Contact: jparaujo@stanford.edu
Links: Paper | Website
Keywords: motion capture, motion generation, virtual reality, egocentric video


EDGE: Editable Dance Generation from Music

Authors: Jonathan Tseng, Rodrigo Castellon, C. Karen Liu
Contact: jtseng20@stanford.edu
Links: Paper | Website
Keywords: motion, diffusion, music, dance, editing


EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision

Authors: Jiahui Lei, Congyue Deng, Karl Schmeckpeper, Leonidas Guibas, Kostas Daniilidis
Contact: leijh@cis.upenn.edu
Links: Paper | Video | Website
Keywords: point cloud segmentation, equivariance, weakly-supervised learning


Ego-Body Pose Estimation via Ego-Head Pose Estimation

Authors: Jiaman Li, C. Karen Liu†, Jiajun Wu†
Contact: jiamanli@stanford.edu
Award nominations: Award Candidate
Links: Paper | Video | Website
Keywords: egocentric video, human motion estimation, decomposition, conditional diffusion


GINA-3D: Learning to Generate Implicit Neural Assets in the Wild

Authors: Bokui Shen, Xinchen Yan, Charles R. Qi, Mahyar Najibi, Boyang Deng, Leonidas Guibas, Yin Zhou, Dragomir Anguelov
Contact: willshen@stanford.edu
Links: Paper | Video | Website
Keywords: generative ai, autonomous driving, generative 3d, simulation assets


NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations

Authors: Joy Hsu, Jiayuan Mao, and Jiajun Wu
Contact: joycj@stanford.edu
Links: Paper | Website
Keywords: neuro-symbolic learning, visual reasoning, 3d grounding


NeMo: 3D Neural Motion Fields from Multiple Video Instances of the Same Action

Authors: Kuan-Chieh Wang, Zhenzhen Weng, Maria Xenochristou, João Pedro Araújo, Jeffrey Gu, C. Karen Liu, Serena Yeung
Contact: wangkua1@stanford.edu
Links: Paper | Video | Website
Keywords: human mesh recovery, human motion, 3d vision, neural field


NeRDi: Single-View NeRF Synthesis With Language-Guided Diffusion As General Image Priors

Authors: Congyue Deng, Chiyu “Max” Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir Anguelov
Contact: congyue@stanford.edu
Links: Paper | Video | Website
Keywords: nerf, diffusion model, single view to 3d


PROB: Probabilistic Objectness for Open World Object Detection

Authors: Orr Zohar, Kuan-Chieh Wang, Serena Yeung
Contact: orrzohar@stanford.edu
Links: Paper | Video | Website
Keywords: open world learning, open world object detection, object detection, class-agnostic object detection


Partial-View Object View Synthesis via Filtering Inversion

Authors: Fan-Yun Sun, Jonathan Tremblay, Valts Blukis, Kevin Lin, Danfei Xu, Boris Ivanovic, Peter Karkus, Stan Birchfield, Dieter Fox, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Marco Pavone, Nick Haber
Contact: fanyun@stanford.edu
Links: Paper | Website
Keywords: view synthesis, partial-view, filtering inversion, gan


Putting People in Their Place: Affordance-Aware Human Insertion into Scenes

Authors: Sumith Kulal, Tim Brooks, Alex Aiken, Jiajun Wu, Jimei Yang, Jingwan Lu, Alexei A. Efros, Krishna Kumar Singh
Contact: sumith@stanford.edu
Links: Paper | Website
Keywords: affordances, self-supervision, image synthesis, editing


RealImpact: A Dataset of Impact Sound Fields for Real Objects

Authors: Samuel Clarke, Ruohan Gao, Mason Wang, Mark Rau, Julia Xu, Jui-Hsien Wang, Doug L. James, Jiajun Wu
Contact: spclarke@stanford.edu
Award nominations: Highlight
Links: Paper | Video | Website
Keywords: audio processing, acoustic learning, multimodal data, sound


SCADE: NeRFs from Space Carving with Ambiguity-Aware Depth Estimates

Authors: Mikaela Angelina Uy, Ricardo Martin-Brualla, Leonidas Guibas, Ke Li
Contact: mikacuy@stanford.edu
Links: Paper | Video | Website
Keywords: nerfs, sparse view, monocular depth, cimle, distribution, ambiguity


Seeing a Rose in Five Thousand Ways

Authors: Yunzhi Zhang, Shangzhe Wu, Noah Snavely, Jiajun Wu
Contact: yzzhang@stanford.edu
Links: Paper | Video | Website
Keywords: generative modelling, inverse rendering, gan, image generation, 3d reconstruction


The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects

Authors: Ruohan Gao*, Yiming Dou*, Hao Li*, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu
Contact: rhgao@cs.stanford.edu
Links: Paper | Video | Website
Keywords: multisensory, benchmark, object-centric learning


We look forward to seeing you at CVPR!