The Conference on Computer Vision and Pattern Recognition (CVPR) 2022 is taking place June 19-24. We’re excited to share all the work from SAIL that’s being presented, and you’ll find links to papers, videos, and blog posts below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!
List of Accepted Papers
Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction
Authors: Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua Tenenbaum, Chuang Gan
Contact: kaichun@cs.stanford.edu
Links: Paper | Video | Website
Keywords: fixing malfunctional 3d shapes, shape functionality, dynamic model
Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior
Authors: Davis Rempe, Jonah Philion, Leonidas Guibas, Sanja Fidler, Or Litany
Contact: drempe@stanford.edu
Links: Paper | Website
Keywords: autonomous vehicles, adversarial scenario generation, traffic simulation
Measuring Compositional Consistency for Video Question Answering
Authors: Mona Gandhi, Mustafa Omer Gul, Eva Prakash, Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala
Contact: momergul@alumni.stanford.edu
Links: Paper | Video | Website
Keywords: compositionality, video question answering, evaluation, dataset, metrics
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
Authors: Weixin Liang*, Yuhui Zhang*, Yongchan Kwon*, Serena Yeung, James Zou
Contact: yuhuiz@stanford.edu
Links: Paper | Website
Keywords: multi-modal representation learning, contrastive representation learning, cone effect, modality gap
Multi-Objective Diverse Human Motion Prediction with Knowledge Distillation
Authors: Hengbo Ma, Jiachen Li, Ramtin Hosseini, Masayoshi Tomizuka, Chiho Choi
Contact: hengbo_ma@berkeley.edu; jiachen_li@stanford.edu
Award nominations: Oral Presentation
Links: Paper
Keywords: human motion prediction, robotics
ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer
Authors: Ruohan Gao*, Zilin Si*, Yen-Yu Chang*, Samuel Clarke, Jeannette Bohg, Li Fei-Fei, Wenzhen Yuan, Jiajun Wu
Contact: rhgao@cs.stanford.edu
Links: Paper | Video | Website
Keywords: multisensory, object, dataset, sim2real
PartGlot: Learning Shape Part Segmentation from Language Reference Games
Authors: Juil Koo, Ian Huang, Panos Achlioptas, Leonidas Guibas, Minhyuk Sung
Contact: ianhuang@stanford.edu
Links: Paper | Video | Website
Keywords: language grounding, semantic part segmentation, multimodal learning, natural language processing, 3d vision
Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders
Authors: Mikaela Angelina Uy*, Yen-Yu Chang*, Minhyuk Sung, Purvi Goel, Joseph Lambourne, Tolga Birdal, Leonidas Guibas
Contact: mikacuy@stanford.edu
Links: Paper | Video | Website
Keywords: reverse engineering, cad, shape modeling, editing, segmentation, point clouds
Programmatic Concept Learning for Human Motion Description and Synthesis
Authors: Sumith Kulal*, Jiayuan Mao*, Alex Aiken§, Jiajun Wu§
Contact: sumith@cs.stanford.edu
Links: Paper | Website
Keywords: hierarchical representation, human motion, video understanding, video synthesis
Revisiting the “Video” in Video-Language Understanding
Authors: Shyamal Buch, Cristóbal Eyzaguirre, Adrien Gaidon, Jiajun Wu, Li Fei-Fei, Juan Carlos Niebles
Contact: shyamal@cs.stanford.edu
Award nominations: Oral Presentation
Links: Paper | Website
Keywords: video understanding, vision and language, multimodal
Rotationally Equivariant 3D Object Detection
Authors: Hong-Xing Yu, Jiajun Wu, Li Yi
Contact: koven@cs.stanford.edu
Links: Paper | Video | Website
Keywords: rotation equivariance, detection, object
We look forward to seeing you at CVPR!