The Conference on Computer Vision and Pattern Recognition (CVPR) 2021 is being hosted virtually from June 19th to June 25th. We’re excited to share all the work from SAIL that’s being presented; you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!

List of Accepted Papers

GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving

Authors: Yun Chen*, Frieda Rong*, Shivam Duggal*, Shenlong Wang, Xinchen Yan, Sivabalan Manivasagam, Shangjie Xue, Ersin Yumer, Raquel Urtasun
Award nominations: Oral, Best Paper Finalist
Links: Paper | Video | Website
Keywords: computer vision, simulation, image simulation, video simulation, self-driving, autonomous driving, 3d vision, computer graphics, robotics

Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction

Authors: Bohan Wu, Suraj Nair, Roberto Martin-Martin, Li Fei-Fei*, Chelsea Finn*
Keywords: variational autoencoders, video prediction

AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning

Authors: Madeleine Grunde-McLaughlin
Links: Paper | Video | Website
Keywords: visual question answering, compositionality, computer vision, benchmark

ArtEmis: Affective Language for Visual Art

Authors: Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas Guibas
Award nominations: Oral
Links: Paper | Video | Website
Keywords: affective-computing, wikiart, neural-speakers, emotions

DARCNN: Domain Adaptive Region-based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images

Authors: Joy Hsu, Wah Chiu, Serena Yeung
Links: Paper | Website
Keywords: unsupervised domain adaptation, instance segmentation

Hierarchical Motion Understanding via Motion Programs

Authors: Sumith Kulal*, Jiayuan Mao*, Alex Aiken, Jiajun Wu
Links: Paper | Video | Website
Keywords: neuro-symbolic, motion, primitives, programs

Home Action Genome: Cooperative Compositional Action Understanding

Authors: Nishant Rai
Links: Paper | Website
Keywords: multi modal, multi camera view, multi perspective, action recognition, action localization, atomic actions, scene graphs, contrastive learning, audio-visual, large scale dataset

Joint Learning of 3D Shape Retrieval and Deformation

Authors: Mikaela Angelina Uy, Vladimir G. Kim, Minhyuk Sung, Noam Aigerman, Siddhartha Chaudhuri, Leonidas Guibas
Links: Paper | Video | Website
Keywords: joint learning, retrieval, deformation

Metadata Normalization

Authors: Mandy Lu, Qingyu Zhao, Jiequan Zhang, Kilian M. Pohl, Li Fei-Fei, Juan Carlos Niebles, Ehsan Adeli
Links: Paper | Website
Keywords: metadata, normalization, bias, deep learning, bias-free feature learning

We look forward to seeing you at CVPR 2021!