Postdoctoral Research Fellow
Department of Computer Science
Stanford University
Email: rhgao[AT]cs[DOT]stanford[DOT]edu
I am a SAIL Postdoctoral Fellow working with Prof. Jiajun Wu, Prof. Fei-Fei Li, and Prof. Silvio Savarese at the Stanford Vision and Learning Lab. I received my Ph.D. from The University of Texas at Austin, advised by Prof. Kristen Grauman, and my B.Eng. from The Chinese University of Hong Kong. My research interests are mainly in computer vision and machine learning. In particular, I am interested in multisensory learning with sight, sound, and touch. My research goal is to teach machines to see, hear, and feel like humans so that they can perceive, understand, and interact with the multisensory world.
The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects
Ruohan Gao*, Yiming Dou*, Hao Li*, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
PDF
Project Page
ObjectFolder Real Demo
Code
RealImpact: A Dataset of Impact Sound Fields for Real Objects
Samuel Clarke, Ruohan Gao, Mason Wang, Mark Rau, Julia Xu, Jui-Hsien Wang, Doug James, Jiajun Wu
Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
(Highlight)
PDF
Supp
Learning Object-centric Neural Scattering Functions for Free-viewpoint Relighting and Scene Composition
Hong-Xing Yu*, Michelle Guo*, Alireza Fathi, Yen-Yu Chang, Eric Ryan Chan, Ruohan Gao, Thomas Funkhouser, Jiajun Wu
Transactions on Machine Learning Research (TMLR), 2023.
PDF
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
Ruohan Gao*, Hao Li*, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Li Fei-Fei, Jiajun Wu
International Conference on Robotics and Automation (ICRA), 2023.
PDF
Project Page
Code
Differentiable Physics Simulation of Dynamics-Augmented Neural Objects
Simon Le Cleac'h, Hong-Xing Yu, Michelle Guo, Taylor A Howell, Ruohan Gao, Jiajun Wu, Zachary Manchester, Mac Schwager
Robotics and Automation Letters (RA-L), 2023.
PDF
An Extensible Multi-modal Multi-task Object Dataset with Materials
Trevor Scott Standley, Ruohan Gao, Dawn Chen, Jiajun Wu, Silvio Savarese
International Conference on Learning Representations (ICLR), 2023.
PDF
See, Hear, Feel: Smart Sensory Fusion for Robotic Manipulation
Hao Li*, Yizhi Zhang*, Junzhe Zhu, Shaoxiong Wang, Michelle A Lee, Huazhe Xu, Edward Adelson, Li Fei-Fei, Ruohan Gao†, Jiajun Wu†
Conference on Robot Learning (CoRL), 2022.
PDF
Supp
Project Page
Code
ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer
Ruohan Gao*, Zilin Si*, Yen-Yu Chang*, Samuel Clarke, Jeannette Bohg, Li Fei-Fei, Wenzhen Yuan, Jiajun Wu.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
PDF
Supp
Project Page
Dataset/Code
Visual Acoustic Matching
Changan Chen, Ruohan Gao, Paul Calamia, Kristen Grauman.
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
(Oral Presentation)
PDF
Project Page
Code
Media Coverage
ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations
Ruohan Gao, Yen-Yu Chang*, Shivani Mall*, Li Fei-Fei, Jiajun Wu.
Conference on Robot Learning (CoRL), 2021.
PDF
Supp
Project Page
Dataset/Code
DiffImpact: Differentiable Rendering and Identification of Impact Sounds
Samuel Clarke, Negin Heravi, Mark Rau, Ruohan Gao, Jiajun Wu, Doug James, Jeannette Bohg.
Conference on Robot Learning (CoRL), 2021.
(Oral Presentation)
PDF
Project Page
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
Rishabh Garg, Ruohan Gao, Kristen Grauman.
British Machine Vision Conference (BMVC), 2021.
(Oral Presentation) [Best Paper Award Runner Up]
PDF
Project Page
Dataset
Look and Listen: From Semantic to Spatial Audio-Visual Perception
Ruohan Gao
Ph.D. Dissertation, UT Austin, 2021.
Michael H. Granof University's Best Doctoral Dissertation Award
UT Austin Outstanding Dissertation Award in Mathematics, Engineering, Physical Science, and Biological and Life Sciences
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao and Kristen Grauman.
Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
PDF
Supp
Project Page
Code
Media Coverage
Learning to Set Waypoints for Audio-Visual Navigation
Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh K. Ramakrishnan, Kristen Grauman.
International Conference on Learning Representations (ICLR), 2021.
PDF
Project Page
Code
VisualEchoes: Spatial Image Representation Learning through Echolocation
Ruohan Gao, Changan Chen, Ziad Al-Halah, Carl Schissler, Kristen Grauman.
European Conference on Computer Vision (ECCV), 2020.
PDF
Supp
Data
Project Page
Listen to Look: Action Recognition by Previewing Audio
Ruohan Gao, Tae-Hyun Oh, Kristen Grauman, Lorenzo Torresani.
Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
PDF
Supp
Poster
Project Page
Code
Co-Separating Sounds of Visual Objects
Ruohan Gao and Kristen Grauman.
International Conference on Computer Vision (ICCV), 2019.
PDF
Supp
Poster
Project Page
Code
2.5D Visual Sound
Ruohan Gao and Kristen Grauman.
Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
(Oral Presentation) [Best Paper Award Finalist]
PDF
Project Page
Dataset
Code
Media Coverage
Oral Video
Learning to Separate Object Sounds by Watching Unlabeled Video
Ruohan Gao, Rogerio Feris, Kristen Grauman.
European Conference on Computer Vision (ECCV), 2018.
(Oral Presentation)
PDF
Supp
Poster
Project Page
Code
Oral Video
ShapeCodes: Self-Supervised Feature Learning by Lifting Views to Viewgrids
Dinesh Jayaraman, Ruohan Gao, Kristen Grauman.
European Conference on Computer Vision (ECCV), 2018.
PDF
Supp
Im2Flow: Motion Hallucination from Static Images for Action Recognition
Ruohan Gao, Bo Xiong, Kristen Grauman.
Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
(Oral Presentation)
PDF
Supp
Poster
Project Page
Code
Oral Video
On-Demand Learning for Deep Image Restoration
Ruohan Gao and Kristen Grauman.
International Conference on Computer Vision (ICCV), 2017.
PDF
Supp
Poster
Project Page
Code
Object-Centric Representation Learning from Unlabeled Videos
Ruohan Gao, Dinesh Jayaraman, Kristen Grauman.
Asian Conference on Computer Vision (ACCV), 2016.
PDF
Poster
Project Page
Ruohan Gao, Huanle Xu, Pili Hu, Wing Cheong Lau, “Accelerating Graph Mining Algorithms via Uniform Random Edge Sampling”, IEEE ICC, 2016. [PDF]
Ruohan Gao, Pili Hu, Wing Cheong Lau, “Graph Property Preservation under Community-Based Sampling”, IEEE Globecom, 2015. [PDF]
Ruohan Gao, Huanle Xu, Pili Hu, Wing Cheong Lau, “Accelerating Graph Mining Algorithms via Uniform Random Edge Sampling (Poster)”, ACM Conference on Online Social Networks (COSN), 2015.