Compositionality in Computer Vision
June 15th, Held in conjunction with CVPR 2020 in Seattle, US
People understand the world as a sum of its parts. Events are composed of actions, objects can be broken down into pieces, and this sentence is composed of a series of words. When presented with new concepts, people can decompose the novelty into familiar parts. Our knowledge representation is naturally compositional. Unfortunately, many of the architectures underlying modern vision systems generate representations that are not compositional.
In our workshop, we will discuss compositionality in computer vision --- the notion that the representation of the whole should be composed of the representations of its parts. As humans, our perception is deeply intertwined with reasoning through composition: we understand a scene by its components, a 3D shape by its parts, an activity by its events, and so on. We hypothesize that intelligent agents likewise need to develop a compositional understanding that is robust, generalizable, and powerful. In computer vision, there is a long-standing line of work based on semantic compositionality, such as part-based object recognition. Pioneering statistical modeling approaches have built hierarchical feature representations for numerous vision tasks. More recently, work has demonstrated that concepts can be learned from only a few examples using a compositional representation. As we move towards higher-level reasoning tasks, our workshop aims to revisit these ideas and reflect on the future directions of compositionality.
At the workshop, we would like to discuss the following questions. How should we represent composition in scenes, videos, 3D spaces, and robotics? How can human perception shed light on compositional understanding algorithms? What are the benefits of exploring compositionality? What structures, architectures, and learning algorithms help models learn compositionality? How do we find the balance between compositional and black-box understanding? What problems exist in current compositional understanding methods, and how can we remedy them? What efforts should our community make in the future? What inductive biases can be built into our architectures to improve few-shot learning, meta-learning, and compositional decomposition?
To receive notifications about updates related to the workshop, sign up using this form.
Program Schedule - TBA
Important Dates and Details
Call for Papers
This workshop aims to bring together researchers from both academia and industry interested in addressing various aspects of compositional understanding in computer vision. The domains include but are not limited to scene understanding, video analysis, 3D vision and robotics. For each of these domains, we will discuss the following topics:
- Algorithmic approaches: How should we develop and improve representations of compositionality for learning, such as graph embeddings, message-passing neural networks, probabilistic models, etc.?
- Evaluation methods: What are convincing metrics for measuring the robustness, generalizability, and accuracy of compositional understanding algorithms?
- Cognitive aspects: How can cognitive science research inspire computational models to capture compositionality as humans do?
- Optimization and scalability challenges: How should we handle the heterogeneous representations of different components and the curse of dimensionality in graph-based data? How can we effectively collect large-scale datasets for training multi-task models?
- Domain-specific applications: How should we improve scene graph generation, spatio-temporal-graph-based action recognition, structural 3D recognition and reconstruction, meta-learning, reinforcement learning, etc.?
- Any other topic of interest for compositionality in computer vision.
Submit in this CMT portal: cmt3.research.microsoft.com/CICV2020
We provide three submission tracks; please submit to the one you prefer:
- Archival full paper track. The length limit is 4-8 pages excluding references. The format is the same as for CVPR'20 main conference submissions (template). Accepted papers in this track will be published in the CVPR workshop proceedings and on IEEE Xplore. These papers will also appear in the CVF open access archive.
- Non-archival short paper track. The length limit is 4 pages including references. The format is the same as for CVPR'20 main conference submissions (template) but shorter in length. Accepted papers in this track will NOT be published in the CVPR workshop proceedings but will be made public on this workshop website. Note that accepted papers in this non-archival short paper track do not conflict with the dual submission policy of ECCV'20.
- Non-archival long paper track. This track is only for previously published papers or papers to appear at the CVPR'20 main conference. There is no page limit. Accepted papers in this track will NOT be published in the CVPR workshop proceedings.
The submission deadline for all tracks has been extended to April 3rd, 2020 at 11:59 pm PST due to the COVID-19 situation. Author notifications will be sent out on April 10th, 2020. The camera-ready deadline is April 18th, 2020.
All accepted papers will be presented as posters. Oral presentations will be selected from among the accepted papers.
Please contact Jingwei Ji or Ranjay Krishna with any questions: jingweij / ranjaykrishna [at] cs [dot] stanford [dot] edu.
- Shyamal Buch - Stanford University
- Chien-Yi Chang - Stanford University
- Apoorva Dornadula - Stanford University
- Yong-Lu Li - Shanghai Jiao Tong University
- Bingbin Liu - Carnegie Mellon University
- Karttikeya Mangalam - University of California, Berkeley
- Kaichun Mo - Stanford University
- Samsom Saju - Mindtree
- Gunnar Sigurdsson - Carnegie Mellon University
- Paroma Varma - Stanford University
- More to come...
If you are interested in taking a more active part in the workshop, apply to join the program committee using this link.