Compositionality in Computer Vision
Held in conjunction with CVPR 2020 in Seattle, US
People understand the world as a sum of its parts. Events are composed of other actions, objects can be broken down into pieces, and this sentence is composed of a series of words. When presented with new concepts, people can decompose the novelty into familiar parts. Our knowledge representation is naturally compositional. Unfortunately, many of the underlying architectures that catalyze vision tasks generate representations that are not compositional.
In our workshop, we will discuss compositionality in computer vision: the notion that the representation of the whole should be composed of the representations of its parts. As humans, our perception is closely intertwined with compositional reasoning: we understand a scene by its components, a 3D shape by its parts, an activity by its events, etc. We hypothesize that intelligent agents also need to develop compositional understanding that is robust, generalizable, and powerful. In computer vision, there has been a long-standing line of work based on semantic compositionality, such as part-based object recognition. Pioneering statistical modeling approaches have built hierarchical feature representations for numerous vision tasks. More recently, work has demonstrated that concepts can be learned from only a few examples using a compositional representation. As we move towards higher-level reasoning tasks, our workshop aims to revisit the idea and reflect on the future directions of compositionality.
At the workshop, we would like to discuss the following questions. How should we represent composition in scenes, videos, 3D spaces and robotics? How can human perception shed light on compositional understanding algorithms? What are the benefits of exploring compositionality? What structures, architectures and learning algorithms help models learn compositionality? How do we find the balance between compositional and black-box understanding? What problems are there in the current compositional understanding methods and how can we remedy them? What efforts should our community make in the future? What inductive biases can be built into our architectures to improve few-shot learning, meta-learning and compositional decomposition?
To receive notifications about updates related to the workshop, sign up using this form.
Program Schedule - TBA
Call for Papers
This workshop aims to bring together researchers from both academia and industry interested in addressing various aspects of compositional understanding in computer vision. The domains include but are not limited to scene understanding, video analysis, 3D vision and robotics. For each of these domains, we will discuss the following topics:
- Algorithmic approaches: How should we develop and improve representations of compositionality for learning, such as graph embedding, message-passing neural networks, probabilistic models, etc.?
- Evaluation methods: What are convincing metrics for measuring the robustness, generalizability, and accuracy of compositional understanding algorithms?
- Cognitive aspects: How can cognitive science research inspire computational models that capture compositionality as humans do?
- Optimization and scalability challenges: How should we handle the inherent representations of different components and the curse of dimensionality of graph-based data? How should we effectively collect large-scale databases for training multi-tasking models?
- Domain-specific applications: How should we improve scene graph generation, spatio-temporal-graph-based action recognition, structural 3D recognition and reconstruction, meta-learning, reinforcement learning, etc.?
- Any other topic of interest for compositionality in computer vision.
We invite researchers and practitioners to submit their work through a CMT portal (TBA).
Please contact Jingwei Ji or Ranjay Krishna with any questions: jingweij / ranjaykrishna [at] cs [dot] stanford [dot] edu.
Program Committee
- Shyamal Buch - Stanford University
- Chien-Yi Chang - Stanford University
- Apoorva Dornadula - Stanford University
- Yong-Lu Li - Shanghai Jiao Tong University
- Bingbin Liu - Carnegie Mellon University
- Karttikeya Mangalam - University of California, Berkeley
- Kaichun Mo - Stanford University
- Samsom Saju - Mindtree
- Gunnar Sigurdsson - Carnegie Mellon University
- Paroma Varma - Stanford University
- More to come...
If you are interested in taking a more active part in the workshop, apply to join the program committee using this link.