When groups of robots work together, their actions communicate valuable information. We introduce a collaborative learning and control strategy that enables robots to harness the information contained within their partners' actions.
The NLP community has made great progress on open-domain QA, but our systems still struggle to answer complex open-domain questions over a large collection of text. We present an efficient and explainable method for enabling multi-step reasoning in these systems.
We present AC-Teach, a unifying approach that leverages advice from an ensemble of sub-optimal teachers to accelerate the learning process of actor-critic reinforcement learning agents.
We introduce a new method for reinforcement learning that achieves minimax-optimal probably approximately correct (PAC) and regret bounds, matching the statistical worst-case lower bounds in the dominating terms.
Topology is a combinatorial property that is tricky to utilize in gradient-based methods, but it is also a useful and underexploited feature of data. We present an easy-to-use TopologyLayer that allows for backpropagation through a loss based on Persistent Homology.
We examine multiple attributes of generated text and conduct human evaluations of multiple aspects of conversational quality, in order to investigate how effectively we can control these attributes and how they affect conversational quality and chatbot performance.
We introduce the problem of real-time routing for an autonomous vehicle that can use multiple modes of transportation through other vehicles in the area. We also propose a scalable and performant planning algorithm for solving such problems.
QuizBot is an AI-powered chatbot that helps college students review questions through natural-language conversations. Our experimental results suggest that educational chatbot systems may be beneficial, particularly for learning outside of traditional settings.
When learning from humans, we typically use data from only one form of human feedback. In this work, we investigate whether we can leverage data from multiple modes of feedback to learn more effectively from humans.