Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter

Michael Danielczuk, Andrey Kurenkov, Ashwin Balakrishna, Matthew Matl, David Wang, Roberto Martín-Martín, Animesh Garg, Silvio Savarese, Ken Goldberg

This website accompanies the paper Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter, published at the IEEE International Conference on Robotics and Automation (ICRA 2019). In the future, it will be updated with links to code, data, and other supplementary material.


When operating in unstructured environments such as warehouses, homes, and retail centers, robots are frequently required to interactively search for and retrieve specific objects from cluttered bins, shelves, or tables. Mechanical Search describes the class of tasks where the goal is to locate and extract a known target object. In this paper, we formalize Mechanical Search and study a version where distractor objects are heaped over the target object in a bin. The robot uses an RGBD perception system and control policies to iteratively select, parameterize, and perform one of three actions -- push, suction, grasp -- until the target object is extracted, a time limit is exceeded, or no high-confidence push or grasp is available. We present a study of 5 algorithmic policies for Mechanical Search, with 15,000 simulated trials and 300 physical trials for heaps ranging from 10 to 20 objects. Results suggest that success can be achieved in this long-horizon task with algorithmic policies in over 95% of instances, and that the number of actions required scales approximately linearly with the size of the heap.
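The iterative select-and-act loop described above can be sketched as follows. This is a minimal illustration only, not the paper's implementation: the function names (`perceive`, `plan_actions`, `execute`), the confidence threshold, and the step budget are all hypothetical placeholders.

```python
# Hedged sketch of the Mechanical Search loop: repeatedly perceive the bin,
# propose parameterized push/suction/grasp actions with confidences, execute
# the most confident one, and stop on extraction, step limit, or no confident
# action. All names here are illustrative assumptions, not the paper's API.

def mechanical_search(perceive, plan_actions, execute,
                      max_steps=20, min_confidence=0.5):
    """Run the search loop until the target is extracted, the step
    budget is exhausted, or no sufficiently confident action remains."""
    for step in range(max_steps):
        obs = perceive()  # RGBD observation of the current heap
        if obs["target_extracted"]:
            return {"success": True, "steps": step}
        # Each candidate is (action_type, parameters, confidence).
        candidates = plan_actions(obs)
        confident = [a for a in candidates if a[2] >= min_confidence]
        if not confident:
            return {"success": False, "reason": "no confident action"}
        best = max(confident, key=lambda a: a[2])  # highest-confidence action
        execute(best)
    return {"success": False, "reason": "step limit exceeded"}
```

The termination conditions mirror the three outcomes named in the abstract: target extracted, time (step) limit exceeded, or no high-confidence action available.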


The arXiv paper can be found here.

If you find this work useful in your research, please cite as follows:

@inproceedings{danielczuk2019mechanical,
  title={Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter},
  author={Danielczuk, Michael and Kurenkov, Andrey and Balakrishna, Ashwin and Matl, Matthew and Wang, David and Martín-Martín, Roberto and Garg, Animesh and Savarese, Silvio and Goldberg, Ken},
  booktitle={Proc. IEEE Int. Conf. Robotics and Automation (ICRA)},
  year={2019}
}



This work was partially supported by a Google Focused Research Award and was performed jointly at the AUTOLAB at UC Berkeley and at the Stanford Vision & Learning Lab, in affiliation with the Berkeley AI Research (BAIR) Lab, Berkeley Deep Drive (BDD), the Real-Time Intelligent Secure Execution (RISE) Lab, and the CITRIS "People and Robots" (CPAR) Initiative. The authors were also supported by the SAIL-Toyota Research initiative, the Scalable Collaborative Human-Robot Learning (SCHooL) Project, and the NSF National Robotics Initiative Award 1734633, and in part by donations from Siemens, Google, Amazon Robotics, Toyota Research Institute, Autodesk, ABB, Knapp, Loccioni, Honda, Intel, Comcast, Cisco, and Hewlett-Packard, and by equipment grants from PhotoNeo and NVIDIA. This article solely reflects the opinions and conclusions of its authors and does not reflect the views of the sponsors or their associated entities. We thank our colleagues who provided helpful feedback, code, and suggestions, in particular Jeff Mahler.