ALL PROJECTS

 

Make3D: Single Image Depth Perception

Learning algorithms to predict depth and infer 3-d models, given just a single still image. Applications include creating immersive 3-d experiences from users' photos, improving the performance of stereovision, creating large-scale models from a few images, and robot navigation.

Selected Papers: NIPS'05, IJCV'07, ICCV-3dRR'07, AAAI-Nectar'08, IEEE-PAMI.
Research/Code/Data: 2005, 2007, data.
Online demo: Make3D.Stanford.edu.

 

STAIR: Mobile Robot Manipulation

Learning algorithms to predict robotic grasps, even for objects of types the robot has never seen before. Applied to tasks such as unloading items from a dishwasher, clearing a cluttered table, and fetching objects.

Selected Papers: NIPS'06, IJRR'08, AAAI'08a, AAAI'08b.
Research/Code/Data: Dishwasher, Cluttered (Barrett)
Details: STAIR, Manipulation group.

 

Cascaded Classification Models: Combining Models for Holistic Scene Understanding

Holistic scene understanding requires solving several tasks simultaneously, including object detection, scene categorization, labeling of meaningful regions, and 3-d reconstruction. We develop a learning method that couples these individual sub-tasks for improving performance in each of them.

Paper: To appear in NIPS'08.
Related: Make3D.

 

STAIR: Opening New Doors

For robots to be practically deployed in home and office environments, they should be able to manipulate their environment to gain access to new spaces. We present learning algorithms to do so, making our robot the first able to navigate anywhere in a new building by opening doors and elevators, even ones it has never seen before.

Selected Papers: RSS Manipulation workshop'08.
Research/Code/Data: Opening New Doors.
Details: STAIR, Manipulation group.

 

STAIR: Optical Proximity Sensors

We propose novel optical proximity sensors for improving grasping. These sensors, mounted on fingertips, allow pre-touch pose estimation, and therefore allow for online grasp adjustments to an initial grasp point without the need for premature object contact or regrasping strategies.

Selected Papers: submitted to ICRA'09.
Details: Project page, Manipulation group.

 

Make3D extension: Large Scale Models from Sparse Views

Create 3-d models of large environments, given only a small number of (possibly) non-overlapping images. This technique integrates Structure from Motion (SFM) techniques with Make3D's single image depth perception algorithms.

Selected Papers: IJCAI'07, ICCV-VRML'07, AAAI-Nectar'08, IEEE-PAMI.
Research/Code/Results: here.

 

Visual Navigation: High speed obstacle avoidance

Use monocular depth perception and reinforcement learning techniques to drive a small RC car at high speeds in unstructured environments.

Selected Papers: ICML'05, IJCV'07.
Research/Code/Data: here.
Video: Youtube.

 

Improving Stereovision using monocular cues

Stereovision is fundamentally limited by the baseline distance between the two cameras; that is, depth estimates tend to be inaccurate when the distances considered are large. We believe that monocular visual cues give largely orthogonal, and therefore complementary, information about depth. We propose a method that combines monocular cues with stereo (triangulation) cues to obtain significantly more accurate depth estimates than is possible with either alone.
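
The baseline limitation can be illustrated with the standard pinhole triangulation relation Z = f * B / d, where f is the focal length in pixels, B the baseline, and d the disparity. A small disparity error then produces a depth error that grows roughly with the square of the depth. The sketch below is only an illustration of that textbook relation, not the method from the papers; the rig parameters (700 px focal length, 0.12 m baseline, 0.25 px matching error) are hypothetical values chosen for the example.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth from triangulation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth error: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical stereo rig (assumed values, not from the project).
F_PX = 700.0          # focal length in pixels
BASELINE_M = 0.12     # distance between the two cameras in meters
DISP_ERR_PX = 0.25    # assumed disparity matching error in pixels

for z in (2.0, 10.0, 50.0):
    err = depth_error(F_PX, BASELINE_M, z, DISP_ERR_PX)
    print(f"depth {z:5.1f} m -> approximate error {err:.3f} m")
```

Because the error scales with Z squared, a 25-fold increase in distance inflates the error 625-fold for the same rig, which is the regime where monocular cues are expected to help most.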

Selected Papers: IJCAI'07, IJCV'07.
Research/Code/Data: here.

 

6-D wireless sourceless mouse

This device uses accelerometers and gyroscopes to estimate its 3-d location and 3-d orientation. It can be used, for example, to conveniently navigate a 3-d virtual world.
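
As a rough illustration of how inertial readings can be turned into a pose, the sketch below performs plain dead reckoning in a simplified planar (2-d) setting: the gyro rate is integrated once to get heading, and the body-frame acceleration is rotated into the world frame and integrated twice to get position. This is an assumption-laden toy, not the device's actual algorithm; it ignores gravity compensation, sensor bias, and drift, and the function name `dead_reckon` and the sample layout are hypothetical.

```python
import math


def dead_reckon(samples, dt):
    """Integrate (yaw_rate, ax_body, ay_body) samples into a planar pose.

    Returns (x, y, heading) after the last sample, starting at rest
    from the origin with zero heading.
    """
    x = y = vx = vy = heading = 0.0
    for yaw_rate, ax_b, ay_b in samples:
        heading += yaw_rate * dt                 # gyro rate -> orientation
        c, s = math.cos(heading), math.sin(heading)
        ax_w = c * ax_b - s * ay_b               # rotate body accel into world frame
        ay_w = s * ax_b + c * ay_b
        vx += ax_w * dt                          # acceleration -> velocity
        vy += ay_w * dt
        x += vx * dt                             # velocity -> position
        y += vy * dt
    return x, y, heading


# One second of constant 1 m/s^2 forward acceleration, no rotation.
pose = dead_reckon([(0.0, 1.0, 0.0)] * 100, dt=0.01)
print(pose)
```

Errors in this kind of double integration accumulate quickly, which is why practical devices need some form of bias handling or drift correction.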

Selected Papers: LNCS-KES'05.
Research page: here.
Video: wmv.

 

Noise tolerant Locally Linear Isomaps

Isomaps (for non-linear dimensionality reduction) suffer from short-circuiting, which occurs when the neighborhood distance is larger than the distance between the folds of the manifold. We proposed a new variant of the Isomap algorithm, based on local linear properties of manifolds, that is more robust to short-circuiting.
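
The short-circuiting problem itself (not the proposed fix) can be reproduced on a toy example. The sketch below samples a U-shaped curve whose two arms are 2 units apart, builds an epsilon-neighborhood graph as Isomap does, and measures the graph geodesic between the two ends of the curve. The point layout, the radius values, and the helper names are all assumptions made for this illustration.

```python
import heapq
import math


def neighborhood_graph(points, eps):
    """Connect every pair of points whose Euclidean distance is <= eps."""
    graph = {i: [] for i in range(len(points))}
    for i, p in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.dist(p, points[j])
            if d <= eps:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def geodesic(graph, source, target):
    """Shortest-path length in the graph (Dijkstra's algorithm)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


# U-shaped 1-d manifold in the plane: two arms of length 10, 2 units apart.
points = (
    [(float(i), 0.0) for i in range(11)]
    + [(10.0, 1.0)]
    + [(float(i), 2.0) for i in range(10, -1, -1)]
)
ends = (0, len(points) - 1)

for eps in (1.1, 2.5):
    length = geodesic(neighborhood_graph(points, eps), *ends)
    print(f"eps = {eps}: geodesic between the ends = {length:.1f}")
```

With a radius of 1.1 the path follows the curve (length 22.0); with a radius of 2.5, larger than the gap between the arms, a single edge jumps across the fold and the estimated geodesic collapses to 2.0.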

Selected Papers: LNCS-ICONIP'05.

Data-driven Robotics

The question of what data is available to learn from is at the heart of all learning algorithms. Often, even an inferior learning algorithm will outperform a superior one if it is given more data to learn from. We proposed a novel and practical solution to the dataset collection problem: we first use a green screen to rapidly collect data, and then use a probabilistic model to rapidly synthesize a much larger training set. We used this data to build reliable classifiers for our robots.
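
The general idea of green-screen data synthesis can be sketched as chroma keying followed by compositing onto varied backgrounds. The code below is a minimal toy version under stated assumptions: images are nested lists of RGB tuples, the green test and its `margin` threshold are hypothetical, and uniformly random background choice stands in for the probabilistic model, which is not described here.

```python
import random


def is_green_screen(pixel, margin=60):
    """Treat a pixel as background if green clearly dominates red and blue."""
    r, g, b = pixel
    return g - max(r, b) > margin


def composite(foreground, background):
    """Replace green-screen pixels of `foreground` with `background` pixels."""
    return [
        [bg if is_green_screen(fg) else fg for fg, bg in zip(fg_row, bg_row)]
        for fg_row, bg_row in zip(foreground, background)
    ]


def synthesize(foreground, backgrounds, n, seed=0):
    """Generate n training images by pasting the object onto random backgrounds."""
    rng = random.Random(seed)
    return [composite(foreground, rng.choice(backgrounds)) for _ in range(n)]


GREEN, OBJ = (0, 255, 0), (200, 30, 30)
DARK, LIGHT = (10, 10, 10), (240, 240, 240)

# A 2x2 "photo" of an object shot against a green screen.
photo = [[GREEN, OBJ],
         [OBJ, GREEN]]
backgrounds = [[[DARK] * 2] * 2, [[LIGHT] * 2] * 2]

samples = synthesize(photo, backgrounds, n=5)
print(len(samples), "synthetic images generated")
```

A fuller pipeline would also vary object scale, position, lighting, and blur, so that the synthesized set covers the conditions the robot is likely to encounter.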

Selected Papers: AAAI'08.
Research/Code/Data: here.
Video: coming soon.

 
 

Expression/Gesture Recognition

Infer facial expressions (e.g., smile, surprise, disgust) from an image of a face. This algorithm builds a sparse geometric model of the face and uses the parameters of that model as features in a learning algorithm. It is reasonably robust to partial occlusions. In a similar project, we use a web camera to track the hand and infer hand gestures for controlling a simple computer GUI. (No other equipment, such as gloves, was needed.)

Selected Papers: ICONIP'04.

 

Converting insulator polystyrene to moderately conducting polymer

We described a simple, bioinspired approach for the conversion of an insulator, polystyrene, to a moderately conducting polymer by introducing adenine nucleobases.

Selected Papers: Chemistry Letters'04.

 

ELifebelt: Wristworn device to save a person from electric shock

We developed an electronic device that, when worn like a wristwatch, protects the wearer from electric shocks. It continuously monitors skin potentials and wirelessly trips the power circuit to save the person's life.

Selected Papers: NPSC'04, Extended version.

 

Other projects

See the publications page for more, e.g., speech recognition.