Stanford Range Image Data


This dataset contains aligned image and range data (laser and stereo). It includes several types of examples: outdoor scenes (about 1000), indoor scenes (about 50), synthetic objects (about 7000), etc.

If you have questions about the data, contact asaxena at cs dot stanford dot edu.


Laser+Image data

Used in "Learning Depth from Single Monocular Images", NIPS 2005. In 33 cases, the images and depths are not aligned because of dynamic objects in the scene or errors in the collection process. Therefore, 425 images from datasets 1, 2 and 3 were used in the NIPS and IJCV papers.

Depth data format:
Untar the depths (e.g. using tar -xvzf filename.tar.gz).
The depths are stored as Matlab .mat files. In Matlab, run load filenamexxx.mat and then imagesc(depthMap) to view a depth map.
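For readers without Matlab, the same files can probably be read from Python. The sketch below is only an illustration: it assumes SciPy is installed, that the files use a .mat version that scipy.io.loadmat can read, and that the depth array is stored under the variable name depthMap (taken from the Matlab command above, not verified for every file). The helper names are hypothetical.

```python
def pick_depth_array(mat_contents, name="depthMap"):
    """Return the variable `name` from a dict of .mat contents.

    `name` defaults to "depthMap", which is an assumption based on the
    Matlab command quoted above. If it is missing, the error lists the
    variables that are actually present.
    """
    if name not in mat_contents:
        # loadmat adds bookkeeping keys such as "__header__"; hide them.
        available = sorted(k for k in mat_contents if not k.startswith("__"))
        raise KeyError(f"{name!r} not found; variables present: {available}")
    return mat_contents[name]


def load_depth_map(path, name="depthMap"):
    """Load one depth map from a .mat file (requires SciPy)."""
    from scipy.io import loadmat  # third-party; imported only when called

    return pick_depth_array(loadmat(path), name)
```

Calling load_depth_map("filenamexxx.mat") on an untarred file should return a 2-D array of depths; if the variable name differs, the KeyError message shows which names the file contains.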

Data alignment issues:
Depth data alignment: Depths and images cannot be completely aligned because of equipment noise. However, an affine transformation of the images/depths would help, as would converting the depths from spherical to planar coordinates (more info/code later). To a first approximation, simply cropping and scaling the depth maps works reasonably well.
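The first-approximation alignment (crop, then scale) could be sketched as follows. This is not the authors' alignment code: the crop margins are hypothetical values that would have to be chosen by inspecting the data, and nearest-neighbor resampling is used here only because it is the simplest way to scale a depth map without blending depths across object boundaries.

```python
def crop(depth, top, bottom, left, right):
    """Drop `top`/`bottom` rows and `left`/`right` columns from the borders."""
    rows = depth[top:len(depth) - bottom]
    return [list(row[left:len(row) - right]) for row in rows]


def resize_nearest(depth, out_h, out_w):
    """Scale a 2-D depth map to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(depth), len(depth[0])
    return [
        [depth[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Toy example: a 4x4 depth map with a one-cell border to discard.
depth = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
aligned = resize_nearest(crop(depth, 1, 1, 1, 1), 4, 4)
```

In practice out_h and out_w would be the image resolution (or the resolution at which depths are predicted), so that each image pixel or patch has a corresponding depth value.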

Any report or publication using this data should cite its use as:
Learning Depth from Single Monocular Images, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. NIPS 2005.
3-D Depth Reconstruction from a Single Still Image, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In IJCV 2007.

Stereo+Laser+Image Data

Image+LaserDepth+Stereo data

The depths here are raw logs from the laser scanner, in the following ASCII format. Each row represents one vertical scan and contains, in order:
"PTLASER": marks the row as coming from the laser (true for every row).
Time-stamp, e.g. "1130540406.855020" (not needed).
Panning angle, e.g. "39.874818": needed to construct the 2-D map from the vertical scans.
Tilt angle, e.g. "0.000000" (not used).
Number of depth readings in the row, fixed at 180.
The next 180 numbers: the actual depth readings, in meters, for that vertical column.
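A log in this format can be parsed with a few lines of Python. The sketch below assumes whitespace-separated fields in the order described above; the names LaserScan, parse_scan_line and build_depth_map are hypothetical. It also assumes that sorting the scans by increasing panning angle gives the column order of the 2-D map; the direction may need to be flipped, and whether the first reading of a scan is the top or the bottom of the column is not stated here.

```python
from typing import Iterable, List, NamedTuple


class LaserScan(NamedTuple):
    timestamp: float
    pan: float           # panning angle, as logged
    tilt: float          # tilt angle, as logged (not used)
    depths: List[float]  # depth readings in meters


def parse_scan_line(line: str) -> LaserScan:
    """Parse one 'PTLASER ...' row into a LaserScan."""
    tokens = line.split()
    if not tokens or tokens[0] != "PTLASER":
        raise ValueError("not a PTLASER row")
    timestamp, pan, tilt = (float(t) for t in tokens[1:4])
    count = int(tokens[4])  # number of readings, 180 in this dataset
    depths = [float(t) for t in tokens[5:5 + count]]
    if len(depths) != count:
        raise ValueError(f"expected {count} readings, found {len(depths)}")
    return LaserScan(timestamp, pan, tilt, depths)


def build_depth_map(lines: Iterable[str]) -> List[List[float]]:
    """Stack vertical scans into a 2-D map: one column per scan.

    Columns are ordered by increasing panning angle (an assumption);
    rows follow the order of readings within each scan.
    """
    scans = sorted(
        (parse_scan_line(l) for l in lines if l.strip()),
        key=lambda s: s.pan,
    )
    if not scans:
        return []
    n_rows = len(scans[0].depths)
    return [[s.depths[r] for s in scans] for r in range(n_rows)]
```

For a real log file, build_depth_map(open(path)) would return a list of 180 rows, with one column per vertical scan in the file.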

Use of this data should cite:
Depth Estimation using Monocular and Stereo Cues, Ashutosh Saxena, Jamie Schulte, Andrew Y. Ng. In IJCAI 2007.
3-D Depth Reconstruction from a Single Still Image, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In IJCV 2007.

Car Driving 1-d depth data

1-D depth data (useful for robotic applications)
Use of this data should cite:
High Speed Obstacle Avoidance using Monocular Vision and Reinforcement Learning, Jeff Michels, Ashutosh Saxena, Andrew Y. Ng. In ICML 2005.
3-D Depth Reconstruction from a Single Still Image, Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng. In IJCV 2007.

Depth+Image for synthetic objects

Available here.
Use of this data should cite:
Robotic Grasping of Novel Objects, Ashutosh Saxena, Justin Driemeyer, Justin Kearns, Andrew Ng. In NIPS 19, 2006.
Learning to Grasp Novel Objects using Vision, Ashutosh Saxena, Justin Driemeyer, Justin Kearns, Chioma Osondu, Andrew Y. Ng. 10th International Symposium of Experimental Robotics (ISER), 2006.

Depth+Image for indoors/objects

External links to:
USF Range Image Database
Middlebury data

Note: This data is free to use, as long as you cite its use in any report, presentation, code, etc. Further, no permissions have been obtained from people who might be present in these images; therefore, by downloading these files, you agree not to hold the authors or Stanford University liable for any damage, lawsuits, or other loss resulting from the possession or use of the files.