(under preparation)
Summary:
-------
Each datum is a set of several images along with the camera location
and orientation for that datum.
(a) synthetic image: without any prefix, e.g. eraser0000.png
(b) depthmap (range image): The depth_ image gives the depth at each
pixel much like a standard grayscale depth image, except that each
of the three color channels encodes 1/3 of the depth range.
(c) grasping point/orientation: The graspOutVec_ and the graspUpVec_ images
provide the object orientation at the grasping location, and hence the 6-DOF
grasping point. (See "More details" below for the convention on grasp/object
orientation.)
(d) Other files give information such as camera location.
Code to get the data into Matlab
--------------------------------
Use getCorrectedValuesStanfordData.m, which loads the data into Matlab using
the following function call:
[image, depth, graspMaskImage, graspOutImage, graspUpImage] =
getCorrectedValuesStanfordData(imageFilenumber, objectType);
where,
imageFilenumber = 1, 2, 3, ...,10, 20, ..., 1200, etc.
objectType = 'stapler', 'thickpencil', 'mug', 'twoMug', 'tea', 'twoTeaCups', 'martini', 'cerealbowl', or 'eraser'.
Other files:
-----------
Each object contains a "datafile" which provides the location
and orientation of the camera and grasp point in the ray tracer
coordinate system. The datafile with the "Copy of" prefix is formatted
to be easier for a person to read; the one without the prefix contains
just the numbers, to make it easier to read into a program.
Additionally, there is a .txt file for each datum for many of the
objects. This contains the endpoints of the Hough line segments
extracted by OpenCV.
======================================================================
More details: taken care of by getCorrectedValuesStanfordData.m
======================================================================
--------------------
Details on depthmap:
--------------------
Let minDepth be the minimum
depth (that is, any pixel closer than this amount is truncated to
have this minimum depth value), and similarly let maxDepth be the
maximum depth. Then, the red color channel ranges from 0->1 as the
pixel depth ranges from
minDepth -> minDepth + (maxDepth-minDepth)/3,
the green from 0->1 as depth ranges from
minDepth + (maxDepth-minDepth)/3 -> minDepth + (maxDepth-minDepth)*2/3,
and the blue from 0->1 as depth ranges from
minDepth + (maxDepth-minDepth)*2/3 -> maxDepth.
Thus, if minDepth = 1 and maxDepth = 4, the red channel is a depth
map from 1->2, the green a depth map from 2->3 and the blue a depth
map from 3->4.
Although complicated to explain, this format allows greater
precision than a standard grayscale format, and makes converting to a
standard grayscale depth image simple: just add the three channels
together and divide by 3.
(Done by the matlab .m file provided)
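The conversion above can be sketched in Python (an illustrative sketch only;
the provided Matlab .m file is the reference implementation). Channel values
are assumed to be normalized to [0, 1]:

```python
def decode_depth(r, g, b, min_depth, max_depth):
    """Recover the depth at one pixel from the three-channel encoding.

    r, g, b: color-channel values in [0, 1]. Each channel covers one
    third of [min_depth, max_depth], staying at 0 for depths before its
    band and saturating at 1 for depths beyond it, so averaging the
    three channels recovers the normalized depth directly.
    """
    normalized = (r + g + b) / 3.0                 # in [0, 1]
    return min_depth + normalized * (max_depth - min_depth)

# Worked example from the text: minDepth = 1, maxDepth = 4.
# A pixel at depth 2.5 saturates red (1.0), puts green halfway (0.5),
# and leaves blue at 0.
print(decode_depth(1.0, 0.5, 0.0, 1, 4))  # -> 2.5
```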
For each object in the current dataset, minDepth = 0.
However, maxDepth is different for each object, as shown below:
Object          maxDepth
cerealbowl 40
thickpencil 50
martini 80
teaTwo 55
mug 90
eraser 40
Objects not listed above were not used in experiments requiring the
depth image, thus none was generated.
-------------------------------------------------
Details on object orientation and grasping point:
-------------------------------------------------
The graspOutVec_ and the graspUpVec_ images provide object orientation
ground truth labels. At each pixel in these images, if that pixel is
a grasping point, the unit out vector is given in the coordinate system
of the ray tracer graphics generation program. The red channel gives
the X value, the green the Y value, and the blue the Z value in the
following format:
coordSystemValue = 2*colorValue - 1.
This is done because the value of each dimension of the unit vector
ranges between -1 and 1, whereas the color value must range between 0 and 1.
(Done by the matlab .m file provided)
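The per-channel mapping can be sketched as follows (illustrative Python;
the Matlab script is the reference implementation):

```python
def decode_unit_vector(r, g, b):
    """Map the red/green/blue channel values (each in [0, 1]) back to
    the X/Y/Z components of the unit vector (each in [-1, 1]) using
    coordSystemValue = 2*colorValue - 1."""
    return (2.0 * r - 1.0, 2.0 * g - 1.0, 2.0 * b - 1.0)

# A mid-gray channel (0.5) decodes to a zero component; a fully
# saturated channel (1.0) decodes to +1, and a black channel to -1.
print(decode_unit_vector(0.5, 1.0, 0.0))  # -> (0.0, 1.0, -1.0)
```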
This system of recording orientation using two perpendicular vectors
derives from the original use of the data in robotic grasping.
Thus, when grasping an object, out is the vector pointing in the same
direction as the gripper end effector, and up is perpendicular to it
and points up.
The graspPriorityWidth_ image gives a ground truth label for
each pixel indicating whether or not it is a grasping point, with the red
channel identical for all pixels considered to be part of the same
grasping point. Thus, on a coffee mug the handle grasp will have one
value in the red channel, whereas grasps on the rim will have another.
The other two channels can be ignored.
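Grouping the labeled pixels by red value can be sketched as below
(illustrative Python; note that treating a red value of 0 as "not a
grasping point" is an assumption of this sketch, not something stated
in the dataset notes):

```python
def group_grasp_pixels(red_channel):
    """Group pixels into grasping points by their red-channel value.

    red_channel: 2-D list of red values. Pixels sharing a red value
    belong to the same grasping point (e.g. a mug handle vs. its rim).
    ASSUMPTION: a red value of 0 marks a non-grasp pixel.
    """
    groups = {}
    for y, row in enumerate(red_channel):
        for x, red in enumerate(row):
            if red > 0:
                groups.setdefault(red, []).append((x, y))
    return groups

# Two grasping points in a tiny 2x2 image: one labeled 0.2, one 0.5.
print(group_grasp_pixels([[0.0, 0.2], [0.2, 0.5]]))
```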
All synthetic objects except the stapler will need to be gamma
shifted by 2.19 to recover the data correctly (a peculiarity of the
ray tracer). That is, considering each color channel as ranging from
0->1, desiredColorChannelValue = imageColorChannelValue^(2.19).
(Done by the matlab .m file provided)
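The gamma shift amounts to a per-channel power law, sketched here in
Python (the provided .m file already applies it):

```python
def gamma_correct(channel_value, gamma=2.19):
    """Undo the ray tracer's gamma encoding for every synthetic object
    except the stapler: desired = observed ** 2.19, with channel
    values taken to range over [0, 1]."""
    return channel_value ** gamma

# The endpoints of the range are fixed points of the correction;
# intermediate values are pushed toward 0.
print(gamma_correct(0.0), gamma_correct(1.0))  # -> 0.0 1.0
```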
-----------------------------------
Download data at:
http://ai.stanford.edu/~asaxena/learninggrasp/data/