(under preparation)

Summary:
--------
Each datum is a set of several images along with the camera location and
orientation for that datum.

(a) synthetic image: without any prefix, e.g. eraser0000.png
(b) depthmap (range image): The depth_ image gives the depth at each pixel
    much like a standard grayscale depth image, except that each of the three
    color channels is used to encode 1/3 of the depth range.
(c) grasping point/orientation: The graspOutVec_ and the graspUpVec_ images
    provide the object orientation at the grasping location, and hence the
    6-dof grasping point. (See "More details" below for the convention on
    grasp/object orientation.)
(d) Other files give information such as the camera location.

Code to get the data into Matlab
--------------------------------
Use getCorrectedValuesStanfordData.m, which loads the data into Matlab using
the following function call:

  [image, depth, graspMaskImage, graspOutImage, graspUpImage] =
      getCorrectedValuesStanfordData(imageFilenumber, objectType);

where
  imageFilenumber = 1, 2, 3, ..., 10, 20, ..., 1200, etc.
  objectType      = 'stapler', 'thickpencil', 'mug', 'twoMug', 'tea',
                    'twoTeaCups', 'martini', 'cerealbowl', or 'eraser'.

Other files:
------------
For each object there is a "datafile" which provides the location and
orientation of the camera and grasp point in the ray tracer coordinate
system. The datafile with the "Copy of" prefix is formatted to be easier for
a person to read; the one without the prefix contains just the numbers, to
make it easier to read into a program.

Additionally, for many of the objects there is a .txt file for each datum.
It contains the endpoints of the Hough line segments extracted by OpenCV.

======================================================================
More details: taken care of by getCorrectedValuesStanfordData.m
======================================================================

--------------------
Details on depthmap:
--------------------
Let minDepth be the minimum depth (that is, any pixel closer than this amount
is truncated to have this minimum depth value); similarly, let maxDepth be
the maximum depth. Then:

  red:   ranges 0->1 as the pixel depth ranges from minDepth
         to minDepth + (maxDepth-minDepth)/3
  green: ranges 0->1 as the depth ranges from minDepth + (maxDepth-minDepth)/3
         to minDepth + (maxDepth-minDepth)*2/3
  blue:  ranges 0->1 as the depth ranges from minDepth + (maxDepth-minDepth)*2/3
         to maxDepth

Thus, if minDepth = 1 and maxDepth = 4, the red channel is a depth map from
1->2, the green a depth map from 2->3, and the blue a depth map from 3->4.
Although complicated to explain, this format allows greater precision than a
standard grayscale format, and converting to a standard grayscale-type depth
image is simple: just add the three channels together and divide by 3.
(Done by the Matlab .m file provided.)

For each object in the current dataset, minDepthValue = 0. However,
maxDepthValue is different for each object, as shown below:

  Object         maxDepthValue
  cerealbowl     40
  thickpencil    50
  martini        80
  teaTwo         55
  mug            90
  eraser         40

Objects not listed above were not used in experiments requiring the depth
image, so no depth image was generated for them.
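For readers decoding the raw depth_ PNGs directly rather than through the
provided .m file, the following is a minimal Matlab sketch of the conversion
described above. The filename and the assumption of 8-bit PNGs are
illustrative only; getCorrectedValuesStanfordData.m already performs this
step.

  % Minimal sketch, not the provided code: rebuild a grayscale depth map
  % from a depth_ image.  Filename is hypothetical; maxDepth is the
  % per-object value from the table above (here, 40 for 'eraser').
  maxDepth = 40;
  rgb   = double(imread('depth_eraser0000.png')) / 255;   % assumes 8-bit PNG -> 0..1
  gray  = (rgb(:,:,1) + rgb(:,:,2) + rgb(:,:,3)) / 3;     % add the three channels, divide by 3
  depth = gray * maxDepth;                                 % minDepthValue = 0 for this dataset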
-------------------------------------------------
Details on object orientation and grasping point:
-------------------------------------------------
The graspOutVec_ and the graspUpVec_ images provide the object orientation
ground truth labels. At each pixel in these images, if that pixel is a
grasping point, the unit "out" vector is given in the coordinate system of
the ray tracer graphics generation program. The red channel gives the X
value, the green the Y value, and the blue the Z value, in the following
format:

  coordSystemValue = 2*colorValue - 1

This is done because each dimension of the unit vector ranges between -1 and
1, whereas the color value must range between 0 and 1. (Done by the Matlab
.m file provided.)

This system of recording orientation using two perpendicular vectors derives
from the original use of the data in robotic grasping: when grasping an
object, "out" is the vector pointing in the same direction as the gripper end
effector, and "up" is perpendicular to it and points up.

The graspPriorityWidth_ image gives a ground truth label for each pixel
indicating whether or not it is a grasping point, with the red channel value
identical for all pixels considered part of the same grasping point. Thus, on
a coffee mug the handle grasp will have one value in the red channel, whereas
grasps on the rim will have another. The other two channels can be ignored.

All synthetic objects except the stapler need to be gamma shifted by 2.19 to
recover the data correctly (a peculiarity of the ray tracer). That is,
considering each color channel as ranging from 0->1:

  desiredColorChannelValue = imageColorChannelValue ^ 2.19

(Done by the Matlab .m file provided.)

-----------------------------------
Download data at:
http://ai.stanford.edu/~asaxena/learninggrasp/data/
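For reference, a minimal Matlab sketch of the orientation decoding and gamma
shift described in the section above, for readers working from the raw PNGs
rather than the provided .m file. The filenames are hypothetical, and
getCorrectedValuesStanfordData.m already performs both steps.

  % Minimal sketch, not the provided code.  Decode the unit "out" vector at
  % each pixel, and undo the ray tracer's gamma shift on the synthetic image.
  % Filenames are hypothetical; assumes 8-bit PNGs.
  rgb  = double(imread('graspOutVec_eraser0000.png')) / 255;
  outX = 2*rgb(:,:,1) - 1;   % red   -> X component of the unit out vector
  outY = 2*rgb(:,:,2) - 1;   % green -> Y component
  outZ = 2*rgb(:,:,3) - 1;   % blue  -> Z component

  img  = double(imread('eraser0000.png')) / 255;   % synthetic image (no prefix)
  img  = img .^ 2.19;        % gamma shift of 2.19 (all objects except the stapler)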