Tuesday, April 6, 2010

3D Shape Scanning with a Time-of-Flight Camera

We recently developed a technique for digitizing 3D objects with a Time-of-Flight camera. One simply moves the camera around the object; the scans are then automatically aggregated and fused into a single 3D model.
To the best of our knowledge, this is the first time a Time-of-Flight camera has been used to scan objects (previously only environments had been scanned). At first, object scanning with a ToF camera seems out of reach due to severe noise, nonlinear measurement errors and low resolution. We overcome these obstacles with LidarBoost, our previously published resolution enhancement algorithm, which aggregates some ten consecutive frames into a single higher-resolution one. The superresolved scans are then aligned using a novel algorithm that takes into account the systematic bias of Time-of-Flight cameras.
We demonstrate our algorithm on five scenes, four of them recorded by rotating the object on a turntable. The fifth scene was recorded by moving the camera freely around the object and yields results similar to the turntable setups, demonstrating the robustness of our algorithm. We will present this work at CVPR 2010; the pre-print and the additional material are already available. We plan to release the results of our method here soon.

Cite as:
title={3D Shape Scanning with a Time-of-Flight Camera}, author={Cui, Yan and Schuon, Sebastian and Chan, Derek and Thrun, Sebastian and Theobalt, Christian},
journal={In Proc. of IEEE CVPR 2010}, year={2010}}

(This was joint work with Yan Cui, Derek Chan, Sebastian Thrun and Christian Theobalt)

Wednesday, April 29, 2009

Superresolution in 3D - LidarBoost

Thinking along the lines of last year's CVPR contribution, we developed an algorithm, LidarBoost, that uses multiple shots of the same scene to significantly enhance the quality of the scene's 3D model. Based on a newly designed smoothness prior, we recover fine details while keeping the noise level down. To quantify the improvement, we not only show data captured with Swissranger SR3000 cameras, but also work on synthetic data, where the gain in resolution is demonstrated both visually and numerically. Download the pre-print along with some additional material here. Here's the poster we presented at CVPR 2009 and our demo video:

Cite as:

title={LidarBoost: Depth Superresolution for ToF 3D Shape Scanning}, author={Schuon, Sebastian and Theobalt, Christian and Davis, James and Thrun, Sebastian},
journal={In Proc. of IEEE CVPR 2009}, year={2009}}

(This was joint work with Christian Theobalt, James Davis and Sebastian Thrun)

Wednesday, April 15, 2009

Motion Deblurring (revisited)

The previous work on motion deblurring, namely comparing several methods on real captured images with well-known properties, has recently been accepted as a journal paper. Even though significant progress has been made in the field since the paper was written, it contains some good reference data that others might be interested in working with. The data are available on the Motion Deblurring Page.

Cite as:
title={Comparison of Motion Deblur Algorithms and Real World Deployment},
author={Schuon, Sebastian and Diepold, Klaus},
journal={Acta Astronautica},

Tuesday, November 11, 2008

Truly Incremental Locally Linear Embedding

Locally Linear Embedding (LLE), proposed by Saul and Roweis, is an algorithm for nonlinear dimensionality reduction belonging to the family of (unsupervised) manifold learning techniques. Manifold learning is particularly interesting because of recent findings in neuroscience, where it is believed to resemble the human brain's learning process. In The Manifold Ways of Perception, Seung et al. describe this new hypothesis, and LLE is suggested as one possible algorithm.
Unfortunately, no accurate incremental formulation of LLE has been available to date. Such a formulation is desirable if one is to use LLE on a robot, where data arrives continuously (e.g. from a camera) and the belief has to be updated quickly and regularly. Our reformulation of the algorithm makes use of properties of eigensolvers and achieves up to a 100x speedup.
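For reference, the batch algorithm whose eigenproblem an incremental formulation avoids recomputing from scratch can be sketched in a few lines of NumPy. This is a toy implementation of standard LLE following Roweis and Saul, not our incremental variant; the neighbor count and regularization are illustrative:

```python
import numpy as np

def lle(X, n_neighbors=10, n_components=2, reg=1e-3):
    """Standard (batch) Locally Linear Embedding.

    X: (n_samples, n_features) data matrix.
    Returns the (n_samples, n_components) embedding.
    """
    n = X.shape[0]
    # Step 1: find the k nearest neighbors of every point (brute force).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    knn = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: solve for weights that reconstruct each point
    # as a linear combination of its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[knn[i]] - X[i]                          # center neighbors on x_i
        C = Z @ Z.T                                   # local covariance
        C += reg * np.trace(C) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, knn[i]] = w / w.sum()                    # weights sum to one

    # Step 3: embedding = bottom eigenvectors of M = (I - W)^T (I - W),
    # discarding the constant eigenvector. This full eigendecomposition
    # is the expensive step an incremental formulation avoids redoing.
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1]

# Unroll a noisy 1-D curve embedded in 3-D down to one dimension.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 3, 200))
X = np.c_[np.cos(t), np.sin(t), t] + 0.01 * rng.normal(size=(200, 3))
Y = lle(X, n_neighbors=8, n_components=1)
print(Y.shape)  # (200, 1)
```

In the batch version, every new sample forces a fresh O(n^3) eigendecomposition of M; exploiting how M changes under a single added row is what makes the incremental reformulation fast.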

Download the paper here.
(This was joint work with Marko Durkovic, Klaus Diepold, Jürgen Scheurle and Stefan Markward)
Cite as:
title={Truly Incremental Locally Linear Embedding},
author={Schuon, Sebastian and Durkovic, Marko and Diepold, Klaus and Scheurle, Jürgen and Markward, Stefan},
journal={1st International Workshop on Cognition for Technical Systems},

Saturday, September 6, 2008

Superresolution For ToF-Depth Cameras

Multiple manufacturers have started shipping these nice, shiny Time-of-Flight (ToF) depth cameras. They look like webcams, but they actually return depth measurements, and they return them at 30fps and full resolution. Unfortunately, the resolution is also quite webcam-like: the best model achieves 320x240, and most cameras' resolutions are well below that.
We looked into increasing the resolution by applying a technique called superresolution, well known from color cameras. The key idea is to take multiple, slightly displaced shots of a scene and fuse them back together into one high-resolution recording. Our approach not only increases resolution but also reduces noise, another issue with depth cameras. We presented the idea recently at CVPR'08, so check out the paper or see the poster:
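The fuse-multiple-displaced-shots idea can be sketched as a toy shift-and-add scheme. This is a simplification for illustration, not the method from the paper, and the subpixel shifts are assumed to be known (in practice they come from registration):

```python
import numpy as np

def shift_and_add(frames, shifts, factor):
    """Toy shift-and-add superresolution.

    frames: list of (h, w) low-res images of the same scene.
    shifts: per-frame (dy, dx) subpixel displacements in low-res pixels.
    factor: integer upsampling factor.
    """
    h, w = frames[0].shape
    H, W = h * factor, w * factor
    acc = np.zeros((H, W))
    cnt = np.zeros((H, W))
    for img, (dy, dx) in zip(frames, shifts):
        # Map each low-res sample to its position on the fine grid.
        ys = (np.arange(h)[:, None] + dy) * factor
        xs = (np.arange(w)[None, :] + dx) * factor
        yi = np.clip(np.round(ys).astype(int), 0, H - 1)
        xi = np.clip(np.round(xs).astype(int), 0, W - 1)
        np.add.at(acc, (yi, xi), img)   # accumulate samples per fine pixel
        np.add.at(cnt, (yi, xi), 1)
    cnt[cnt == 0] = 1                   # leave unobserved fine pixels at zero
    return acc / cnt                    # averaging also suppresses noise
```

Averaging several independent measurements per output pixel is also why this kind of fusion reduces noise, which matters even more for depth data than for color.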

(This was joint work with Christian Theobalt, James Davis and Sebastian Thrun)

Cite as:
title={High-quality Scanning using Time-Of-Flight Depth Superresolution},
author={Schuon, Sebastian and Theobalt, Christian and Davis, James and Thrun, Sebastian},
journal={CVPR Workshop on Time-of-Flight Computer Vision 2008},

Monday, April 7, 2008

Human Tetris

Inspired by the videos of a Japanese TV show called Human Tetris (available via YouTube), we (Martin Davidsson and I) created a computer game version, so everyone around the world could join the fun! To see what the game is like, watch this demo movie:

For those of you who are interested in the technology behind it: we use a webcam and adaptive background subtraction to figure out which pixels are part of the silhouette and which are not. This worked astonishingly well, except for colors very similar to the background. If you want to play a round yourself, the code is available on Google Code, or if you want to save yourself the trouble of compiling, contact me for binaries!
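The adaptive background subtraction can be sketched as a running-average background model with a per-pixel threshold. This is a minimal version of the general scheme; the update rate and threshold are illustrative, not the values from our game:

```python
import numpy as np

def make_segmenter(alpha=0.05, thresh=30.0):
    """Silhouette segmenter: running-average background + thresholding."""
    state = {"bg": None}

    def segment(frame):
        frame = frame.astype(float)
        if state["bg"] is None:          # first frame initializes the model
            state["bg"] = frame.copy()
        # Foreground = pixels that differ strongly from the background.
        mask = np.abs(frame - state["bg"]) > thresh
        # Slowly adapt the background where no foreground was detected,
        # so lighting drift is absorbed but the player is not.
        state["bg"] = np.where(
            mask, state["bg"], (1 - alpha) * state["bg"] + alpha * frame
        )
        return mask

    return segment
```

The threshold is also where the scheme's weakness shows: a player wearing colors close to the background falls below it and drops out of the silhouette.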

Sunday, April 6, 2008

Automated Photo Tagging in Facebook

The Web 2.0 community Facebook offers, among other features, the possibility to upload images to photo albums. In these pictures, people can be tagged, meaning their positions in the image are marked. Currently users have to do this manually, which is a cumbersome task.
We looked into whether this task could be performed by a computer. The presented automatic facial tagging system is split into three subsystems: obtaining image data from Facebook, detecting faces in the images, and recognizing the faces to match them to individuals. First, image data is extracted from Facebook by interfacing with the Facebook API. Second, the Viola-Jones algorithm is used for locating and detecting faces in the obtained images; an attempt is also made to filter false positives within the face set using LLE and Isomap. Finally, facial recognition (using Fisherfaces and SVMs) is performed on the resulting face set. This allows us to match faces to people, and therefore tag users in images on Facebook. The proposed system achieves a recognition accuracy of close to 40%, rendering such systems feasible for real-world usage.

Download project report (with Harry Robertson, Hao Zou)
Cite as:
title={Automated Photo Tagging in Facebook},
author={Schuon, Sebastian and Robertson, Harry and Zou, Hao},
journal={Stanford CS229 Fall 2007 Project Report},

Saturday, April 5, 2008

Head Motion Controlled Break Out

In an attempt to create new interfaces for computers, I experimented with webcams (as part of Designing Applications that See). In this instance, the paddle of the classic Breakout game can be moved by simply moving your head, which makes for an astonishingly simple way of controlling the game.
One can think of several techniques to track the user's head, but I found a very simple one that works fast (real-time performance and low CPU usage are major concerns for a game) and reliably: simply threshold the image for bright pixels and average their positions into a mean position, which is assumed to be the head. This position is by no means guaranteed to actually be the head, since other bright objects might be in the camera's view. But those are normally static, so the only way to change the mean position is by moving the (illuminated) head. How far the paddle moves for a given head movement may differ from background to background, but a human can adapt to that intuitively.
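The tracker described above amounts to only a few lines; here is a sketch with an illustrative brightness threshold:

```python
import numpy as np

def track_head(gray, thresh=200):
    """Bright-pixel centroid tracker: threshold a grayscale frame and
    average the coordinates of the bright pixels.
    Returns (y, x) or None when no pixel exceeds the threshold.
    """
    ys, xs = np.nonzero(gray > thresh)
    if len(ys) == 0:
        return None
    return ys.mean(), xs.mean()

# A single bright horizontal streak yields its center as the "head".
frame = np.zeros((50, 50))
frame[10, 20:31] = 255
print(track_head(frame))  # (10.0, 25.0)
```

The paddle position then just follows the x coordinate; static bright clutter shifts the centroid by a constant offset, which the player compensates for without noticing.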

To see the technique in action, see this video:

Machine Learning: Locally Linear Embedding

This work goes into the field of machine learning. So far, most learning techniques have required supervision and used linear models to represent the data. The learning algorithm evaluated in this work is unsupervised and non-linear in nature. It was first proposed in 2000, but had so far only been evaluated by computer science and mathematics researchers. In my work I focused on some implementation issues important to engineers, and furthermore proposed some new incremental formulations.

Download Report (Bachelor Thesis)

HDR-Imaging for Welding

For this project I exploited the idea that a single color filter (here a red filter) leads to different attenuation factors on the color channels of a Bayer pattern camera sensor. Given that we are only interested in the grayscale information of a scene, we can use this fact to capture impressive high dynamic range (HDR) images.

Original Scene

Resulting HDR Image

The image on the left shows the scene captured by a standard camera. The scene comprises both a bright light source and some text on the right. Using the suggested HDR technique, we can craft an image that contains both the detailed structure of the light source's filament and the text in readable form. Below you can see the four image channels of the raw Bayer image:
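The channel splitting and fusion described above can be sketched as follows. An RGGB Bayer layout is assumed, and the per-channel gains and the validity range are hand-picked placeholders, not calibrated values from the project:

```python
import numpy as np

def bayer_channels(raw):
    """Split a raw Bayer frame (RGGB layout assumed) into its four
    color channels, each at half resolution."""
    return {
        "R":  raw[0::2, 0::2],
        "G1": raw[0::2, 1::2],
        "G2": raw[1::2, 0::2],
        "B":  raw[1::2, 1::2],
    }

def fuse_hdr(channels, gains):
    """Merge the channels into one grayscale HDR image: rescale each
    channel by its attenuation factor and average, excluding saturated
    or underexposed samples.

    gains: dict channel name -> attenuation compensation factor.
    """
    acc = np.zeros(next(iter(channels.values())).shape)
    cnt = np.zeros_like(acc)
    for name, img in channels.items():
        valid = (img > 5) & (img < 250)    # usable exposure range
        acc[valid] += img[valid] * gains[name]
        cnt[valid] += 1
    cnt[cnt == 0] = 1
    return acc / cnt
```

Because the red filter attenuates each channel differently, the four half-resolution channels act like four exposures of the same scene taken at once, which is what makes a single-shot HDR image possible here.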

For more details on the project, see the Visible Welding Homepage.

Motion Deblurring

If a camera moves fast while taking a picture, motion blur is induced. Techniques exist to prevent this effect from occurring, such as moving the lens system or the CCD chip electro-mechanically. Another approach is to remove the motion blur after the image has been taken, using signal processing algorithms as a post-processing step. For more than 30 years, numerous researchers have developed theories and algorithms for this purpose, which work quite well when applied to artificially blurred images. But when one attempts to apply those techniques to real-world scenarios, they mostly fail miserably. In order to study why the known algorithms have problems deblurring naturally blurred images, we built an experimental setup that produces real blurred images with defined parameters in a controlled environment.
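As a minimal illustration of why artificially blurred results can look deceptively good: with an exactly known point spread function and no real sensor noise, even textbook Wiener deconvolution restores a synthetically blurred image almost perfectly. This is a generic sketch, not one of the algorithms compared in the paper:

```python
import numpy as np

def motion_psf(length, shape):
    """Horizontal motion-blur point spread function, padded to `shape`."""
    psf = np.zeros(shape)
    psf[0, :length] = 1.0 / length
    return np.roll(psf, -length // 2, axis=1)   # roughly center it

def wiener_deblur(blurred, psf, k=1e-3):
    """Classic Wiener deconvolution in the frequency domain.
    k approximates the noise-to-signal power ratio (tuned by hand here)."""
    H = np.fft.fft2(psf)
    G = np.fft.fft2(blurred)
    F = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft2(F))

# Artificially blur a test image with a known 9-pixel PSF and restore it.
rng = np.random.default_rng(1)
img = rng.uniform(size=(64, 64))
psf = motion_psf(9, img.shape)
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf)))
restored = wiener_deblur(blurred, psf, k=1e-4)
print(np.abs(restored - img).mean() < np.abs(blurred - img).mean())  # True
```

On real captures the PSF is unknown and only approximately shift-invariant, and sensor noise is not negligible, which is exactly the gap our controlled-blur setup was built to measure.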

For more details visit the separate Motion Deblurring Page.

Cite as:
title={Comparison of Motion Deblur Algorithms and Real World Deployment},
author={Schuon, Sebastian and Diepold, Klaus},
journal={57th International Astronautical Congress},

MOKE: Images of Micromagnets

To illustrate the magnetic field on surfaces, the Magneto-Optic Kerr Effect can be used. During my time at the Max Planck Institute of Microstructure Physics in Halle, I set up an experiment to record movies of the domain structure magnetization of certain materials.

The first video clip shows the process of establishing domain structures by exposing the target to a time-variant magnetic field. Here the target was a whisker. The luminance of the target corresponds directly to the magnetic field strength. The captured images have been overlaid in software with vectors of the magnetic flux direction.

In this second clip the target was a copper sheet. In contrast to the whisker, we observe a different domain pattern (a stacked one, compared to the Landau-Lifshitz structure before). Furthermore, defects in the target material lead to earlier domain formation.

The Project Report and a Poster are unfortunately available in German only.

Intelligent Weather Station

A legacy project, back from high school. Since that was in Germany, the original docs are in German; here is the project summary in English:

"My goal was a weather station that operates autonomously and records all measurements digitally. All sensors are connected via a 1-Wire network, so that a computer can process the readings. Based on these values, the software is to make decisions on its own and issue warnings when necessary.

The result of these considerations is a weather station with humidity, precipitation, temperature, air pressure, wind speed, wind direction, and visibility sensors. Additional sensors for monitoring the groundwater level, the state of the fire alarms, and the roof windows are currently in progress. Combining the sensor data allows the software to issue warnings to the user ('It is raining, please close the windows' or 'As a precaution, install pumps in the basement; the groundwater level is dangerously high'). These messages can reach the user via the Internet or via SMS. The software also offers a web interface for analyzing all sensor data, and the sensor data can likewise be exported in standard formats, e.g. to create climate diagrams."

Download Report (German only)


This blog supersedes my old research page, and I hope it will be more up to date than ever! But only time will tell.
For now, I'll copy over the old material and add a little new content, but expect to see more soon. Happy browsing!