Tao Wang
 Stanford University


Greetings! I am a PhD student (currently on leave) at the Stanford AI Lab (SAIL). My advisor is Prof. Andrew Ng. I have been working on machine learning and robotics research since I joined Stanford in September 2009. As an undergraduate at the National University of Singapore, I worked on miniature autonomous helicopters.


  • Master of Science in Computer Science, with Distinction in Research, Stanford University, 2012
  • Bachelor of Engineering (Electrical), First Class Honors, National University of Singapore, 2009

  • Research


  • Scene Text Recognition

  • Reading text from photographs is a challenging problem with a wide range of potential applications. Many recent methods have been proposed for building end-to-end scene text recognition systems, and most of them are based on hand-crafted features and cleverly engineered algorithms. Our approach is to design machine learning algorithms, specifically large-scale algorithms that learn features automatically from unlabeled data, and to construct highly effective classifiers for both detection and recognition that can be used in a high-accuracy end-to-end system. To gather enough training data for our system, a simple procedure that generates high-quality synthetic data was devised (see the example image on the right). We also created the SVHN Dataset as a new benchmark for house number recognition in natural scene images.

  • Tracking UAVs with Ground-Based Cameras

  • The project aims to build an automated system that replaces human observers by detecting and tracking airplanes in the vicinity of the protected UAV so as to avoid potential collisions. A high-resolution DSLR camera is mounted on a pan/tilt module to automatically search for airplanes miles away. To track the airplanes in real time, an efficient heuristic based on corner detection was developed to spot airplanes quickly in the high-resolution images captured by the camera. By estimating the velocity vector of each detected aircraft, the system is able to track multiple targets while continuing to search for other nearby aircraft. Field tests have shown that the system detects and tracks airplanes effectively in real time. The system provides an inexpensive and feasible way to automate safety surveillance during UAV test flights. This work was presented at the SAE AeroTech Conference 2010.
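
The role of the velocity estimate can be illustrated with a minimal sketch. It is not the project's code: it assumes a constant-velocity model in pan/tilt angles, and the names `Detection`, `estimate_velocity` and `predict_position`, as well as the units, are hypothetical. The idea is that once a target's velocity is known, the camera can leave it to search elsewhere and later return to the predicted position.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One sighting of a target (hypothetical units: seconds and degrees)."""
    t: float
    pan: float
    tilt: float


def estimate_velocity(first: Detection, second: Detection) -> tuple[float, float]:
    """Finite-difference angular velocity (deg/s) between two sightings."""
    dt = second.t - first.t
    if dt <= 0:
        raise ValueError("detections must be in increasing time order")
    return (second.pan - first.pan) / dt, (second.tilt - first.tilt) / dt


def predict_position(last: Detection, velocity: tuple[float, float], t: float) -> tuple[float, float]:
    """Constant-velocity extrapolation of where to point the camera at time t."""
    dt = t - last.t
    return last.pan + velocity[0] * dt, last.tilt + velocity[1] * dt


# Two sightings of the same target, two seconds apart.
a = Detection(t=0.0, pan=10.0, tilt=5.0)
b = Detection(t=2.0, pan=14.0, tilt=6.0)
v = estimate_velocity(a, b)

# Where to look three seconds after the last sighting.
print(predict_position(b, v, t=5.0))  # (20.0, 7.5)
```

A real tracker would also need to associate new detections with existing targets and smooth the velocity over more than two sightings; both are omitted here.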

  • UAV Formation Flight

  • This work considers the task of accurate in-air localization for multiple unmanned or autonomous aerial vehicles flying in close formation. Two low-cost, electric-powered, remote-control trainer aircraft with wingspans of approximately 2 meters are used. Our control software, running on an onboard x86 CPU, uses LQG control (an LQR controller coupled with an EKF state estimator) and a linearized state space model to control both aircraft to fly synchronized circles. In addition to its control system, the lead aircraft is outfitted with a known pattern of high-intensity LED lights. The trailing aircraft captures images of these LEDs with a camera and uses a recent computer vision algorithm to determine the relative position and orientation of the leading aircraft. The entire process is carried out in real time with both vehicles flying autonomously.
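
The LQG structure mentioned above (an LQR gain acting on a filtered state estimate) can be sketched on a toy problem. This is only an illustration under strong assumptions: a scalar linear system x[k+1] = a*x[k] + b*u[k] stands in for the aircraft model, all numeric values are made up, and because the toy model is linear the EKF reduces to an ordinary Kalman filter. The names `lqr_gain`, `KalmanFilter1D` and `run_lqg` are hypothetical.

```python
import random


def lqr_gain(a, b, q, r, iterations=500):
    """Iterate the scalar discrete-time Riccati recursion to a steady-state gain."""
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


class KalmanFilter1D:
    """Kalman filter for x[k+1] = a*x[k] + b*u[k] + w, with measurement z = x + v."""

    def __init__(self, a, b, process_var, meas_var, x0=0.0, p0=1.0):
        self.a, self.b = a, b
        self.process_var, self.meas_var = process_var, meas_var
        self.x, self.p = x0, p0

    def update(self, z):
        """Correct the estimate with a new measurement."""
        k = self.p / (self.p + self.meas_var)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k

    def predict(self, u):
        """Propagate the estimate through the model using the applied control."""
        self.x = self.a * self.x + self.b * u
        self.p = self.a * self.p * self.a + self.process_var


def run_lqg(steps=100, x0=5.0, process_std=0.0, meas_std=0.0, seed=0):
    """Regulate the true state toward zero using only noisy measurements."""
    a, b = 1.0, 0.1                      # made-up model parameters
    gain = lqr_gain(a, b, q=1.0, r=0.1)  # made-up cost weights
    kf = KalmanFilter1D(a, b, process_var=0.01, meas_var=0.04)
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        z = x + rng.gauss(0.0, meas_std)  # sensor reading
        kf.update(z)
        u = -gain * kf.x                  # LQR feedback on the estimate
        x = a * x + b * u + rng.gauss(0.0, process_std)
        kf.predict(u)
    return x


print(run_lqg(process_std=0.05, meas_std=0.2))
```

The controller never sees the true state, only the filter's estimate, which is the separation that LQG relies on. A multi-state aircraft model would use matrix versions of the same recursions.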

  • Indoor UAV (Undergrad Thesis Project at NUS)

  • A miniature autonomous helicopter system was designed and constructed as a test platform for indoor flight control and navigation. We adapted a coaxial radio-controlled toy helicopter into an autonomous aerial vehicle. The avionics system is based on PID control, inertial sensing and computer vision.
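
For readers unfamiliar with PID control, a minimal discrete-time sketch follows. It is not the thesis implementation: the `PID` class, the gains, the time step and the toy plant (height changes at the commanded climb rate) are all assumptions chosen for illustration.

```python
class PID:
    """Textbook discrete PID controller acting on the error between setpoint and measurement."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement):
        """Return the control output for one sample period."""
        error = setpoint - measurement
        self.integral += error * self.dt
        # No derivative action on the first sample, when no previous error exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def hold_height(target=1.0, steps=500, dt=0.02):
    """Toy height hold: the controller output is treated as a climb rate."""
    pid = PID(kp=4.0, ki=4.0, kd=0.1, dt=dt)  # made-up gains
    height = 0.0
    for _ in range(steps):
        climb_rate = pid.step(target, height)
        height += climb_rate * dt             # hypothetical plant: pure integrator
    return height


print(hold_height())
```

On a real vehicle the measurement would come from the inertial and vision sensors, and the gains would be tuned against the actual dynamics rather than this integrator model.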


  • Deep Learning with COTS HPC, Adam Coates, Brody Huval, Tao Wang, David J. Wu, Andrew Y. Ng and Bryan Catanzaro. ICML, 2013. (PDF)

  • End-to-End Text Recognition with Convolutional Neural Networks, Tao Wang, David J. Wu, Adam Coates and Andrew Y. Ng. Proceedings of the Twenty-First International Conference on Pattern Recognition (ICPR 2012) (PDF)
    Oral presentation slides
    code demo
    toy character datasets (consisting of cropped characters from ICDAR 2003)
    lineBboxes.tar (pre-computed line-level bounding boxes using our best detector)
    Synthetic data (used to augment our training set; we also used the English subset of the Chars74k dataset in our training set)

  • Reading Digits in Natural Images with Unsupervised Feature Learning, Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, Andrew Y. Ng. NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011 (PDF) (The SVHN Dataset)

  • Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning, Adam Coates, Blake Carpenter, Carl Case, Sanjeev Satheesh, Bipin Suresh, Tao Wang, David J. Wu, Andrew Y. Ng. ICDAR, 2011. (PDF) Best Student Paper Award

  • Camera Based Localization for Autonomous UAV Formation Flight, Zouhair Mahboubi, Zico Kolter, Tao Wang, Geoffrey Bower, Andrew Y. Ng. AIAA Infotech@Aerospace, 2011 (PDF) Best Student Paper Award

  • An Indoor Unmanned Coaxial Rotorcraft System with Vision Positioning, Fei Wang, Tao Wang, Ben M. Chen, Tong H. Lee. Proceedings of IEEE ICCA 2010 (PDF)

    Honors and Awards

    1. The Christofer Stephenson Memorial Award for Graduate Research (The Best CS Masters Research Report PDF), Stanford, 2013
    2. Best Student Paper Award, ICDAR 2011
    3. Best Student Paper Award, AIAA Infotech@Aerospace 2011
    4. Siebel Scholarship 2011, by Siebel Scholars Foundation
    5. IEEE Control Systems Chapter Prize (Best Control Engineering Final Year Project), National University of Singapore, 2009
    6. Motorola Scholarship 2007 and Motorola Scholarship 2008, by Motorola Singapore
    7. Micron Innovation Award 2007, by Micron Singapore



  • Autumn 2011, CS229 Machine Learning, Teaching Assistant
  • Summer 2011, CS121 Introduction to Artificial Intelligence, Teaching Assistant
  • Autumn 2010, CS229 Machine Learning, Teaching Assistant


    Email: twangcat(AT)stanford.edu
    Room 114, Gates Computer Science
    353 Serra Mall
    Stanford, CA 94305