
Thursday 9 a.m.–12:20 p.m.

A Gentle Introduction to Computer Vision

Katherine Scott, Anthony Oliver

Audience level:


Do you want to create a script to warp your photos, scrape your photo archive for images of cats, or create a dart turret that follows your face? This tutorial will show you how to do this and a whole lot more with computer vision. The tutorial is suitable for developers of all levels and is a great way for Python novices to explore the world of computer vision.


The goal of this class is to give students the tools they need to solve everyday problems with computer vision using Python. The intended audience is people who would like to manipulate and process images in a meaningful way but are unsure where to start. Examples of people who could benefit from this class include robotics students and their coaches, artists and designers who want to create interactive media, weekend hackers who want to add more interactivity to their projects, and web gurus who want to intelligently filter and manipulate user-generated images. The course will focus on the basics of computer vision and walk students through setting up a development environment, acquiring images, and then processing that data to achieve some objective. The course will present computer vision as a fun and interesting way to explore more mathematically challenging topics like machine learning, modeling and simulation, computer graphics, and interactivity. The course will also present IPython as a practical alternative to expensive numerical tools like MATLAB.

The course will be presented as a collaboration between two expert computer vision developers: Jan Erik Solem and Katherine Scott. Jan Erik is the author of “Programming Computer Vision with Python” and is currently a computer vision researcher at Apple. Katherine Scott is a co-author of “Practical Computer Vision with SimpleCV” and the lead developer of the SimpleCV library at SightMachine.

The course will start with a brief discussion of the computer vision and related tools available in the Python ecosystem and how they fit together. We will then show students how to acquire images from a variety of sources, including cameras, video files, IP security cameras, and the Microsoft Kinect. Next, we will walk students through manipulating images to generate a desired effect and processing an image to extract specific information. By the end of the tutorial, students will be able to find faces, colors, lines, circles, arbitrary shapes, template images, barcodes, image keypoints, text, and motion in an image. We will conclude with some introductory examples of machine learning using computer vision and some examples of basic 3D reconstruction.
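As a rough illustration of what "finding a color in an image" can mean, here is a minimal sketch that uses only the Python standard library. It does not use SimpleCV or any other library covered in the tutorial, and it works on a small hand-built image rather than a camera frame. The function name `find_color`, the sample image, and the tolerance value are all assumptions made for this example.

```python
# A tiny RGB "image": a list of rows, each row a list of (r, g, b) tuples.
# Hypothetical data for illustration; a real image would come from a
# camera or a file.
BLACK = (0, 0, 0)
RED = (255, 0, 0)

# 6 pixels wide, 5 pixels tall, with a 3x3 red square inside it.
image = [[BLACK for _ in range(6)] for _ in range(5)]
for y in range(1, 4):
    for x in range(2, 5):
        image[y][x] = RED


def find_color(img, target, tolerance=60):
    """Return the bounding box (x_min, y_min, x_max, y_max) of all pixels
    whose color lies within `tolerance` of `target`, measured as Euclidean
    distance in RGB space. Return None if no pixel matches."""
    matches = []
    for y, row in enumerate(img):
        for x, pixel in enumerate(row):
            dist = sum((a - b) ** 2 for a, b in zip(pixel, target)) ** 0.5
            if dist <= tolerance:
                matches.append((x, y))
    if not matches:
        return None
    xs = [x for x, _ in matches]
    ys = [y for _, y in matches]
    return (min(xs), min(ys), max(xs), max(ys))


box = find_color(image, RED)
print(box)
```

The same idea (measure each pixel's distance from a target color, keep the close ones, then describe the region they form) carries over to real images, where a vision library would typically do the per-pixel work far more efficiently.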

We will present all material for this course as an interactive tutorial. The lecturers will write code live and the students will follow along on their personal laptops. Each unit of the workshop will conclude with one or two walkthroughs of a basic example computer vision application. Students will be manipulating images and writing code within the first 15 minutes and will have written their first computer vision application by the end of the tutorial. This tutorial is geared to all levels of Python developers, from novice to expert. While we will assume you have a basic working knowledge of Python syntax and idioms (for example, we will not explain what a for loop is), we assume no prior experience in computer vision and will cover the basic theory of image data structures and algorithms. A basic familiarity with NumPy and IPython would be helpful but is not mandatory. Students should come to the tutorial with a laptop and, if possible, a USB web camera (a camera is not required). We will provide a live Ubuntu disk with a full development environment pre-configured.

Update: See updated tutorial preparation instructions at A Gentle Introduction To Computer Vision