Wednesday 1:20 p.m.–4:40 p.m.

Learn Python Through Public Data Hacking

David Beazley

Audience level:
Core Python (Language, Stdlib)


What's more fun than learning Python? Learning Python by hacking on public data! In this tutorial, you'll learn Python basics by reading files, scraping the web, building data structures, and analyzing real-world data. By the end, you will have set up your Python environment, installed some useful packages, and learned how to write simple programs that you can use to impress your friends.


This is a Python tutorial with a practical bent. The goal is not to cover the whole language, but to learn the basics through some interesting and engaging examples.

  1. Install Python and hack the transit system. In this first part, we'll install Python and get right down to business by hacking real-time transit feeds. We'll write a program that scrapes a web page, parses some XML, pulls out some GIS data, and puts it on a map. Find out when you can catch the next bus home.

  2. Know your city. In this example, we go exploring the data portals published by various cities (e.g., Chicago). We'll scrape some more web pages and download interesting datasets to CSV files. This might include information about rats, potholes, restaurant inspections, crimes, or anything else of notable interest. We'll talk about using Python to read and write files, how to build data structures using lists, tuples, and dictionaries, and how to use the data to perform simple kinds of analysis.

  3. Installing packages. Python has an amazing number of add-on packages available. We'll talk about modules, packages, and installation of third-party modules. The goal of this section is to have everyone expand their Python installation to include NumPy, matplotlib, IPython, and pandas.

  4. Boom! Bang! Wow! We'll conclude the tutorial with some slightly more advanced data processing examples that show what can be done with Python using just a few lines of code. This will probably include some plotting and more map-making (in the IPython Notebook).
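
To give a flavor of parts 1 and 2, here is a minimal sketch (not the tutorial's actual code) that uses only the standard library and assumes Python 3. The XML and CSV samples, their element and column names, and the helper functions bus_positions and count_by_column are all invented for illustration. A real transit feed or city data portal will have its own layout.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data, invented for illustration. A real transit feed
# or city data portal will use its own element and column names.
SAMPLE_XML = """<buses>
  <bus><id>1001</id><route>22</route><lat>41.98</lat><lon>-87.66</lon></bus>
  <bus><id>1002</id><route>22</route><lat>41.95</lat><lon>-87.65</lon></bus>
  <bus><id>1003</id><route>36</route><lat>41.91</lat><lon>-87.63</lon></bus>
</buses>"""

SAMPLE_CSV = """type,ward
pothole,47
rodent,44
pothole,44
"""


def bus_positions(xml_text, route):
    """Return (id, lat, lon) tuples for every bus on the given route."""
    root = ET.fromstring(xml_text)
    positions = []
    for bus in root.findall("bus"):
        if bus.findtext("route") == route:
            positions.append((bus.findtext("id"),
                              float(bus.findtext("lat")),
                              float(bus.findtext("lon"))))
    return positions


def count_by_column(csv_text, column):
    """Tally how often each value appears in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


if __name__ == "__main__":
    # A list of tuples: one (id, latitude, longitude) per bus on route 22.
    print(bus_positions(SAMPLE_XML, "22"))
    # A dictionary-like Counter mapping each request type to its count.
    print(count_by_column(SAMPLE_CSV, "type"))
```

In practice the text would come from a downloaded feed or a CSV file on disk rather than from strings embedded in the program. The parsing and tallying steps stay the same.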

Update: See updated tutorial preparation instructions at Learn Python Through Public Data Hacking