PyCon 2016 in Portland, OR

Saturday 1:30 p.m.–5 p.m.

IBM: Hands-on session: Developing analytic applications using Apache Spark™ and Python

David Taieb


Are you interested in building analytic applications using Python but don't know where to start? Have you ever wanted to build machine learning predictive models but thought you didn't have the required background? In this hands-on session, we'll show you how to use Apache Spark with interactive IPython Notebooks to build first-class applications covering a wide range of Spark capabilities.


Distributed computing frameworks like Apache Spark have seen meteoric adoption among developers and data scientists over the last few years. What sets Apache Spark apart (besides its blazing-fast speed) is its support for a wide range of languages such as Scala, Java, R, and of course, Python. In this tutorial, we'll cover in detail how to build world-class applications, written in Python, that combine the power of Spark with the rich set of data-centric services available on IBM Bluemix. Attendees will be able to follow the instructor as he builds end-to-end applications from publicly available GitHub repositories using interactive IPython Notebooks. The applications will cover a wide range of Spark capabilities, including Spark SQL, Spark Streaming, and Spark MLlib:

· Flight Predictor: a machine learning application that predicts whether a flight will be delayed, based on weather data provided by IBM Weather Service
· Insight discovery on a large set of Tweets related to car manufacturers, using IBM Insight for Twitter

Attendees of this hands-on tutorial should have basic experience working with the Python language. They should also understand the basic concepts of machine learning, although no prior experience is required.