Wednesday 9 a.m.–12:20 p.m.
Practical Graph/Network Analysis Made Simple
Have you ever wondered how data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. We will explore a bike sharing data set as a way to understand the kinds of problems that can be solved using graph analytics.
In this tutorial, I will show you how to use data to construct networks for analysis. The goal is to demystify graph analytics and mining and make them accessible to the general programmer. Using a toy data set as an anchor, we will go through:
* graph basics (nodes + edges, list and matrix representations),
* modelling problems as graphs,
* preprocessing data using Pandas,
* importing data using NetworkX,
* computing basic statistics of the network,
* generating visualizations using matplotlib,
* finding hubs, paths and clusters in the data,
* (if time permits) random graphs for statistical inference.

IPython notebooks and data files will be distributed beforehand on GitHub. As good pedagogical practice, we will have lots of guided hands-on time, plus about 30 minutes to 1 hour of unstructured "free hacking time" to explore a bike sharing data set (with suggested questions) in small groups of whatever size you choose. You will also share your IPython notebooks via GitHub. After the hacking time, we will showcase a select number of analyses.
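To give a flavor of the workflow listed above, here is a minimal sketch in Python. It is not taken from the tutorial notebooks: the station names, the column names (`from_station`, `to_station`, `trip_count`) and the tiny trip table are all made up for illustration, and degree centrality is just one simple way to look for hubs.

```python
import networkx as nx
import pandas as pd

# Hypothetical trip records: each row is one ride between two stations.
trips = pd.DataFrame(
    {
        "from_station": ["A", "A", "B", "C", "C", "D", "A"],
        "to_station": ["B", "C", "C", "D", "A", "E", "B"],
    }
)

# Preprocess with pandas: count the trips for each (origin, destination) pair.
edges = (
    trips.groupby(["from_station", "to_station"])
    .size()
    .reset_index(name="trip_count")
)

# Import into NetworkX as a directed graph; the count becomes an edge attribute.
G = nx.from_pandas_edgelist(
    edges,
    source="from_station",
    target="to_station",
    edge_attr="trip_count",
    create_using=nx.DiGraph,
)

# Basic statistics of the network.
num_nodes = G.number_of_nodes()
num_edges = G.number_of_edges()

# List representation: each station mapped to the stations it sends riders to.
adjacency_list = {node: sorted(G.successors(node)) for node in G.nodes}

# Matrix representation: rows and columns follow the sorted station names,
# and each entry holds the trip count (0 where there is no edge).
station_order = sorted(G.nodes)
adjacency_matrix = nx.to_numpy_array(G, nodelist=station_order, weight="trip_count")

# Hubs: the station with the highest degree centrality (in-degree plus out-degree).
centrality = nx.degree_centrality(G)
hub = max(centrality, key=centrality.get)

# Paths: the route with the fewest hops from station A to station E.
path = nx.shortest_path(G, source="A", target="E")

print(f"{num_nodes} stations, {num_edges} routes")
print("busiest station:", hub)
print("route from A to E:", path)
```

A directed graph is used here because a ride from one station to another is not the same as the return ride; an undirected `nx.Graph` would be the natural choice if direction did not matter for the question being asked.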
No handouts have been provided yet for this tutorial