# Graph Analysis from the Ground Up

Type:
Tutorial
Audience level:
Novice
Category:
Science
March 7th 1:20 p.m. – 4:40 p.m.

### Description

Graphs are a fundamental datatype - but typical developers don't get as much exposure to using and working with graphs as with other datatypes like tables and queues. This is a from-the-ground up working session; by the end, attendees should have the tools and experience to model and analyze problems with graphs.

# Summary

This tutorial is intended to bring somebody with Python experience but limited or no experience using graph-based algorithms to a place where they:

1. Understand the basics of graph theory and why it can be helpful;
2. Are familiar with the available tools for dealing with graphs;
3. Recognize how to model a problem in terms of a graph; and
4. Have a first hands-on experience applying the theory and the tools to solve an interesting real-world problem.

To do this, the tutorial is divided into four sections, each corresponding to one of the objectives above. Each portion will have a hands-on exercise pertaining to the exact subject, with part 4 as a crowning workshop bringing together various skills and points raised throughout the session; after having a few minutes to work on their own code and ask questions, the class as a whole will walk through a solution.

# Part 1: Graph Theory (50 mins)

• Theoretical background (15 mins): What are graphs? Where do they show up? Why do we care?
• Reasoning about graphs (35 mins): Nodes, edges, cycles, and traversals; operations on graphs.
• Practical exercise: Path operations and the filesystem tree

# Part 2: Graph Tools (35 mins)

• Representing graphs in Python (15 mins)
• Dict-based representation
• NetworkX, iGraph, etc.
• Reading and writing graphs (10 mins)
• In-memory representations
• On-disk representations
• Graph databases
• Visualizing graphs (10 mins)
• Matplotlib, Gephi, etc.
• Practical exercises:
• Converting the filesystem tree to use a graph representation
• Using the same representation to model a map

# Part 3: Graphs in Practice (35 mins)

• Recognizing word problems that suggest a graph-based solution (5 mins)
• Problem modeling (15 mins): What are your nodes? What are your edges?
• Performing computations on graphs (15 mins): Graph algorithms in code; Gremlin; flow-based programming
• Practical exercises:
• Build a graph representing a module dependency graph
• Answering questions by traversing the graph
• High-level visualization

# Part 4: Hands-on with Graphs (50 mins)

• Problem set: Analysis of Python-Dev
• Data cleaning (10 mins)
• Graph construction (10 mins)
• Centrality (10 mins)
• Clique analysis (10 mins)
• Visualization (10 mins)