# Graph Analysis from the Ground Up

- Type:
- Tutorial
- Audience level:
- Novice
- Category:
- Science

March 7th 1:20 p.m. – 4:40 p.m.

### Description

Graphs are a fundamental datatype - but typical developers don't get as much exposure to using and working with graphs as with other datatypes like tables and queues. This is a from-the-ground up working session; by the end, attendees should have the tools and experience to model and analyze problems with graphs.

### Abstract

# Summary

This tutorial is intended to bring somebody with Python experience but limited or no experience using graph-based algorithms to a place where they:

- Understand the basics of graph theory and why it can be helpful;
- Are familiar with the available tools for dealing with graphs;
- Recognize how to model a problem in terms of a graph; and
- Have a first hands-on experience applying the theory and the tools to solve an interesting real-world problem.

To do this, the tutorial is divided into four sections, each corresponding to one of the objectives above. Each portion will have a hands-on exercise pertaining to the exact subject, with part 4 as a crowning workshop bringing together various skills and points raised throughout the session; after having a few minutes to work on their own code and ask questions, the class as a whole will walk through a solution.

# Part 1: Graph Theory (50 mins)

- Theoretical background (15 mins): What are graphs? Where do they show up? Why do we care?
- Reasoning about graphs (35 mins): Nodes, edges, cycles, and traversals; operations on graphs.
- Practical exercise: Path operations and the filesystem tree

# Part 2: Graph Tools (35 mins)

- Representing graphs in Python (15 mins)
- Dict-based representation
- NetworkX, iGraph, etc.
- Reading and writing graphs (10 mins)
- In-memory representations
- On-disk representations
- Graph databases
- Visualizing graphs (10 mins)
- Matplotlib, Gephi, etc.
- Practical exercises:
- Converting the filesystem tree to use a graph representation
- Using the same representation to model a map

### 10 min break

# Part 3: Graphs in Practice (35 mins)

- Recognizing word problems that suggest a graph-based solution (5 mins)
- Problem modeling (15 mins): What are your nodes? What are your edges?
- Performing computations on graphs (15 mins): Graph algorithms in code; Gremlin; flow-based programming
- Practical exercises:
- Build a graph representing a module dependency graph
- Answering questions by traversing the graph
- High-level visualization

# Part 4: Hands-on with Graphs (50 mins)

- Problem set: Analysis of Python-Dev
- Data cleaning (10 mins)
- Graph construction (10 mins)
- Centrality (10 mins)
- Clique analysis (10 mins)
- Visualization (10 mins)