Social Network Analysis with Python

Maksim Tsvetovat

Type: Tutorial
Audience level: Intermediate
Category: Science

March 8th, 9 a.m. – 12:20 p.m.

Description

Social network data permeates our world -- yet we often don't know what to do with it. In this tutorial, I will introduce both the theory and the practice of Social Network Analysis (SNA): gathering, analyzing, and visualizing data using Python and other open-source tools. I will walk attendees through an entire project, from gathering and cleaning data to presenting results.

Abstract

SNA techniques are derived from sociological and social-psychological theories and take into account the whole network (or, in the case of very large networks such as Twitter, a large segment of the network). Thus, we may arrive at results that seem counter-intuitive: for example, that Justin Bieber (7.5 million followers) and Lady Gaga (7.2 million followers) have relatively little actual influence despite their celebrity status, while a middle-of-the-road blogger with 30K followers is able to generate tweets that "go viral" and result in millions of impressions.
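To make the follower-count-versus-reach point concrete, here is a minimal sketch on invented toy data. The account names, the `followers` dictionary, and the `reach` helper are all hypothetical, and the assumption that every account retweets everything it sees is a deliberate oversimplification, not a model of real Twitter behavior.

```python
from collections import deque

# Hypothetical toy data: account -> accounts that follow it.
# The "celebrity" has many followers, but none of them has followers of their own.
# The "blogger" has few followers, but each of them is followed by several others.
followers = {
    "celebrity": ["fan%d" % i for i in range(1, 9)],
    "blogger": ["hub1", "hub2", "hub3"],
    "hub1": ["r1", "r2", "r3", "r4"],
    "hub2": ["r5", "r6", "r7", "r8"],
    "hub3": ["r9", "r10", "r11", "r12"],
}


def reach(followers, source):
    """Count accounts that would see a tweet from `source`,
    assuming (unrealistically) that every account retweets what it sees."""
    seen = {source}
    queue = deque([source])
    while queue:
        account = queue.popleft()
        for follower in followers.get(account, []):
            if follower not in seen:
                seen.add(follower)
                queue.append(follower)
    return len(seen) - 1  # do not count the source itself


for account in ("celebrity", "blogger"):
    print(account, "followers:", len(followers[account]),
          "reach:", reach(followers, account))
```

In this toy network the account with fewer direct followers reaches more accounts overall, because its followers are themselves well connected.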

In this tutorial, we will conduct a social network analysis of a real dataset, from gathering and cleaning data to analysis and visualization of results. We will use Python and a set of open-source libraries, including NetworkX, NumPy, and Matplotlib.

Outline:

Introduction. Why should we do this? What is the data like? Why is this different from other techniques? What can we learn?
Centralities: Degree, closeness, betweenness, PageRank, Klout Score
Beyond Klout Score: Finding communities of interest, finding clusters in networks
Information diffusion in networks -- how do things go viral?
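
As a rough preview of the centrality measures named in the outline, the sketch below computes degree, closeness, and betweenness centrality with NetworkX. The barbell-shaped toy graph and the `top` helper are assumptions chosen for illustration and are not the tutorial's dataset. PageRank is available as `nx.pagerank` but is left out here because recent NetworkX versions need SciPy for it; Klout Score is a proprietary measure and is not part of NetworkX.

```python
import networkx as nx

# Toy graph (an assumption for illustration): two 5-node cliques
# (nodes 0-4 and 6-10) joined through a single bridge node, node 5.
G = nx.barbell_graph(5, 1)

# Each function returns a dict mapping node -> centrality score.
degree = nx.degree_centrality(G)
closeness = nx.closeness_centrality(G)
betweenness = nx.betweenness_centrality(G)


def top(scores):
    """Return the node with the highest score (first one on ties)."""
    return max(scores, key=scores.get)


for name, scores in [("degree", degree),
                     ("closeness", closeness),
                     ("betweenness", betweenness)]:
    node = top(scores)
    print("%-12s top node: %s (score %.3f)" % (name, node, scores[node]))
```

On this graph the bridge node has only two connections, yet it sits on every shortest path between the two cliques, so it ranks first on betweenness while ranking last on degree. Which measure matters depends on the question being asked of the network.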