Wednesday 9 a.m.–12:20 p.m.

Mining Social Web APIs with IPython Notebook

Matthew Russell

Audience level:
Novice
Category:
Other

Description

Social websites such as Twitter, Facebook, LinkedIn, Google+, and GitHub hold vast amounts of valuable insight just beneath the surface. This workshop minimizes the barriers to exploring and mining that data by presenting turn-key examples from the thoroughly revised 2nd Edition of Mining the Social Web.

Abstract

This workshop teaches fundamental data mining techniques as applied to popular social websites by adapting example code from _Mining the Social Web (2nd Edition, O'Reilly 2013)_ in a step-by-step tutorial style designed specifically to accommodate attendees with very little programming or domain experience. The workshop's extensive use of IPython Notebook facilitates interactive learning with turn-key examples that run against a Vagrant-based virtual machine, which takes care of installing all required third-party dependencies. The barriers to entry are minimal, so most of the time can be spent on interactive learning.

The workshop is broad in scope and acclimates you to mining social data from the Twitter, Facebook, LinkedIn, Google+, and GitHub APIs in five corresponding modules, each following the same memorable approach:

* _Aspire_ - Set out to answer a question or test a hypothesis as part of a data science experiment
* _Acquire_ - Collect and store the data that you need to answer the question or test the hypothesis
* _Analyze_ - Use fundamental data mining techniques to explore and exploit the data
* _Summarize_ - Present analytical findings in a compact and meaningful way

Each module begins with a brief period in which attendees customize the module's notebook with their own account credentials. The remainder of the module is devoted to learning what data is available from the API and to exercises demonstrating analysis of that data, all from a pre-populated IPython Notebook. Time will be set aside at the end of each module for attendees to hack on the code, discuss examples, and ask any lingering questions.
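To make the Aspire, Acquire, Analyze, Summarize cycle concrete, here is a minimal sketch in Python 3 using only the standard library. It is not taken from the workshop notebooks or the book: the sample statuses are invented, the names `SAMPLE_STATUSES`, `extract_hashtags`, and `summarize` are hypothetical, and the Acquire step is stubbed with hard-coded text where a real notebook would call a social web API with account credentials.

```python
import re
from collections import Counter

# Aspire: which hashtags appear most often in a set of status updates?

# Acquire: a real notebook would fetch data from an API using account
# credentials; these hard-coded statuses are invented for illustration.
SAMPLE_STATUSES = [
    "Learning #python with #ipython notebooks",
    "Mining social data in #Python is fun",
    "Vagrant makes setup painless #devops #python",
]

HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(statuses):
    """Analyze: return a Counter of lowercased hashtags across all statuses."""
    counts = Counter()
    for text in statuses:
        counts.update(tag.lower() for tag in HASHTAG_PATTERN.findall(text))
    return counts


def summarize(counts, top_n=3):
    """Summarize: format the top_n most common hashtags, one per line."""
    return "\n".join(
        "#{:<10} {}".format(tag, count)
        for tag, count in counts.most_common(top_n)
    )


if __name__ == "__main__":
    print(summarize(extract_hashtags(SAMPLE_STATUSES)))
```

Keeping each stage in its own function mirrors the four-step approach: swapping the stubbed data for a real API response would change only the Acquire step, leaving the analysis and summary code untouched.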

Student Handout

No handouts have been provided yet for this tutorial.