Wednesday 9 a.m.–12:20 p.m.

How to formulate a (science) problem and analyze it using Python code

Eric Ma

Audience level:


Are you interested in doing data analysis but don't know where to start? This tutorial is for you. Python packages and tools such as IPython, scikit-learn, and NetworkX are powerful for data analysis. However, little is said about how to formulate the questions and tie these tools together into a holistic view of the data. This tutorial introduces how this can be done.


Are you interested in doing data analysis using Python but don't know where to start? Then this tutorial is for **you**! You've probably heard about how great pandas and IPython are for data analysis, but you aren't sure how to get started. I was in exactly the same place when I first heard about doing data analysis in Python. It was even more challenging because I had taken only a single undergraduate class in programming (5 years ago) and was otherwise self-taught. In this tutorial, I aim to guide the class through the process of doing data analysis, from problem formulation to coding to deriving conclusions. I will emphasize one crucial step that most workshops do not discuss: translating the problem into a series of computable steps using data structures that are amenable to analysis.
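As a rough illustration of that translation step, here is a minimal sketch using only the standard library. The question ("who is the most connected co-author?"), the names, and the toy data are all hypothetical and are not taken from the tutorial materials. The point is only that a vague question becomes answerable once the records are placed in a suitable data structure, here a dictionary mapping each person to a set of neighbours.

```python
from collections import defaultdict

# Hypothetical toy data (made up for illustration):
# each pair is two people who co-authored a paper.
coauthorships = [
    ("Ana", "Ben"),
    ("Ana", "Caro"),
    ("Ben", "Caro"),
    ("Caro", "Dev"),
]


def build_graph(pairs):
    """Step 1: turn raw records into an adjacency structure.

    Returns a dict mapping each name to the set of that person's co-authors.
    """
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def most_connected(graph):
    """Step 2: restate 'most connected' as 'largest neighbour set'.

    Names are sorted first so that ties are broken alphabetically.
    """
    return max(sorted(graph), key=lambda name: len(graph[name]))


graph = build_graph(coauthorships)
answer = most_connected(graph)
print(answer)
```

Once the problem is in this form, each later question (for example, who has no co-authors in common with a given person) becomes a small operation on the same structure rather than a new pass over the raw records.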

Student Handout

No handouts have been provided yet for this tutorial.