PyCon 2016 in Portland, Oregon

Tuesday 3:15 p.m.–4 p.m.

Statistics for Hackers

Jake Vanderplas

Statistics has a reputation for being difficult to understand, but a few simple Python skills can make it much more intuitive. This talk will cover several sampling-based approaches to solving statistical problems and show that if you can write a for-loop, you can do statistics.


The field of statistics has a reputation for being difficult to crack: it revolves around a seemingly endless jargon of distributions, test statistics, confidence intervals, p-values, and more, with each concept subject to its own subtle assumptions. But it doesn't have to be this way: today we have access to computers that Neyman and Pearson could only dream of, and many of the conceptual challenges in the field can be overcome through judicious use of these CPU cycles. In this talk I'll discuss how you can use your coding skills to "hack statistics" – to replace some of the theory and jargon with intuitive computational approaches such as simulation, sampling, shuffling, and cross-validation – and show that, with a grasp of just a few fundamental concepts, if you can write a for-loop, you can do statistical analysis.
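
As a rough illustration of the shuffling and resampling ideas mentioned above, here is a minimal sketch using only the Python standard library. It is not code from the talk: the function names `permutation_p_value` and `bootstrap_mean_interval`, their parameters, and the sample numbers are all made up for this example. The first function estimates a one-sided p-value for a difference in means by repeatedly shuffling the pooled data; the second estimates a percentile interval for a mean by resampling with replacement.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate a one-sided p-value for mean(group_a) > mean(group_b).

    Hypothetical helper: shuffles the pooled data and counts how often a
    random split gives a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels do not matter
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            count += 1
    return count / n_shuffles


def bootstrap_mean_interval(data, n_resamples=10_000, level=0.95, seed=0):
    """Estimate a percentile interval for the mean by resampling.

    Hypothetical helper: draws samples of the same size with replacement
    and reads the interval off the sorted resampled means.
    """
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        resample = [rng.choice(data) for _ in range(n)]
        means.append(mean(resample))
    means.sort()
    lower = means[int((1 - level) / 2 * n_resamples)]
    upper = means[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up scores for two groups, purely for illustration.
    scores_a = [78, 85, 92, 88, 95, 81, 90, 87]
    scores_b = [72, 80, 68, 75, 83, 70, 77, 74]
    print("p-value:", permutation_p_value(scores_a, scores_b))
    print("interval for mean of A:", bootstrap_mean_interval(scores_a))
```

Both functions take a fixed `seed` so that repeated runs give the same numbers. Each one is a single for-loop around a shuffle or a resample, with no distribution tables or closed-form formulas involved.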