PyCon Pittsburgh. April 15-23, 2020.

Talk: Small Big Data: using NumPy and Pandas when your data doesn't fit in memory

Presented by:

Itamar Turner-Trauring

Description

Your data is too big to fit in memory (loading it crashes your program), but it’s also too small to justify a complex Big Data cluster. How do you process it simply and quickly?

In this talk you’ll learn the basic techniques for dealing with Small Big Data: money, compression, batching, and indexing. You’ll specifically learn how to apply these techniques to NumPy and Pandas, but you’ll also learn the key concepts you can apply to other libraries and the specifics of your particular data.
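As a rough illustration of two of these techniques, the sketch below combines compression (asking Pandas for narrower dtypes than its 64-bit defaults) with batching (reading a fixed number of rows at a time and keeping only a running aggregate). It is not taken from the talk: the column names, the in-memory CSV, and the chunk size are all assumptions made for the example.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for a CSV file too large to load in one go.
csv_text = "user_id,amount\n" + "\n".join(
    f"{i % 5},{i}" for i in range(1000)
)

# Compression: both columns fit in small unsigned integers, so request
# those instead of the default int64 (assumed value ranges for this data).
dtypes = {"user_id": np.uint8, "amount": np.uint16}

total = 0
rows = 0

# Batching: chunksize makes read_csv yield DataFrames of at most 100 rows,
# so only one chunk is held in memory at a time.
for chunk in pd.read_csv(io.StringIO(csv_text), dtype=dtypes, chunksize=100):
    total += int(chunk["amount"].sum())
    rows += len(chunk)

print(rows, total)
```

With a real file, the `io.StringIO` object would be replaced by a path, and the chunk size would be tuned to the memory available.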

Video

Watch on YouTube