Talks: Cross-Server Data Joins on Slow Networks with Python

Friday - April 21st, 2023 11:30 a.m.-noon in 255DEF

Presented by:


Experience Level:

Some experience

Description

While working from home has its perks, you've found one thing missing in your remote work life: fast network data transfer. Writing the most efficient Python data transformation code doesn't help when your jobs are bottlenecked by slow data movement between your local laptop and remote servers.

In this talk we will cover techniques for querying and joining data across distant machines efficiently with Python. We will also discuss how to handle scenarios where you need to join datasets that won't fit in your laptop's memory, including several techniques and packages for making cross-server joins.

This session won't stop you from getting angry when your ISP throttles your home internet connection, but it will teach you ways to work with local and remote datasets as efficiently as possible.
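
To make the idea of a cross-server join on a slow link concrete, here is a minimal sketch of one common approach, a semi-join reduction: send only the join keys to the remote side and pull back just the matching rows, rather than downloading the whole remote table. This is an illustration, not necessarily a technique from the talk itself. The in-memory SQLite database stands in for a remote server, and the `orders` table, its columns, the `semi_join_fetch` helper, and the sample data are all hypothetical.

```python
import sqlite3


def semi_join_fetch(remote_conn, local_keys, chunk_size=500):
    """Fetch only the remote rows whose customer_id is in local_keys.

    Keys are sent in chunks so each query stays under the database's
    limit on bound parameters.
    """
    keys = sorted(set(local_keys))
    rows = []
    for start in range(0, len(keys), chunk_size):
        chunk = keys[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        query = (
            "SELECT customer_id, total FROM orders "
            f"WHERE customer_id IN ({placeholders})"
        )
        rows.extend(remote_conn.execute(query, chunk).fetchall())
    return rows


# Stand-in for the remote server (assumption: a real setup would use a
# network database connection instead of in-memory SQLite).
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
remote.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 25.5), (3, 7.25), (4, 99.0)],
)

# Small local dataset that lives on the laptop.
local_customers = {1: "Ada", 3: "Grace"}

# Only rows for customers 1 and 3 cross the "network".
matched = semi_join_fetch(remote, local_customers.keys())

# Finish the join locally against the small dataset.
joined = [(cid, local_customers[cid], total) for cid, total in matched]
```

The filtering happens where the large table lives, so the amount of data transferred scales with the number of matches instead of the size of the remote table.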