Talks: Comparing the Different Ways to Scale Python and Pandas Code

Presented by:


Description

Fugue is an open-source unified interface for Pandas, Spark, and Dask that aims to let data practitioners define their compute workflows in a scale-agnostic manner. By decoupling logic and execution, users can code in a language that they are familiar with (Python, Pandas or SQL), and then choose an execution engine to run it on (Pandas, Spark or Dask). In this talk, we cover the transform() function, which lets a user execute a single function in a distributed setting. This simple interface can be incrementally adopted and allows data practitioners to be productive with distributed computing very quickly.