Pandas has long used NumPy as its storage backend. But things are changing, and the future of Pandas will likely be closely tied to PyArrow. What are Arrow and PyArrow? How do they affect Pandas users today, and how will they affect us in the future? Is PyArrow always faster than the current Pandas backend? In this talk, I introduce PyArrow and explain what it does, how we can already use it in our Pandas work, and when it's appropriate to use it.
The PyArrow revolution in Pandas
Friday, May 16th, 2025 11:45 a.m.–12:15 p.m. in Ballroom A