Friday 10:50 a.m.–11:20 a.m.

All Your Ducks In A Row: Data Structures in the Standard Library and Beyond

Brandon Rhodes

Audience level:
Intermediate
Category:
Best Practices & Patterns

Description

Why are Python programmers crazy about lists and dictionaries, when other languages tout bitmaps, linked lists, and B+ trees? Are we missing out? Come learn how data structures are implemented on bare metal, how to select the right data structure, how the list and dictionary cover a wide swath of use cases, and when to dip into the Standard Library or a third-party package for an alternative.

Abstract

A data structure stores information, but also optimizes some access patterns at the expense of others. You can build a Python `list` and `set` containing the same objects, but the two containers will offer different performance depending on which operations you attempt. This talk will explore how a single fundamental data type — the array-of-bytes implemented by your computer's RAM chips — is used to create all of the others, and how each data structure's implementation optimizes it for strict ordering, easy insertion and deletion, or quick access to an arbitrary element. We will explore every container in the entire Python Standard Library as well as a few from third-party packages including NumPy. A central theme will be: where are all of the linked lists and trees? How did data structures fundamental to early computer science, and to many functional data programming languages today, come to disappear in the era of dynamic languages like Python? We will look at the roles they once filled and why garbage collection, dictionaries, and easy filtering through list comprehensions have rendered these data structures obsolete for most programmers.