PyCon 2019 in Cleveland, Ohio

Talks

PyCon 2019 Talks Schedule

See the schedule in grid form [here](/2019/schedule/talks).

5 Steps to Build Python Native GUI Widgets for BeeWare

Dan Yeaw
Sunday 2:30 p.m.–3 p.m. in Grand Ballroom A

Have you ever wanted to write a GUI application in Python that you can run on both your laptop and your phone? Have you been looking to contribute to an open source project, but you don't know where to start? BeeWare is a set of software libraries for cross-platform native app development from a single Python codebase and tools to simplify app deployment. The project aims to build, deploy, and run apps for Windows, Linux, macOS, Android, iPhone, and the web. It is native because it is actually using your platform's native GUI widgets, not a theme, icon pack, or webpage wrapper. This talk will teach you how Toga, the BeeWare GUI toolkit, is architected and then show you how you can contribute to Toga by creating your own GUI widget in five easy steps.

8 things that happen at the dot: Attribute Access & Descriptors

Andy Fundinger
Saturday 2:35 p.m.–3:05 p.m. in Room 26A/B/C

We rarely think about the dot “.” between our objects and their fields, but there are quite a lot of things that happen every time we use one in Python. This talk will explore the details of what happens, how the descriptor protocol works, and how it can be used to alter the Python object model.
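As a rough illustration of the descriptor protocol mentioned in this abstract, here is a minimal sketch. The `Positive` and `Order` classes are hypothetical names invented for this example, not material from the talk. It shows that reading `order.quantity` is routed through the descriptor's `__get__`, and assignment through `__set__`.

```python
class Positive:
    """Data descriptor that rejects non-positive values (illustrative name)."""

    def __set_name__(self, owner, name):
        # Called once when the owning class is created; remember where
        # to store the per-instance value.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: return the descriptor object.
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage_name, value)


class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity  # routed through Positive.__set__


order = Order(3)
looked_up = order.quantity  # Positive.__get__ runs here
# Roughly what the dot did for us, spelled out by hand:
same_lookup = type(order).__dict__["quantity"].__get__(order, Order)
```

Because `Positive` defines `__set__`, it is a data descriptor and takes precedence over the instance's own `__dict__` during lookup.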

Account Security Patterns: How Logged-In Are you?

Philip James, Asheesh Laroia
Sunday 1:10 p.m.–1:40 p.m. in Grand Ballroom B

Account security means making sure your users are the only ones who can access their accounts. Account takeovers happen for a variety of reasons -- password re-use, compromised computers, guessable passwords, and more. This talk gives you concepts and concrete skills that will help you identify and prevent account takeovers and limit the damage. It’s inspired by practices in use at GitHub, Google, and the Python Package Index.

Ace Your Technical Interview Using Python

Erin Allard
Sunday 2:30 p.m.–3 p.m. in Atrium Ballroom AB

Do you feel overwhelmed by the prospect of having to find a new software engineering job because you dread the technical interviewing process? Have you been putting off submitting your job applications because you think you won't be ready to interview until you do "just one more" day of studying? In this talk, you'll learn which concepts are the most important to study for entry-level roles and how to use your Python skills to convey your understanding of these concepts with confidence and clarity.

Advanced asyncio: Solving Real-world Production Problems

Lynn Root
Saturday 12:10 p.m.–12:55 p.m. in Grand Ballroom C

Everyone’s talking about it. Everyone’s using it. But most likely, they’re doing it wrong, just like we did. By building a simplified chaos monkey service, we will walk through how to create a good foundation for an asyncio-based service, including graceful shutdowns, proper exception handling, and testing asynchronous code. We’ll get into the hairier topics as well, covering topics like working with synchronous code, debugging and profiling, and working with threaded code. We’ll learn how to approach asynchronous and concurrent programming with Python’s `asyncio` library, take away some best practices, and learn what pitfalls to avoid.
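To make the "graceful shutdown" idea concrete, here is a minimal sketch. It is not the chaos monkey service from the talk. The `worker` and `main` names are assumptions for illustration. The worker handles cancellation, does its cleanup, and re-raises so the task is still reported as cancelled.

```python
import asyncio


async def worker(queue, processed):
    """Consume items until cancelled, then clean up."""
    try:
        while True:
            item = await queue.get()
            processed.append(item)
            queue.task_done()
    except asyncio.CancelledError:
        # Place to release resources before the task ends.
        processed.append("cleanup")
        raise  # re-raise so the task is marked as cancelled


async def main():
    queue = asyncio.Queue()
    processed = []
    task = asyncio.create_task(worker(queue, processed))

    for item in ("a", "b", "c"):
        queue.put_nowait(item)
    await queue.join()  # wait until every queued item has been handled

    task.cancel()  # request shutdown
    # return_exceptions=True collects the CancelledError instead of raising it.
    results = await asyncio.gather(task, return_exceptions=True)
    return processed, results


processed, results = asyncio.run(main())
```

Waiting on `queue.join()` before cancelling is one way to avoid dropping in-flight work; a production service would usually also hook this into signal handling.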

A Guide to Software Engineering for Visually Impaired

Abrar Ahmed Sheikh
Friday 3:15 p.m.–3:45 p.m. in Atrium Ballroom AB

We will look into a day in the life of a Software Engineer with limited vision to understand their difficulties at work and how they can overcome those difficulties to become successful in their role. I am a backend Software Engineer at Yelp who uses Python extensively for building Yelp's infrastructure and internal tools. I also suffer from a genetic disorder called Albinism which often results in limited visual acuity that can range from 20/120 to 20/200 in most common cases. With such low vision it's extremely difficult to read the computer screen without the use of on-screen magnifiers. In this talk, we will see how a person with adverse visual acuity can thrive and be successful in the field of Software Engineering. We will address the importance and meaning of accessibility for Software Engineers with partial vision and recommend some best practices that are available today. We will also talk about the importance of an inclusive work culture that can help foster creativity and ease ramp up for a Software Engineer with a disability.

A Medieval DSL? Parsing Heraldic Blazons with Python!

Lady Red / Christopher Beacham
Saturday 5:10 p.m.–5:40 p.m. in Grand Ballroom C

Medieval European Nobility was obsessed with Lineage. They created a Heraldic System to track families, which assigned each family a unique Coat of Arms. Any painting of the Coat of Arms was not the official version. The official version was a "Blazon" - a precise, terse description in heraldic language. This heraldic language reads like English, Latin, French, and XML had a baby. It's a fully recursive language with a formal grammar, variable assignment, positional arguments, and also, Lions, Bears, and Pythons. Here's an example: _Sable, on a fesse or three lions gules_. In this talk, we look at parsing this Medieval _Domain Specific Language_ with Python. Along the way, we'll learn a little history, and the tools for parsing and writing your own DSL.

A New Era in Python Governance

Shauna Gordon-McKeon
Sunday 1:10 p.m.–1:40 p.m. in Grand Ballroom C

In July of 2018, Guido van Rossum stepped down as “Benevolent Dictator for Life” of Python. In December, Python core developers voted on a new governance structure to guide Python going forward. This talk explores what’s changing and how it may impact the Python community. Based on analysis of the relevant PEPs and other documentation, and strengthened via a series of interviews with core developers and other community leaders, we’ll cover Python governance past, present and future. Particular care will be taken to explain why the core developers chose the governance model they did, what the new system entails, and how community members can participate in it.

API Evolution the Right Way

A. Jesse Jiryu Davis
Friday 11:30 a.m.–noon in Grand Ballroom C

If you maintain a library, how can you innovate without breaking the projects that depend on it? Follow semantic versioning, add APIs conservatively, add parameters compatibly, use DeprecationWarnings and publish a deprecation policy, guide your users on how to upgrade, and make wise choices about when to break backwards compatibility.
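One of the practices listed above, using `DeprecationWarning` while keeping an old entry point working, might look like the following sketch. The `fetch` and `get` functions are hypothetical names for an imagined library, not code from the talk.

```python
import warnings


def fetch(url, timeout=10.0):
    """Current API (hypothetical). `timeout` was added with a default,
    so existing callers that pass only `url` keep working."""
    return f"GET {url} (timeout={timeout})"


def get(url, timeout=10.0):
    """Old name, kept as a thin wrapper for a deprecation period."""
    warnings.warn(
        "get() is deprecated; use fetch() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this wrapper
    )
    return fetch(url, timeout=timeout)


# Record warnings so the behaviour can be inspected (or asserted in tests).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = get("https://example.com")
```

The old function still returns the same result, so users can upgrade first and migrate to the new name on their own schedule.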

A Right Stitch-up: Creating embroidery patterns with Pillow

Katie McLaughlin
Friday 10:50 a.m.–11:20 a.m. in Atrium Ballroom AB

Embroidery is a technology that dates back centuries, and is still popular in the present day among craftspeople around the world. Cross-stitch refers to the creation of crosses in a grid that combine to build up an image, based on a 'chart' or pattern of the intended design. Even though entire pieces could be created based on completely manual processes, much of the technology behind automating chart creation is locked behind paid software. During this presentation, we will discuss how we can leverage the Python Imaging Library (PIL, now Pillow) in order to take source images and turn them into cross-stitch charts. The resulting art piece from the talk, the PyCon US 2019 logo, will be auctioned off at the PyLadies Charity Auction. Tickets to the Charity Auction are available separately; see the website for details.

A Snake in the Bits: Security Automation with Python

Moses Schwartz, Andy Culler
Friday 11:30 a.m.–noon in Atrium Ballroom AB

Security incident response is an intense, high stress, high skill job that relies heavily on human judgement. Despite that, for reasons that we can't begin to understand, a big part of an incident responder's job seems to be opening numerous browser tabs and copy-pasting bits of text from one system to another. The hard parts of incident response can't be automated, but there are entire classes of busy-work that we can eliminate with a few web hooks and some artisanal Python. In this talk we're going to discuss how to use Python to automate security incident response team (SIRT) operations. We'll give an overview of what a typical SecOps/SIRT infrastructure looks like, how and where automation fits in, and dive into some code. We'll walk through a simple example, with screenshots and code, of automating a SecOps process. We want to show that getting started with security automation doesn't have to be difficult or expensive (though vendors will happily take your money). Just a little bit of Python can make some great quality of life improvements for incident responders.

Assets in Django without losing your hair

Jacob Kaplan-Moss
Saturday 12:10 p.m.–12:40 p.m. in Grand Ballroom A

There's one part of building a Django app I hate: setting up handling of assets and media. There are so many moving pieces — static assets, asset compilation and compression, file uploads, storage engines, etc. etc. I can never remember how it all fits together. I wrote this talk for one very selfish reason: to document how this all works, in one place, once and for all. This talk is primarily intended for Django developers who want to build an asset pipeline that Just Works. The bulk of the talk covers front-end tooling (Webpack, PostCSS, Babel, etc.), so full-stack developers of any stripe will find something here, too.

Attracting the Invisible Contributors

Charlotte Mays
Friday 4:30 p.m.–5 p.m. in Grand Ballroom C

Many new coders seek out open source projects, intending to contribute, and then get overwhelmed and leave. Project maintainers often want the help, but don’t realize how they are inadvertently appearing unwelcoming. I will discuss some of the most common complaints I’ve heard from new coders who tried to contribute but left in frustration, and ways that these can be addressed without putting too much burden on the maintainers.

Beyond Two Groups: Generalized Bayesian A/B[/C/D/E...] Testing

Eric Ma
Saturday 11:30 a.m.–noon in Grand Ballroom C

Bayesian A/B testing has gained much popularity over the years. It seems, however, that the examples stop at two groups. This raises the questions: should we not be able to do more than simple two-group, case/control comparisons? Is there a special procedure that's necessary, or is there a natural extension of commonly-used Bayesian methods? In this talk, I will use life-like, simulated examples, inspired by work and by meeting others at conferences, to show how to generalize A/B testing beyond the rigid assumptions commonly highlighted. Specifically, I will show two examples, one involving Bayesian estimation on click data on a website, and another on 4-parameter dose-response curves. There will be plenty of code from the modern PyData stack, involving the use of PyMC3, pandas, holoviews, and more.
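The talk itself uses PyMC3. As a standard-library stand-in, the sketch below applies the conjugate Beta-Binomial model to three groups of click data to suggest how the comparison extends past two groups. The numbers are made up for illustration, and the Beta(1, 1) prior is an assumption.

```python
import random

# Simulated click data: (clicks, visitors) per variant. Made-up numbers.
data = {"A": (40, 1000), "B": (55, 1000), "C": (90, 1000)}

rng = random.Random(0)  # fixed seed so the run is repeatable
n_draws = 5000
wins = {name: 0 for name in data}

for _ in range(n_draws):
    # With a Beta(1, 1) prior, the posterior of each click rate is
    # Beta(1 + clicks, 1 + non-clicks); draw one sample per variant.
    draws = {
        name: rng.betavariate(1 + clicks, 1 + visitors - clicks)
        for name, (clicks, visitors) in data.items()
    }
    wins[max(draws, key=draws.get)] += 1

# Posterior probability that each variant has the highest click rate.
prob_best = {name: count / n_draws for name, count in wins.items()}
```

Adding a fourth or fifth group only means adding another entry to `data`; nothing in the procedure is specific to two groups.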

Break the Cycle: Three excellent Python tools to automate repetitive tasks

Thea Flowers
Friday 11:30 a.m.–noon in Room 26A/B/C

Find yourself doing the same thing over and over again? Does it take more than one command to run your tests? Build your docs? Publish your project? Deploy? Is it often difficult to share your code because others can't run or test it? Does your README have a series of complicated steps to get things set up? This talk explores three open-source tools that are wonderful at helping you and your project automate tasks. We'll look at Tox, which specializes in Python test environments, Nox, which offers a slightly different approach, and finally, PyInvoke, which you can use to automate just about anything.

Building a Culture of Observability

Alex Landau
Sunday 1:10 p.m.–1:40 p.m. in Atrium Ballroom AB

Observability is often thought of as just a new word for monitoring. While it encompasses traditional devops areas such as monitoring, metrics, and infrastructure management, it’s much deeper and empowers developers at all levels of the stack. **Observability is about achieving a deep understanding of your software**. This not only helps you localize and debug production issues but removes uncertainty and speculation, **empowering developers** to know their tools and improving engineering excellence. Observability helps developers “understand the narrative” of what’s going on in their software. This talk is about how we’ve driven adoption of a culture of observability within our engineering organization. We'll define and motivate our focus on observability; discuss the tangible tools we’ve built and best practices we’ve adopted to ingrain observability into our engineering culture; and provide some specific, real-world results we’ve achieved as part of this effort. We'll focus particularly on the tooling we’ve adopted around Django and Celery and some interesting experiences we had extending their internals.

Building an Open Source Artificial Pancreas

Sarah Withee
Friday 12:10 p.m.–12:40 p.m. in Atrium Ballroom AB

Have you ever thought about what open source software or hardware could achieve? What if it could help improve people's lives by solving some of their health problems? After the medical tech industry kept promising a system to help automatically manage insulin for type 1 diabetic people and never delivering, some people got together to find ways to do it with the tech they already had. Over the past few years, a "closed-loop" system has been developed to algorithmically regulate people's blood sugars. After reverse engineering bluetooth sensors and 915 MHz insulin pumps, the system became possible. As a diabetic, I also built this system and saw my sugar values stabilize much more than I could ever achieve doing it manually myself. Now I'm working on contributing back to the projects as well. I want to talk about this system, from a technical side as well as a personal side. I'll talk about OpenAPS (the open artificial pancreas system) and how it works, what problems it solves, and its safety and security concerns. I also want to show how it's helped me, and what this means for my health now and in the future. I ultimately want to show how we, as software developers, can change people's lives through the code we write.

Building reproducible Python applications for secured environments

Kushal Das
Saturday 5:10 p.m.–5:40 p.m. in Atrium Ballroom AB

We all have to package Python-based applications for various environments, from command-line tools to web applications. Depending on the users, an application may be installed on thousands of computers or on a select few systems. https://pypi.org is our go-to place for finding dependencies, and most of the time we install binary wheels directly from there, which saves a lot of time. But Python is also used in many environments where security is of the utmost importance, and validating the project's dependencies is just as critical as validating the project source code itself. Many of us noticed the recent incident in which attackers were able to [steal bitcoins using a popular library](https://www.theregister.co.uk/2018/11/26/npm_repo_bitcoin_stealer/). This talk will take the [SecureDrop client application](https://github.com/freedomofpress/securedrop-client) for journalists as an example project and show how we tried to tackle a similar problem. SecureDrop is an open source whistleblower system deployed at over 75 news organizations all over the world. Our threat model includes nation-state actors as possible threats, so the security and privacy of the system's users is a very important part of the whole project. The tools in this case are built and packaged into reproducible Debian deb packages and are installed on Qubes OS on the final end-user systems. There are two basic ways we handle Python project dependencies. For most development work, we use a virtualenv and install the dependencies directly using wheels built from pypi.org. When we package the application for end users, we often use an operating-system package manager and ask the users to install with it (say, RPM or Debian's deb packages). In the second case, all the dependencies come as separate packages (most of the time from the OS itself), and dependency handling is done by the OS package manager.
In that case, we cannot update the dependencies fast enough when required; it depends on the community packagers who maintain those packages in the distribution. We use the [dh-virtualenv](https://dh-virtualenv.readthedocs.io/en/1.0/) project to help us ship our own wheels plus a virtualenv for the project inside the Debian .deb package. This talk will go through [the process](https://github.com/freedomofpress/securedrop-client) of building wheels from known (based on sha256sum) source tarballs, and then maintaining a GPG-signed list of updated wheels and [a private index](https://github.com/freedomofpress/securedrop-debian-packaging/tree/master/simple) for them. It will also cover how we verify the wheels' sha256sum (and the signature of that list) during the build process. The final output is reproducible Debian packages.
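The checksum step described above can be sketched with the standard library alone. This is only an illustration of the idea, not the SecureDrop build tooling. The `sha256_of` and `verify` helpers and the file name are hypothetical, and verifying the GPG signature of the list is left out.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected):
    """Return the names whose file is missing or whose digest differs."""
    return [
        name
        for name, want in expected.items()
        if not (Path(directory) / name).is_file()
        or sha256_of(Path(directory) / name) != want
    ]


# Demo with a stand-in file; a real build would read the signed list instead.
with tempfile.TemporaryDirectory() as tmp:
    wheel = Path(tmp) / "example-1.0-py3-none-any.whl"
    wheel.write_bytes(b"not a real wheel")
    good = {wheel.name: hashlib.sha256(b"not a real wheel").hexdigest()}
    bad = {wheel.name: "0" * 64}
    ok_failures = verify(tmp, good)
    bad_failures = verify(tmp, bad)
```

A build script would typically abort as soon as `verify` returns a non-empty list.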

But, Why is the (Django) Admin Slow?

Jacinda Shelly
Saturday 1:55 p.m.–2:25 p.m. in Grand Ballroom C

The admin interface that comes built-in with Django is one of the most-loved (and oft-abused) features. However, early converts are often disappointed to find that the admin doesn't seem to be scaling as their database grows in size, forcing them (so they think) to switch to a custom interface much sooner than they would prefer. However, many common performance issues can be fixed with a few small configuration changes that are much easier than a rewrite! In this talk, we'll use an example project to demonstrate the most common performance pitfalls encountered when using the Django admin, and fix them - live! We'll use django-debug-toolbar, a powerful debugging interface that everyone who uses Django should be familiar with, to identify our issues and confirm that they are fixed.

Coded Readers: Using Python to uncover surprising patterns in the books you love

Eleanor Stribling
Saturday 3:15 p.m.–4 p.m. in Grand Ballroom B

We may not always know why we feel a certain way about a great story. In this talk, you'll learn how Python generates new insights into the stories you love. See how some straightforward tools and techniques have generated a rich analysis of stories from "Harry Potter" to "Wuthering Heights" to "The Three Body Problem" and the deeper questions those findings raise. Learn the straightforward techniques to do this with findings and code samples from three case studies:

+ Uncovering gender bias in the _Harry Potter_ series
+ Visualizing the use of color in Gothic Literature
+ Looking for narrative patterns in science fiction novels over the last 50 years

This mix of qualitative and quantitative data can be used to hold up a mirror to our world and to ourselves. If you enjoy reading a great book and tinkering with code, this is the talk for you.

Code Review Skills for Pythonistas

Nina Zakharenko
Saturday 12:10 p.m.–12:55 p.m. in Grand Ballroom B

As teams and projects grow, code review becomes increasingly important to support the maintainability of complex code bases. In this talk, I’ll cover guidelines for writing consistent code, powerful linting and analysis tools for your Python code, and how to look out for common code review gotchas. You’ll also learn about style guides and how they can help make your code more consistent and easier to maintain, as well as what tools are available to help automate the review process. This talk will enable you to have better code reviews with your teams at work, as well as a better approach to code reviews in open source projects. You’ll also learn how to give code reviews with empathy by using reviews as tools for sharing knowledge instead of turning the process into a competition.

CUDA in your Python: Effective Parallel Programming on the GPU

William Horton
Saturday 5:10 p.m.–5:40 p.m. in Grand Ballroom B

It’s 2019, and Moore’s Law is dead. CPU performance is plateauing, but GPUs provide a chance for continued hardware performance gains, if you can structure your programs to make good use of them. CUDA is a platform developed by Nvidia for GPGPU--general purpose computing with GPUs. It backs some of the most popular deep learning libraries, like TensorFlow and PyTorch, but has broader uses in data analysis, data science, and machine learning. There are several ways that you can start taking advantage of CUDA in your Python programs. For some common Python libraries, there are drop-in replacements that let you start running computations on the GPU while still using familiar APIs. For example, CuPy provides a NumPy-like API for interacting with multi-dimensional arrays. Similarly, cuDF is a recent project that mimics the pandas interface for dataframes. If you want more control over your use of CUDA APIs, you can use the PyCUDA library, which provides bindings for the CUDA API that you can call from your Python code. Compared with drop-in libraries, it gives you the ability to manually allocate memory on the GPU, and write custom CUDA functions (called kernels). However, its drawbacks include writing your CUDA code as large strings in Python, and compiling your CUDA code at runtime. Finally, for the best performance you can use the Python C/C++ extension interface, the approach taken by deep learning libraries like PyTorch. One of the strengths of Python is the ability to drop down into C/C++, and libraries like NumPy take advantage of this for increased speed. If you use Nvidia’s nvcc compiler for CUDA, you can use the same extension interface to write custom CUDA kernels, and then call them from your Python code. This talk will explore each of these methods, provide examples to get started, and discuss in more detail the pros and cons of each approach.

Dependency hell: a library author's guide

Yanhui Li, Brian Quinlan
Saturday 11:30 a.m.–noon in Grand Ballroom B

Python is known for its "batteries included" philosophy but no Python developer can live without the language's rich library ecosystem. Unfortunately, as the number of libraries increases, so does the risk of cross-library incompatibilities, or "dependency hell". Dependency hell arises when two libraries have mutually conflicting requirements. These can be very difficult for developers to diagnose and may not be fixable without avoiding certain libraries entirely. After this talk, you - the library author - will have a practical set of simple best practices to follow that will allow you to build libraries that are compatible across the Python ecosystem.

Django Channels in practice

Aaron Gee-Clough
Saturday 12:10 p.m.–12:40 p.m. in Atrium Ballroom AB

Django Channels allows developers to make real-time web applications using websockets while maintaining access to the full Django batteries-included model for web applications. This talk will focus on what it takes to run a channels application in production, what's possible with Django Channels beyond chat rooms, and what pitfalls & idiosyncrasies you can expect to run into when using Channels in practice.

Does remote work really work?

Lauren Schaefer
Saturday 3:15 p.m.–4 p.m. in Grand Ballroom C

Spoiler alert: yes, remote work really does work! With nearly nine years of experience as a remote employee across three different companies, Lauren Schaefer knows the ups and downs of remote work. In this session, Lauren will dive into what the research says about remote work and share personal stories of failures and successes. You'll walk away from this session knowing why remote work is awesome, empowered to convince your boss to let you work remotely, and armed with the tools you need to be a happy, successful remote employee. If you've been thinking about making the transition to working remotely, you're a manager of people who do or could work remotely, or you've made the leap to remote work and are struggling to make it work, this is the session for you!

Don't be a robot, build the bot

Mariatta
Friday 10:50 a.m.–11:20 a.m. in Grand Ballroom B

Managing a large open source project like CPython is no easy task. Learn how the Python core team automated their GitHub workflow with bots, making it easier for maintainers and contributors to collaborate together. Even if you’re not managing a large project, you can still build your own bot! Hear some ideas on what you can automate on GitHub and personalize your bot based on your own workflow. All you need is Python. Don’t be a robot; build the bot.

Eita! Why Internationalization and Localization matter

Nicolle Cysneiros
Saturday 3:15 p.m.–4 p.m. in Room 26A/B/C

According to the always trustworthy Wikipedia, there are approximately 360 million native English speakers in the world. We, as developers, are so used to writing code and documentation in English that we may not realize that this number only represents 4.67% of the world population. It is very useful to have a common language for communication between developers, but this doesn’t mean that the user shouldn’t feel a little bit more comfortable when using your product. Translation of terms is only one step in the whole Internationalization (i18n) and Localization (l10n) process. It also entails number, date and time formatting, currency conversion, sorting, and legal requirements, among other issues. This talk will go through the definition of i18n and l10n as well as show the main tools available for developers to support multiple languages and regional preferences in their Python programs. We will also see how one can enable local support for a website in Django. Finally, this presentation will discuss how we can manage Internationalization and Localization for a bigger product running on different platforms (front and back end) and how to incorporate i18n and l10n into our current development and deployment processes. Oh, and by the way, “eita!” is a Brazilian interjection used to express surprise at something. 🙂

Engineering Ethics and Open Source Software

Hayley Denbraver
Saturday 1:55 p.m.–2:25 p.m. in Grand Ballroom B

It seems that every week there is a news story that prompts software developers to think about the ethical implications of their work. As individuals, teams, and communities we need to consider the impact of the code we write. But what about code that we are using, but did not create? How does Open Source Software factor into this equation? OSS is used across the industry, but is written and maintained primarily by volunteers. What are best practices for maintainers, contributors, individual users, and companies who incorporate open source into their work? How can we protect ourselves and our users, while still benefiting from the innovation and collaboration of open source software?

Ensuring Safe Water Access with Python and Machine Learning

Madhav Datt
Sunday 1:10 p.m.–1:40 p.m. in Grand Ballroom A

Millions of people across the world live in a state of acute and immediate environmental crisis caused by the lack of access to safe and usable water resources, because of natural disasters, socio-economic conditions, wars and conflicts. Over the last year, I developed extremely low-cost filtration devices at the intersection of machine learning and chemical-free purification. With my organization, we deployed 8,000 of these devices in rural Tanzania and brought safe water access to over 40,000 people, filtered ~14.4 million liters of water, and saved women from those affected communities a cumulative 15,000 hours of walking every day. In this talk, we’ll learn about how, as developers, engineers and data scientists, we are incredibly well equipped to build solutions for some of the biggest challenges facing our planet and my generation. We’ll explore how all of us, as individuals and as a community, can drive on-ground social impact and help entire communities in crises, through coding, data and technology.

Escape from auto-manual testing with Hypothesis!

Zac Hatfield-Dodds
Sunday 2:30 p.m.–3 p.m. in Grand Ballroom C

If we knew all of the bugs we needed to write tests for, wouldn't we just... not write the bugs? So how can testing find bugs that nobody would think of? The answer is to have a computer *write your tests for you!* You declare what kind of input should work - from 'an integer' to 'matching this regex' to 'this Django model' and write a test which should always pass... then Hypothesis searches for the smallest inputs that cause an error. If you’ve ever written tests that didn't find all your bugs, this talk is for you. We'll cover the theory of property-based testing, a worked example, and then jump into a whirlwind tour of the library: how to use, define, compose, and infer strategies for input; properties and testing tactics for your code; and how to debug your tests if everything seems to go wrong. By the end of this talk, you'll be ready to find real bugs with Hypothesis in anything from web apps to big data pipelines to CPython itself. Be the change you want to see in your codebase - or contribute to Hypothesis itself and help drag the world kicking and screaming into a new and terrifying age of high quality software!

¡Escuincla babosa!: Creating a telenovela script in three Python deep learning frameworks

Lorena Mesa
Saturday 2:35 p.m.–3:05 p.m. in Grand Ballroom C

Telenovelas are beloved for their over the top drama and intricate plot twists. In this talk, we’ll review popular telenovelas to synthesize a typical telenovela arc and use it to train a deep learning model. What would a telenovela script look like as imagined by a neural network? To answer this question, we’ll examine three Python deep learning frameworks - Keras, PyTorch, and TensorFlow - to determine the process of translating a telenovela into a neural network and ultimately determine which one will be best for the task at hand. Be prepared for amor, pasión, y el misterioso!

Everything at Once: Python's Many Concurrency Models

Jess Shapiro
Friday 2:35 p.m.–3:05 p.m. in Grand Ballroom B

Python makes it incredibly easy to build programs that do what you want. But what happens when you want to do what you want, but with more input? One of the easiest things to do is to make a program concurrent so that you can get more performance on large data sets. But what's involved with that? Right now, there are any number of ways to do this, and that can be confusing! How does `asyncio` work? What's the difference between a thread and a process? And what's this Hadoop thing everyone keeps talking about? In this talk, we'll cover some broad ground of what the different concurrency models available to you as a Python developer are, the tradeoffs and advantages of each, and explain how you can select the right one for your purpose.
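As a small taste of two of the models this abstract mentions, the sketch below runs the same simulated I/O wait with a thread pool and with `asyncio`. The function names and the 0.05 second delay are illustrative assumptions, not material from the talk.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_io(n):
    time.sleep(0.05)  # stands in for a blocking network or disk call
    return n * n


async def async_io(n):
    await asyncio.sleep(0.05)  # yields to the event loop while "waiting"
    return n * n


async def run_async(values):
    # gather() schedules all coroutines and returns results in input order.
    return await asyncio.gather(*(async_io(n) for n in values))


values = [1, 2, 3, 4]

# Threads: several blocking calls wait at the same time.
with ThreadPoolExecutor(max_workers=4) as pool:
    thread_results = list(pool.map(blocking_io, values))

# asyncio: one thread, cooperative switching at each await.
async_results = asyncio.run(run_async(values))
```

`concurrent.futures.ProcessPoolExecutor` offers the same executor interface using separate processes, which is the usual choice when the work is CPU-bound rather than waiting on I/O.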

Exceptional Exceptions - How to properly raise, handle and create them.

Mario Corchero
Saturday 3:15 p.m.–3:45 p.m. in Atrium Ballroom AB

Did you know there are multiple ways to raise and capture exceptions? Have you ever wondered if you should raise a built-in exception or create your own hierarchy? Did you ever find it hard to understand what an exception meant? This talk will go through the decisions needed to raise and capture exceptions when creating a library. We will look at how to translate and handle errors, create your own exceptions, and make exceptions clear and easy to troubleshoot, while also understanding how they actually work and what the common pitfalls are.
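The ideas of a library-specific exception hierarchy and of translating low-level errors can be sketched as follows. `StorageError`, `RecordNotFound`, and `load` are hypothetical names for an imagined library, not examples from the talk.

```python
class StorageError(Exception):
    """Base class for every error this (hypothetical) library raises."""


class RecordNotFound(StorageError):
    def __init__(self, key):
        super().__init__(f"no record stored under key {key!r}")
        self.key = key  # keep structured data, not just a message


def load(records, key):
    try:
        return records[key]
    except KeyError as exc:
        # Translate the low-level error; "from exc" keeps the original
        # exception attached as __cause__ for troubleshooting.
        raise RecordNotFound(key) from exc


try:
    load({"a": 1}, "b")
except StorageError as err:  # callers can catch the whole hierarchy at once
    caught = err
```

A single base class lets users write one `except` clause for everything the library raises, while the subclass and its `key` attribute keep the specific failure easy to inspect.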

Extracting tabular data from PDFs with Camelot & Excalibur

Vinayak Mehta
Friday 5:10 p.m.–5:40 p.m. in Grand Ballroom A

Extracting tables from PDFs is hard. The Portable Document Format was not designed for tabular data. Sadly, a lot of open data is shared as PDFs and getting tables out for analysis is a pain. A simple copy-and-paste from a PDF into a text file or spreadsheet program doesn't work. This talk will briefly touch upon the history of the Portable Document Format, discuss some problems that arise when extracting tabular data from PDFs using the current ecosystem of libraries and tools and demonstrate how Camelot and Excalibur solve this problem better and in a scalable manner. These easy-to-use packages automatically detect and extract tables from PDFs and give you access to the extracted tables in pandas DataFrames. You can also download them as CSVs or Excel files.

Fighting Climate Change with Python

Matthew Gordon
Friday 1:55 p.m.–2:25 p.m. in Room 26A/B/C

Methane, the primary component of natural gas, is a climate change agent 60 times more powerful than carbon dioxide. Current technologies for finding methane leaks in oil and gas infrastructure rely on driving well to well with a handheld camera. At Kairos Aerospace, we have developed a plane-mounted sensor for detecting methane leaks, but the sensor is only part of the solution: getting information off the sensor and into customers’ hands required us to build an entire plane-to-report pipeline. I’ll discuss the challenges we faced in developing a scalable, reliable, and cost-effective scientific computing platform in Python, with examples of novel solutions using Python’s extensive ecosystem of GIS, cloud computing and machine learning tools.

Floats are Friends: making the most of IEEE754.00000000000000002

David Wolever
Saturday 10:50 a.m.–11:20 a.m. in Grand Ballroom B

Floating point numbers have been given a bad rap. They're mocked, maligned, and feared; the butt of every joke, the scapegoat for every rounding error. But this stigma is not deserved. Floats are friends! Friends that have been stuck between a rock and a computationally hard place, and been forced to make some compromises along the way… but friends nevertheless! In this talk we'll look at the compromises that were made while designing the floating point standard (IEEE754), how to work within those compromises to make sure that `0.1 + 0.2 = 0.3` and not `0.30000000000000004`, how and when floats can and cannot be safely used, and some interesting history around fixed point number representation. This talk is ideal for anyone who understands (at least in principle) binary numbers, anyone who has been frustrated by `nan` or the fact that `0.3 == 0.1 + 0.2 => False`, and anyone who wants to be the life of their next party. This talk will _not_ cover more complicated numerical methods for, e.g., ensuring that algorithms are floating-point safe. Also, if you're already familiar with the significance of "52" and the term "mantissa", this talk might be more entertaining than it will be educational for you.
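
The behaviour described above can be reproduced with the standard library alone. This is a short sketch, not material from the talk; it shows one common way to compare floats (a tolerance via `math.isclose`) and two ways to look past the binary rounding (`Decimal` and `Fraction`).

```python
import math
from decimal import Decimal
from fractions import Fraction

total = 0.1 + 0.2

# Exact comparison of binary floats: the two values differ in the last bit.
print(total == 0.3)

# Comparison within a relative tolerance is usually what is wanted.
print(math.isclose(total, 0.3))

# Decimal(float) shows the exact binary value that the literal 0.1 is stored as.
print(Decimal(0.1))

# Exact rational arithmetic avoids the rounding altogether.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
print(exact_sum == Fraction(3, 10))

# NaN compares unequal to everything, itself included; use math.isnan instead.
nan = float("nan")
print(nan == nan, math.isnan(nan))
```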

From days to minutes, from minutes to milliseconds with SQLAlchemy

Leonardo Rochael Almeida
Friday 12:10 p.m.–12:55 p.m. in Room 26A/B/C

Object Relational Mappers (ORMs) are awesome enhancers of developer productivity. The freedom of having the library write that SQL and give you back a useful, rich model instance (or a bunch of them) instead of just a tuple or a list of records is simply amazing. But if you forget you have an actual database behind all that convenience, then it'll bite you back, usually when you've been in production for a while, after you've accumulated enough data that your once speedy application starts slowing down to a crawl. Databases work best when you ask them once for (or to do) a bunch of stuff, instead of asking them lots of times for small stuff. We'll discuss how innocent looking attribute accesses on your model instances translate to sequential queries (the infamous [N+1 problem][1]). Then we'll go through some practical solutions, taken from real cases that resulted in massive speed ups. We'll cover how changes in Python code resulted in changes to the resulting SQL queries. The solutions cover not only queries, but also inserts and updates, which tend to be less well documented. Though this talk focuses on SQLAlchemy, the lessons should be applicable to most ORMs in most programming languages. The ideas discussed and solutions proposed are also valid for any storage backend, not only SQL databases. [1]: https://docs.sqlalchemy.org/en/latest/glossary.html#term-n-plus-one-problem
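
To make the N+1 pattern concrete, here is a small sketch that deliberately uses the standard library's `sqlite3` module rather than SQLAlchemy, so it runs without extra dependencies. The `author` and `book` tables and the `run` helper are hypothetical; an ORM would issue similar statements behind attribute accesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
    INSERT INTO book VALUES (1, 1, 'Notes'), (2, 2, 'Compilers'), (3, 3, 'Machines');
""")

query_count = 0


def run(sql, params=()):
    """Execute one statement and count it as one database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the books, then one more per book for its author.
query_count = 0
slow = []
for title, author_id in run("SELECT title, author_id FROM book ORDER BY id"):
    name = run("SELECT name FROM author WHERE id = ?", (author_id,))[0][0]
    slow.append((title, name))
n_plus_one_queries = query_count

# Single round trip: ask the database once and let it do the join.
query_count = 0
fast = run(
    "SELECT book.title, author.name FROM book "
    "JOIN author ON author.id = book.author_id ORDER BY book.id"
)
joined_queries = query_count

print(n_plus_one_queries, joined_queries, slow == fast)
```

Both approaches return the same rows, but the first issues one query per book on top of the initial one, so its cost grows with the size of the table.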

Getting Started Testing in Data Science

Jes Ford
Sunday 1:50 p.m.–2:20 p.m. in Grand Ballroom C

*How do you know if your data science results are correct?* Robust software usually has tests asserting that certain conditions hold, but as a data scientist it’s often not straightforward or obvious how to integrate these best practices. Our workflow includes exploration, statistical models, and one-off analysis. This talk will give concrete examples of when and how testing should play a role, and provide you with enough introduction to get started writing your first data science tests using `pytest` & `hypothesis`.

Getting started with Deep Learning: Using Keras & Numpy to detect voice disorders

Deborah Hanus, Sebastian Hanus
Saturday 4:15 p.m.–5 p.m. in Grand Ballroom A

Deep learning is a useful tool for problems in computer vision, natural language processing, and medicine. While it might seem difficult to get started in deep learning, Python libraries such as Keras make deep learning quite accessible. In this talk, we will discuss what deep learning is, introduce NumPy and Keras, and discuss common mistakes and debugging strategies. Throughout the talk, we will return to an example project in the medical domain, which used deep learning on vocal data to determine whether a patient has a voice disorder called vocal hyperfunction.

Getting to Three Million Lines of Type-Annotated Python

Michael Sullivan
Saturday 4:15 p.m.–5 p.m. in Atrium Ballroom AB

Dropbox is a heavy user of the mypy type checker, recently passing three million lines of type-annotated Python code, with over half of that added in 2018. Type checking is helping find bugs, making code easier to understand, enabling refactors, and is an important aid to our ongoing Python 3 migration. In this talk, we discuss how we got there. We’ll talk about what we tried in order to get our engineers to type annotate their code: what worked, what didn’t, and what our engineers had to say about it. Additionally, we’ll discuss the performance problems we faced as the size of our checked codebase grew, and the techniques we employed to allow mypy, which is implemented in Python, to efficiently check (faster than a second, for most incremental checks) millions of lines of code, which culminated in mypyc, a new ahead-of-time compiler for type-annotated Python!

Going from 2 to 3 on Windows, macOS and Linux

Max Bélanger, Damien DeVille
Friday 1:55 p.m.–2:25 p.m. in Grand Ballroom C

At Dropbox, we’ve always used Python to power our application for Windows, macOS and Linux (until recently, Python 2.7). Over the years, a growing lack of features and the need for outdated compilers/toolchains made migrating to Python 3 a necessity. Join us to hear the tale of our unique journey from Python 2 to 3 and the lessons we learned along the way:

- We’ll discuss the reasons that led to our decision to make the jump.
- We’ll dive into how we sequenced the transition by using the C-API to ship both versions of Python and choose one at runtime.
- We’ll reveal the tools we used to enforce a hybrid (2/3) syntax across hundreds of thousands of lines of Python code.
- We’ll discuss some of our most spectacular bugs and gotchas, and how you can avoid them!

Help! I'm now the leader of our Meetup group!

Faris Chebib
Friday 3:15 p.m.–3:45 p.m. in Grand Ballroom A

After attending your local dev meetup for months, you suddenly get the dreaded email: "Your Organizer just stepped down without nominating a replacement." But the community relies on this meetup! It brings together devs from all around to engage in networking, learning, and comradery! So you step up. I mean, how hard could it be, right? Oh no. This is much harder than you thought. You have to organize a venue, figure out refreshments, get a speaker, ensure people show up. In this talk, you'll learn the skills needed to start and sustain a vibrant meetup and tech community.

How to Build a Clinical Diagnostic Model in Python

Jill Cates
Friday 5:10 p.m.–5:40 p.m. in Atrium Ballroom AB

Diagnosing a patient requires consideration of a wide variety of factors: past medical history and comorbidities, physical exam findings, lab results, imaging, ECG findings, and in some cases, genomic testing. Clinical diagnosis and prognostic assessment currently rely on the expert knowledge of the treating physician. Recent developments in machine learning make it possible to build automated clinical diagnostic and risk assessment tools using data from the electronic medical record. This talk walks through the steps involved in building a clinical risk assessment model, using sepsis as a case study. A large part of the talk will focus on the tools and techniques involved in pre-processing complex medical data, and strategies for evaluating model results.

How to engage Python contributors in the long term? Tech is easy, people are hard.

Victor Stinner
Friday 5:10 p.m.–5:40 p.m. in Grand Ballroom C

The CPython project is now 28 years old. It has active core developers, but almost all of them are volunteers. It's difficult to ask someone to be committed to a project for 5 years without being paid. Helping newcomers and mentoring contributors takes time and few developers are available for that. We are working on improving the diversity of CPython core developers and getting more active core developers, but it's a slow process.

How to JIT: Writing a Python JIT from scratch in pure Python

Matthew Page
Friday 4:30 p.m.–5 p.m. in Room 26A/B/C

Have you ever wondered how a JIT compiler works? Production quality JIT compilers are large, complicated pieces of software that can seem inscrutable at first glance. However, building a simple JIT compiler is surprisingly easy. We'll walk through how to build a template-style JIT compiler for Python from first principles, in Python!

How to Think about Data Visualization

Jake VanderPlas
Sunday 2:30 p.m.–3 p.m. in Grand Ballroom B

The Python world has a staggering array of data visualization tools, and choosing which to use can seem like a daunting task. But which tool you use is far less important than how you use it. In this talk I’ll walk through some of the important considerations involved in visualizing your data, so that you can create more effective visualizations no matter which plotting package you use.

Instant serverless APIs, powered by SQLite

Simon Willison
Saturday 12:10 p.m.–12:55 p.m. in Room 26A/B/C

Serverless computing is all about paying only for what you use: it can scale up to handle millions of requests, but it can also scale down to 0, costing you nothing if your application is not receiving any traffic. Serverless tends to get expensive when databases are involved... but if your data is static or changes infrequently, you can use serverless tools to provide powerful interactive APIs extremely cheaply.

[Datasette](https://datasette.readthedocs.io/) is an open-source Python tool that provides an instant, read-only JSON API for any SQLite database. It also provides tools for packaging the database up as a Docker container and instantly deploying that container to a number of different serverless hosting platforms. This makes it a powerful tool for sharing interesting data online, in a way that allows users to both explore that data themselves and build their own interpretations of the data using the Datasette JSON API.

In this session I'll show you how to use Datasette to publish data, and illustrate examples of the exciting things people have already built using the tool, including a number of real-world data journalism projects. I'll also teach people how to use some of the other tools in the Datasette ecosystem:

* [Datasette Publish](https://publish.datasettes.com/), which allows CSV data to be published using Datasette to a serverless hosting account owned by the user, without any engineering experience required.
* [csvs-to-sqlite](https://pypi.org/project/csvs-to-sqlite/), a tool for efficiently converting large numbers of CSV files into a Datasette-compatible SQLite database.
* [sqlite-utils](https://sqlite-utils.readthedocs.io/), a library that lets users create complex databases from custom data feeds in just a few lines of Python code (ideal for working with Jupyter notebooks).

I'll discuss the philosophy and design behind Datasette, including how immutable SQLite databases make for an impressively scalable solution for inexpensively serving complex data on the internet. Finally, I'll be exploring how Datasette takes advantage of Python 3 asyncio and the new ASGI specification.

Intentional Deployment: Best Practices for Feature Flag Management

Caitlin Rubin
Saturday 11:30 a.m.–noon in Grand Ballroom A

Feature flags can be powerful tools in mitigating risk in your development cycle, _if you use them correctly_. Failing to do so can have enormous consequences for yourself and your business. In 2012 one improperly deployed feature flag sent a $365 million company into bankruptcy in 45 minutes. So let’s talk about feature flags, specifically how they can help us with intentional deployment. Feature flags give us a high degree of control over the features we release, but what ensures we have a high degree of control over our feature flags?

In this talk, I’ll go over the best practices which will make your feature flagging program a success. The humble Feature Flag can transform into many different things: release toggle, experiment, kill switch, permissioning and more. I’ll talk briefly about the possibilities Feature Flags open up, and then describe how to use best practices of visibility and accountability to align those different flags into a cohesive feature flagging system.

After this talk, you’ll know what best practices make a successful feature flagging program, and be able to implement them into your current solution to deploy faster and with less risk.

Lessons learned from building a community of Python users among thousands of analysts

I-Kang Ding, Ariel M'ndange-Pfupfu, Marina Sergeeva
Friday 4:15 p.m.–5 p.m. in Atrium Ballroom AB

A few years ago, Capital One committed to going all-in on public cloud and open source software for many of our core business operations, processes, and machine learning models. To support this transformation, we embarked on a multi-year journey to build a Python community with a critical mass of users, and to scale adoption of Python in our business analyst and data analyst workforces. Python has been envisioned since its early days as a programming language which can be used to "create better, easier to use tools for program development and analysis", as well as "build a user community around all of the above, encouraging feedback and self-help". [1] In our experience scaling Python adoption amongst analyst communities within a Fortune 500 company, we have found the aforementioned visions true to form: not only is Python a great first programming language for our analysts to learn, it also comes with "batteries included" and contains many of the data-related tools and libraries which allow our analysts to get productive very quickly. This talk will highlight our multi-pronged approaches to overcome organizational inertia to build a community of Python users, provide Python and OSS training, and encourage Python adoption (with mixed success). We'll share what (we think) best practices are out there, and lessons learned along the way. Reference: [1] Computer Programming for Everybody (http://www.python.org/doc/essays/cp4e.html)

Leveraging the Type System to Write Secure Applications

Shannon Zhu
Saturday 2:35 p.m.–3:05 p.m. in Grand Ballroom A

Application security remains a long-term and high-stakes problem for most projects that interact with external users. Python's type system is already widely used for readability, refactoring, and bug detection — this talk will demonstrate how types can also be leveraged to make your project systematically more secure. We'll investigate (1) how static type checkers like Pyre or MyPy can be extended with simple library modifications to catch vulnerable patterns, and (2) how deeper type-based static analysis can reliably flag remaining use cases to security engineers. As an example, I'll focus on a basic security problem and how you might use both tools in combination, drawing from our experience deploying these methods to build more secure applications at Facebook and Instagram.

Life Is Better Painted Black, or: How to Stop Worrying and Embrace Auto-Formatting

Łukasz Langa
Friday 2:35 p.m.–3:05 p.m. in Atrium Ballroom AB

What good is a code style if it's not internally consistent? What good is a linter when it slows you down? What if you could out-source your worries about code formatting, adopt a consistent style, and make your team faster all at the same time? Come hear about Black: a new code style and a tool that allows you to format your Python code automatically. In the talk you'll learn not only what the style looks like but why it is the way it is. I will do my best to convince you not only that it's good but that it's *good enough*. You'll see how you can integrate it with your current workflow and how it speeds up your life while making your code prettier on average. Lose your attachments, delegate the boring job of moving tokens around to satisfy the linter, and save time for more important matters. Guaranteed to increase the life expectancy of space bars and Enter keys on your new MacBook's keyboard.

Lowering the Stakes of Failure with Pre-mortems and Post-mortems

Liz Sander
Sunday 1:50 p.m.–2:20 p.m. in Grand Ballroom B

Failure can be scary. There are real costs to a company and its users when software crashes, models are inaccurate, or when systems go down. The emotional stakes feel high-- no one wants to be responsible for a failure. We can lower the stakes by creating spaces to learn from failures, and minimize their impact. This talk introduces two ways to address failure: blameless post-mortems, to learn from an incident; and pre-mortems, to identify modes of failure upfront.

Machine learning model and dataset versioning practices

Dmitry Petrov
Saturday 10:50 a.m.–11:20 a.m. in Room 26A/B/C

Python is a prevalent programming language in the machine learning (ML) community. A lot of Python engineers and data scientists feel the lack of engineering practices like versioning large datasets and ML models, and the lack of reproducibility. This lack is particularly acute for engineers who just moved to the ML space. We will discuss the current practices of organizing ML projects using a traditional open-source toolset like Git and Git-LFS, as well as this toolset's limitations, which motivate the development of new ML-specific version control systems. Data Version Control or [DVC.ORG][1] is an [open source][2], command-line tool written in Python. We will show how to version datasets with dozens of gigabytes of data and version ML models, how to use your favorite cloud storage (S3, GCS, or bare metal SSH server) as a data file backend and how to embrace the best engineering practices in your ML projects. [1]: http://dvc.org [2]: https://github.com/iterative/dvc

Maintaining a Python Project When It’s Not Your Job

Hynek Schlawack
Friday 3:15 p.m.–4 p.m. in Grand Ballroom C

PyPI is a gold mine of great packages but those packages have to be written first. More often than not, projects that millions of people depend on are written and maintained by only one person. If you’re unlucky, that person is you! So how do you square delivering a *high quality* Python package you can be proud of and having only limited time at your disposal? The answer is not “try harder,” the answer is to **do less**. This talk will help you get there by talking about how you can make your life easier, remove causes of friction with your contributors, and empower said contributors to take over tasks that you can’t make time for anymore.

Making Music with Python, SuperCollider and FoxDot

Jessica Garson
Friday 1:55 p.m.–2:25 p.m. in Grand Ballroom B

Learn how to make music with Python, SuperCollider and FoxDot. We'll create a song together in this live coded adventure.

Measures and Mismeasures of algorithmic fairness

Manojit Nandi
Saturday 1:40 p.m.–2:25 p.m. in Atrium Ballroom AB

Within the last few years, researchers have come to understand that machine learning systems may display discriminatory behavior with regard to certain protected characteristics, such as gender or race. To combat these harmful behaviors, we have created multiple definitions of fairness to enable equity in machine learning algorithms. In this talk, I will cover these different definitions of algorithmic fairness and discuss both the strengths and limitations of these formalizations. In addition, I will cover other best practices to better mitigate the unintended bias of data products.

Measuring Model Fairness

J. Henry Hinnefeld
Saturday 5:10 p.m.–5:40 p.m. in Grand Ballroom A

When machine learning models make decisions that affect people’s lives, how can you be sure those decisions are fair? When you build a machine learning product, how can you be sure your product isn't biased? What does it even mean for an algorithm to be ‘fair’? As machine learning becomes more prevalent in socially impactful domains like policing, lending, and education these questions take on a new urgency. In this talk I’ll introduce several common metrics which measure the fairness of model predictions. Next I’ll relate these metrics to different notions of fairness and show how the context in which a model or product is used determines which metrics (if any) are applicable. To illustrate this context-dependence I'll describe a case study of anonymized real-world data. Next, I'll highlight some open source tools in the Python ecosystem which address model fairness. Finally, I'll conclude by arguing that if your job involves building these kinds of models or products then it is your responsibility to think about the answers to these questions.

Migrating Pinterest from Python2 to Python3

Jordan Adler, Joe Gordon
Friday 12:10 p.m.–12:55 p.m. in Grand Ballroom C

Over the course of nearly a year, we migrated Pinterest's primary systems from Python2 to Python3. A large, tightly coupled codebase with over 2 million lines of code, the Pinterest codebase contained nearly every edge case that might exist in a Py2 to Py3 migration. We'll cover our approach, gotchas, and tools, and the incredible impact our migration has made on infra spend and code quality.

Mocking and Patching Pitfalls

Edwin Jung
Friday 1:40 p.m.–2:25 p.m. in Atrium Ballroom AB

Mocking and patching are powerful techniques for testing, but they can be easily abused, with negative effects on code quality, maintenance, and application architecture. These pain-points can be hard to verbalize, and consequently hard to address. If your unit tests are a PITA, but you cannot explain why, this talk may be for you. Mocking as a technique has deep roots within OOD and TDD, going back 20+ years, but many Python developers know mocks and patches merely as a technique to isolate code under test. In the absence of knowledge around OOD and TDD, best practices around mocking are completely unknown, misunderstood, or ignored. Developers who use mocks and patches without doing TDD or OOD are susceptible to falling into many well-understood and documented traps. This talk will draw a historical connection between the way mocks are taught today, and their origins in TDD, OOD, and Java. It will also demonstrate some pitfalls, and provide some guidance and alternatives to mocking and patching (e.g., dependency injection, test doubles, functional style).

Modern solvers: Problems well-defined are problems solved

Raymond Hettinger
Friday 12:10 p.m.–12:55 p.m. in Grand Ballroom B

Every programmer should learn to use solvers, tools that reason directly from a description of a problem to its solution. Tools like AlphaZero can formulate winning strategies for games given only a description of the rules of the game. For certain classes of problems, we really can just let the computer do the work. In this talk, we learn principles, techniques, and multiple examples for three solvers available in Python. The first tool is a generic puzzle-solving framework that employs tree search strategies. We apply it to a simple sequencing problem and then to a harder sliding-block puzzle. Next, we'll look at the solver code to learn how it works. I'll also show an essential optimization technique and how to humanize the output. We demonstrate our skills by solving another famous puzzle. The second tool is called a SAT solver. It is one of the miracles of the 21st century. From first principles, I'll show you what problems it solves and the way problems need to be described for modules like *PycoSAT*. I'll provide helper functions to humanize our interactions with this great tool. Then, we'll demonstrate our skills by creating a Sudoku solver and a readable logic problem solver. The third tool is the "multi-armed bandit". It is a generic reinforcement learning algorithm that is easy to learn, powerful, and applicable to a broad class of problems. We apply it to winning rock-paper-scissors using pattern recognition. Lastly, I'll summarize DeepMind's paper on AlphaZero which was published in the December 2018 edition of *Science*. This gives us hints at the full potential of these techniques. Pure Python source code and examples are provided for all of the tools.

One Engineer, an API, and an MVP: Or, how I spent one hour improving hiring data at my company.

Nicole Zuckerman
Friday 11:30 a.m.–noon in Grand Ballroom A

<announcer> This one quick trick will help you measure the diversity of your hiring pipeline! Read on to hear how! </announcer> One challenge in improving diversity within a hiring pipeline is the struggle to measure what exists in the first place. It's hard to know where to focus your resources until you know what you have to work with, and can identify what steps make a difference and what efforts don't. Using Python/Django and an API key for our recruiting vendor, you can make this information visible and, therefore, actionable, with very little work.

Plan your next eclipse viewing with Jupyter and geopandas

Christy Heaton
Friday 10:50 a.m.–11:20 a.m. in Room 26A/B/C

Maps are powerful tools that we use every day. Python is well equipped to handle spatial data, with well-documented, robust libraries to help you perform spatial analysis and create beautiful maps. In this talk, we'll discover the fascinating world of spatial analysis by solving a fun problem: where can we go to see an upcoming solar eclipse? Along the way we'll learn about mapping topics like projections and coordinate systems, best practices for map making, and intricacies of spatial data.

Plugins: Adding Flexibility to Your Apps

Geir Arne Hjelle
Sunday 1:10 p.m.–1:40 p.m. in Room 26A/B/C

Python is a flexible language. Your Python app, on the other hand, is usually more set in stone: buttons, functions, displays are all explicitly defined. In this talk you'll learn how to take advantage of features like decorators and functions as first-class objects to set up a simple plugin system that allows your app to be more flexible. In fact, you can allow your users to add or customize functionality they want after you ship. By using plugins, your code becomes more modular and maintainable. At the same time your users may be able to use your great app to work with data or challenges you didn't even know existed.
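
One way such a plugin system might look, sketched under the assumption of a simple in-module registry (the talk's own design may differ). `PLUGINS`, `register`, `apply_plugin` and the two example plugins are hypothetical names.

```python
PLUGINS = {}


def register(func):
    """Decorator that records func in the registry and returns it unchanged."""
    PLUGINS[func.__name__] = func
    return func


@register
def shout(text):
    """Example plugin: upper-case the text."""
    return text.upper()


@register
def reverse(text):
    """Example plugin: reverse the text."""
    return text[::-1]


def apply_plugin(name, text):
    """Look up a plugin by name at run time and call it."""
    try:
        plugin = PLUGINS[name]
    except KeyError:
        raise ValueError(f"unknown plugin {name!r}") from None
    return plugin(text)


print(apply_plugin("shout", "hello"))
```

Because functions are first-class objects, the application only needs the registry: new behaviour is added by defining and decorating another function, without touching `apply_plugin`.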

Plug-n-Stream Player Piano: Signal Processing With Python

JP Bader
Friday 4:15 p.m.–5 p.m. in Grand Ballroom A

Digital Signal Processing and Player Piano don't normally come together in the same sentence. Player Pianos that are 100+ years old are awesome artisan artifacts, but they don't play digital formats very well. This talk will show how we take a 100+ year old technology and marry it to the digital age via Python libraries and precision lasers! In this discussion we will cover how we are creating our own "Plug-n-Stream Player Piano". We will take a look at the different digital signal processing Python libraries, their functionality, and requirements for converting audio streams to piano playable audio files. After a brief walk through of our prototyped hardware, we will dissect the digital signal processing, converting streaming music to data for the Player Piano. With a real Player Piano in the room we will demo streaming music from our devices onto the piano. LIVE(ish) Piano Playing!

Practical decorators

Reuven M. Lerner
Friday 10:50 a.m.–11:20 a.m. in Grand Ballroom A

Decorators are one of Python's most powerful features. But even if you understand what they do, it's not always obvious what you can do with them. Sure, from a practical perspective, they let you remove repeated code from your callables. And semantically, they let you think at a higher level of abstraction, applying the same treatment to functions and classes. But what can you actually do with them? For many Python developers I've encountered, decorators sometimes appear to be a solution looking for a problem. In this talk, I'll show you some practical uses for decorators, and how you can use them to make your code more readable and maintainable, while also providing more semantic power. Moreover, you'll see examples of things that would be hard to do without decorators. I hope that after this talk, you'll have a good sense of how to use decorators in your own Python projects.
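
As one small example of removing repeated code from callables (an illustration only, not taken from the talk), the sketch below defines a hypothetical `count_calls` decorator that tracks how often a function is invoked.

```python
import functools


def count_calls(func):
    """Wrap func so that every call increments a counter on the wrapper."""

    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    """Return the sum of a and b."""
    return a + b


add(1, 2)
add(3, 4)
print(add.calls, add.__name__)
```

The counting logic lives in one place and can be applied to any function with a single line, instead of being copied into each function body.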

Programmatic Notebooks with papermill

Matthew Seal
Friday 1:40 p.m.–2:25 p.m. in Grand Ballroom A

Notebooks have traditionally been a tool for drafting code and avoiding repeated expensive computations while exploring solutions. However, with new tools like nteract's papermill and scrapbook libraries, this technology has been expanded to make a reusable and parameterizable template for execution. We'll walk through how Jupyter notebooks are being programmatically used at Netflix and how this helps with our batch processing world. We'll also explore how these use cases connect back with users and why we've adopted these tools for Python and non-Python execution.

Put down the deep learning: When not to use neural networks and what to do instead

Rachael Tatman
Saturday 3:15 p.m.–3:45 p.m. in Grand Ballroom A

The deep learning hype is real, and the Python ecosystem makes it easier than ever to apply neural networks to everything from speech recognition to generating memes. But when picking a model architecture to apply to your work, you should consider more than just state of the art results from NeurIPS. The amount of time, money and data available to you are equally, if not more, important. This talk will cover some alternatives to deep learning, including regression, tree-based methods and distance based methods. More importantly, it will include a frank discussion of the pros and cons of different methods and when it makes sense to use each in practice.

Python on Windows is Okay, Actually

Steve Dower
Sunday 1:50 p.m.–2:20 p.m. in Room 26A/B/C

Packages that won't install, encodings that don't work, installers that ask too many questions, and having to own a PC are all great reasons to just ignore Windows. Or they would be, if they were true. Despite community perception, more than half of Python usage is on Windows, including web development, system administration, and data science, just like on Linux and Mac. And for the most part, Python works the same regardless of what operating system you happen to be using. Still, many library developers will unnecessarily exclude half of their potential audience by not even attempting to be compatible. This session will walk through the things to be aware of when creating cross-platform libraries. From simple things like using `pathlib` rather than `bytes`, through to all the ways you can get builds and tests running on Windows for free, by the end of this session you will have a checklist of easy tasks for your project that will really enable the whole Python world to benefit from your work.
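
As a small illustration of the `pathlib` point above (a sketch, not part of the talk), the pure path classes let you see how the same path components render on each platform without running on that platform. The file name is made up.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

parts = ("data", "2019", "talks.csv")

# The same components, joined with each platform's own separator.
windows_style = str(PureWindowsPath(*parts))
posix_style = str(PurePosixPath(*parts))
print(windows_style)
print(posix_style)

# Path() picks the right flavour for whatever OS the code is running on,
# and "/" joins components without hard-coding a separator.
local = Path("data") / "2019" / "talks.csv"
print(local.name, local.suffix)
```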

Python Security Tools

Terri Oda
Saturday 5:10 p.m.–5:40 p.m. in Room 26A/B/C

While high-level security concepts may transcend languages, each language has its own sets of tools and edge cases that are worth knowing. Python is one of many popular languages that is rarely the focus in security training, but that doesn't mean Python code is automatically secure (no matter what the internet tells you). Learn why people who say “pylint will help you with security” aren’t doing you any favours, how to use Bandit for security-focused linting, and what other options exist for static analysis. Take a deeper look at why scanning for publicly known vulnerabilities is complicated, and how to use Pyup Safety to make it easier. We’ll also explore some language myths and best practices.

Releasing the World's Largest Python Site Every 7 Minutes

Shuhong Wong
Saturday 10:50 a.m.–11:20 a.m. in Grand Ballroom A

Being able to release rapidly and continuously allows businesses to react to opportunities, shorten the feedback loop in the product iteration cycle, and reduce the effort of debugging erroneous changes. At Instagram, we operate the world's largest fleet of servers running on Python and we continuously deploy every X minutes. Anyone can do it: this talk will teach you the practical steps and cover the ideas and problems we faced at every phase of our automation journey.

Rescuing Kerala with Python

Biswas B
Friday 5:10 p.m.–5:40 p.m. in Room 26A/B/C

In August 2018, Kerala, the southernmost state of India, received 250% of its normal rainfall, which led to the opening of all of its 44 dams. Over 483 people died due to the flooding caused by the opening of the dams, and a million people were evacuated. I started a website ([keralarescue.in](https://keralarescue.in)), written in Django. The main purpose of the site was effective collaboration and communication between authorities, volunteers and the public. The site was open source from Day 0. About 1500 developers and volunteers joined our Slack group in a couple of days. Within a week, the community united to forge a critical piece of software that saved thousands of lives. The site started as a portal for refugees to request essential resources like food and water and for volunteers to see their needs, all sorted by geographical location. Additionally, we provided direct information for the government, and the site later became the official website. The Minimum Viable Product was delivered in fourteen hours. In the initial days, it was only used by the volunteers and Points of Contact assigned by the government. Later, when the situation became critical, we started getting rescue requests from stranded refugees. The GitHub repo of the website went viral, and we started to receive feature requests rapidly. We received more than five hundred pull requests in the span of three weeks. The story I want to present is about the community and technical aspects of keralarescue.in, and how people from different backgrounds came together to build a critical piece of software that saved many lives.

Scraping a Million Pokemon Battles: Distributed Systems By Example

Duy Nguyen
Friday 5:10 p.m.–5:40 p.m. in Grand Ballroom B

I love Pokemon. However, I don't love how some players make the community less welcoming towards beginners by hiding their strategies. So I did what any defiant engineer would. I signed up for a free AWS account and began (responsibly) scraping millions of their unauthenticated Pokemon battles. We'll journey together through this passion project of mine and draw on specific examples to better understand the trade-offs of working with distributed systems or microservice architectures in the cloud.

Set Practice: learning from Python's set types

Luciano Ramalho
Friday 10:50 a.m.–11:20 a.m. in Grand Ballroom C

Key takeaways:

1. Set operations enable simpler and faster solutions for many tasks;
2. Python's set classes are lessons in elegant, idiomatic API design;
3. A set class is a suitable context for implementing operator overloading.

Boolean logic and set theory are closely related. In practice, we will see cases where set operations provide simple and fast declarative solutions to programming problems that otherwise require complicated and slow procedural coding. Python's set built-ins and ABCs provide a rich and well-designed API. We will consider their interfaces, and how they can inspire the creation of Pythonic APIs for your own classes. Finally, we will discuss operator overloading, a technique that is not suitable everywhere but certainly makes sense with sets. Taking a few operators as examples, we will study their implementation in a new `UintSet` class for integer elements. `UintSet` fully implements the `MutableSet` interface over a totally different internal representation based on a bit array instead of a hash table. Membership tests run in _O(1)_ time like the built-in sets (however, `UintSet` is currently pure Python, so YMMV). Using bit arrays allows core set operations like intersection and union to be implemented with fast bitwise operators, and provides compact storage for dense sets of integers.
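
To make the bit-array idea concrete, here is a minimal sketch of a set of non-negative integers stored in the bits of a single Python `int`. It is not the speaker's `UintSet`: the class name `TinyUintSet` and its internals are assumptions for illustration, and it implements only a handful of methods rather than the full `MutableSet` interface.

```python
class TinyUintSet:
    """Toy set of non-negative integers; bit n is 1 when n is a member."""

    def __init__(self, elements=()):
        self._bits = 0
        for element in elements:
            self.add(element)

    def add(self, element):
        if element < 0:
            raise ValueError("only non-negative integers are supported")
        self._bits |= 1 << element

    def __contains__(self, element):
        return element >= 0 and bool((self._bits >> element) & 1)

    def __iter__(self):
        # Yield members in ascending order by scanning bits from the low end.
        bits, element = self._bits, 0
        while bits:
            if bits & 1:
                yield element
            bits >>= 1
            element += 1

    def __len__(self):
        return bin(self._bits).count("1")

    def __and__(self, other):
        # Intersection is a single bitwise AND of the two bit patterns.
        result = TinyUintSet()
        result._bits = self._bits & other._bits
        return result

    def __or__(self, other):
        # Union is a single bitwise OR of the two bit patterns.
        result = TinyUintSet()
        result._bits = self._bits | other._bits
        return result


a = TinyUintSet([1, 3, 5])
b = TinyUintSet([3, 4, 5])
print(list(a & b), list(a | b), 4 in a)
```

Overloading `&` and `|` mirrors the operators of the built-in `set`, so callers can switch between the two representations without changing their expressions.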

Statistical Profiling (and other fun with the sys module)

Emin Martinian
Saturday 2:35 p.m.–3:05 p.m. in Atrium Ballroom AB

Profiling involves computing a set of data about how often and how long various parts of your program are executed. Profiling is useful to understand what makes your program slow and how you can improve it. After a quick review of deterministic profiling tools and techniques, I will describe how you can do statistical profiling with existing packages or write your own from scratch. Statistical profiling involves occasionally sampling what your program is doing instead of watching each line or function. A key feature of statistical profiling is that by using a moderate sampling frequency, you can profile your production code with almost no overhead. This lets you find the actual bottlenecks in real use cases. The core technical focus of the talk is Python's sys module and how it lets you easily examine a running program. I also describe some tricks to be aware of related to threading, context switches, locks, and so on. At the conclusion of the talk, you will hopefully understand how to use an existing statistical profiler or write a customized version yourself.
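
The sampling idea can be sketched in a few lines. The abstract names only the `sys` module, so the use of `sys._current_frames()` (a CPython-specific helper) is an assumption about one possible approach, and `busy_work` is a made-up workload. A background thread wakes up periodically and records which function the profiled thread is currently executing.

```python
import collections
import sys
import threading
import time


def sample_thread(thread_id, counts, stop_event, interval=0.005):
    """Periodically record the function the target thread is executing."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)


def busy_work(duration=0.3):
    """Stand-in workload: spin for roughly `duration` seconds."""
    deadline = time.monotonic() + duration
    total = 0
    while time.monotonic() < deadline:
        total += 1
    return total


counts = collections.Counter()
stop_event = threading.Event()
sampler = threading.Thread(
    target=sample_thread,
    args=(threading.get_ident(), counts, stop_event),
)
sampler.start()
busy_work()
stop_event.set()
sampler.join()

# Functions with the most samples are where the thread spent its time.
print(counts.most_common(3))
```

Because the sampler only looks at the program every few milliseconds, its cost is governed by the sampling interval rather than by how many lines or calls the program executes.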

Strategies for testing Async code

Neil Chazin
Sunday 1:50 p.m.–2:20 p.m. in Grand Ballroom A

Testing code is important. Testing, primarily unit testing, async code requires heading off the standard roadway of unit testing in Python. This talk will provide a map to help you along the new path towards testing async code. Topics include:

- a brief intro to `asyncio` and challenges in testing with it
- running coroutines (and other awaitables) under test
- mocking coroutines
- testing "main" `asyncio` loops
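
Two of the topics above, running a coroutine under test and mocking a coroutine, can be sketched with the standard library alone. This is an illustrative example rather than material from the talk: `fetch_greeting` and `get_name` are hypothetical names, and `unittest.mock.AsyncMock` requires Python 3.8 or later, which was released after this talk.

```python
import asyncio
from unittest import mock


async def fetch_greeting(client):
    """Code under test: awaits a collaborator and formats its result."""
    name = await client.get_name()
    return f"Hello, {name}!"


def test_fetch_greeting():
    # Replace the awaited collaborator with a mock coroutine function.
    client = mock.Mock()
    client.get_name = mock.AsyncMock(return_value="PyCon")

    # asyncio.run() drives the coroutine to completion on a fresh event loop.
    result = asyncio.run(fetch_greeting(client))

    assert result == "Hello, PyCon!"
    client.get_name.assert_awaited_once_with()


test_fetch_greeting()
```

Calling the coroutine function without an event loop would only create a coroutine object, so the test has to run it explicitly before asserting on the result.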

Supporting Engineers with Mental Health Issues

Jenna Quindica
Friday 4:30 p.m.–5 p.m. in Grand Ballroom B

People live with mental health stigma because we learn that we're supposed to be strong and resilient. It's okay not to be strong or resilient all the time. Discussing mental illness is uncomfortable. In this talk, I will help you overcome that discomfort by examining the most common mental health issues, how you can get help for yourself, and how you can best support your coworkers, friends, and family. No one should have to deal with mental illness alone. Bring your tissues.

Syntax Trees and Python - Automated Code Transformations

Joe Gordon
Saturday 4:30 p.m.–5 p.m. in Room 26A/B/C

Manually updating a million-line codebase is tedious. Thankfully, syntax trees provide a safe and quick way to automatically apply repetitive transformations. Leveraging syntax-tree-based tooling (based on lib2to3) has been a critical component of Pinterest's Python 3 upgrade strategy, and saved us countless hours of work. Learn how syntax trees work, how they are used to transform code, and how you can quickly write your own transformations.
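
As a rough sketch of a syntax-tree transformation, the example below rewrites the Python 2 idiom `mapping.has_key(key)` as `key in mapping`. It uses the standard library `ast` module as a stand-in for lib2to3, which is a deliberate swap: `ast` discards comments and formatting, whereas lib2to3-style tooling preserves them. The class name and the sample source are made up, and `ast.unparse` needs Python 3.9 or later.

```python
import ast


class HasKeyRewriter(ast.NodeTransformer):
    """Rewrite `mapping.has_key(key)` calls as `key in mapping`."""

    def visit_Call(self, node):
        self.generic_visit(node)  # transform nested calls first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "has_key"
            and len(node.args) == 1
            and not node.keywords
        ):
            return ast.Compare(
                left=node.args[0],
                ops=[ast.In()],
                comparators=[func.value],
            )
        return node


source = "if settings.has_key('debug'):\n    enable_debug()\n"
tree = HasKeyRewriter().visit(ast.parse(source))
ast.fix_missing_locations(tree)
new_source = ast.unparse(tree)
print(new_source)
```

Matching on tree nodes instead of text means the rewrite only fires on real method calls, not on the string `has_key` appearing in a comment or a literal.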

Take Back the Web with GraphQL

Robert Myers
Friday 2:35 p.m.–3:05 p.m. in Grand Ballroom C

GraphQL is an exciting technology that can help simplify web logic. Most of the attention has been focused on client-side improvements, such as reducing payload sizes and reducing the total number of requests. This talk will show how GraphQL can structure your backend logic to reduce the client-side dependencies or remove them entirely!

Terrain, Art, Python and LiDAR

Andrew Godwin
Friday 11:30 a.m.–noon in Grand Ballroom B

Seeing the Earth from above is truly breathtaking, but it takes a lot of time, fuel and opportunity - so instead, why not make miniature art of the world's famous terrains? This talk explores using Python to take raw terrain data - from aerial lidar and space-based radar scans - and process it into 3D models and CAD/CAM toolpaths, with the ultimate result of making Python-powered artwork of some of Earth's natural wonders. See how to reduce each National Park to a small, intricately-milled metal carving, how to laser-cut a side-on relief of a whole Hawaiian island, or how to 3D print tiny versions of cities where you can make out each individual building - and the strengths and challenges of using Python to handle 3D and GIS data. We'll also look at some basic 3D modelling code, discuss the wonders of different map projections, and see how personal LiDAR is slowly, but surely, becoming affordable.

The Black Magic of Python Wheels

Elana Hashman
Saturday 11:30 a.m.–noon in Room 26A/B/C

If you’ve ever `pip install`ed a Python package with C extensions on Linux, it was probably a painful experience, having to download and install development headers for libraries you’ve never even heard of. Maybe you’ve given up on pip and have switched to Conda. But it doesn’t have to be this way! The Python Packaging Authority has been working hard to solve this problem with a new distribution format for compiled Python code, called “wheels.” In this talk, we’ll descend into the practice of PEPs 513 and 571: arcane scrolls that can equip Python developers with spells to pre-compile applications and libraries in a way that allows most Linux end users to run them directly. I’ll show you how to hex compiled artifacts and source code into the wheel format, harness application binary interfaces (ABIs) to use external libraries, brave the eldritch horrors of the dynamic linker, and bind these all together in the manylinux environment. Come learn to harness the black magic of Python wheels, and you too can spare your users pain… for a price.

The Perils of Inheritance: Why We Should Prefer Composition

Ariel Ortiz
Saturday 4:30 p.m.–5 p.m. in Grand Ballroom B

Inheritance is among the first concepts we learn when studying object-oriented programming. But inheritance comes with some unhappy strings attached. Inheritance, by its very nature, tends to bind a subclass to its superclass. This means that modifying the behavior of a superclass might alter the behavior of all its subclasses, sometimes in unanticipated ways. Furthermore, it’s commonly accepted that inheritance actually breaks encapsulation. So, if inheritance has these issues, what alternative do we have? More than two decades ago, The Gang of Four (Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides) suggested in their famous _Design Patterns_ book that we should favor object composition over class inheritance. In this talk I will show some code examples in Python where inheritance goes astray and demonstrate how to correct them by using composition. My intention is not to demonize inheritance, but instead present how to use it wisely in order to improve the design of our object-oriented software so that it’s more flexible and easier to maintain.
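
A small sketch of the kind of problem described above, using hypothetical classes rather than the speaker's own examples: a stack that inherits from `list` also inherits every list method, so callers can bypass the last-in, first-out discipline. Wrapping a list instead exposes only the operations the stack is meant to have.

```python
class InheritedStack(list):
    """Stack built by inheritance: every list method leaks through."""

    def push(self, item):
        self.append(item)


class ComposedStack:
    """Stack built by composition: only stack operations are exposed."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


leaky = InheritedStack()
leaky.push(1)
leaky.push(2)
leaky.insert(0, 99)  # allowed by inheritance, but breaks stack semantics

safe = ComposedStack()
safe.push(1)
safe.push(2)
print(list(leaky), hasattr(safe, "insert"), safe.pop())
```

The composed version can later swap its internal list for another container without affecting callers, because nothing outside the class depends on the list interface.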

The Refactoring Balance Beam: When to Make Changes and When to Leave it Alone

Amanda Sopkin
Sunday 1:50 p.m.–2:20 p.m. in Atrium Ballroom AB

Many developers struggle to find the balance between striving to improve existing code and leaving well enough alone by accepting certain shortcomings. As a developer new to a team, it can be difficult to understand existing strategies and patterns that are sometimes flat out bad (and often openly acknowledged as such). Often the result of tight deadlines or unclear specifications, even the best developers write code they later look back upon with shudders. So how do we decide when refactoring is worth it? Come learn strategies for refactoring with minimal impact, methods for working with bad code you can’t change, and strategies for knowing the difference between what is fixable and what is better left alone.

The Zen of Python Teams

Adrienne Lowe
Saturday 10:50 a.m.–11:20 a.m. in Grand Ballroom C

The Zen of Python, accessed by running `import this`, is a list of nineteen aphorisms that have guided the development of the language. It has good advice for how to organize our code, but what does it have to say about how we organize ourselves? Plenty: the Zen of Python is not only a solid set of development principles, but the other easter egg is that it’s packed with wisdom about how to build healthy teams. In this talk I draw upon my time as an engineering manager of Python-focused engineering teams to tell stories of what the Zen of Python has to teach us about communication and conflict, building inclusive teams and transparent processes, and promoting psychological safety. Come ready to reflect on and feel inspired by a new interpretation of these principles, and bring what you learn back to your meetup, study group, open source project, or team.

Things I Wish They Told Me About The Multiprocessing Module in Python 3

Pamela McANulty
Saturday 4:30 p.m.–5 p.m. in Grand Ballroom C

If you haven't tried multiprocessing or you are trying to move beyond `multiprocessing.Pool.map()`, you will likely find that using Python's `multiprocessing` module can get quite intricate and convoluted. This talk focuses on a few techniques (starting, shutting down, data flow, blocking, etc.) that will maximize `multiprocessing`’s efficiency, while also helping you through the complex issues related to coordinating startup and _especially_ shutdown of your multiprocess app.

Thinking Inside the Box: How Python Helped Us Adapt to An Existing Data Ingestion Pipeline

Eddie Schuman
Saturday 10:50 a.m.–11:20 a.m. in Atrium Ballroom AB

We will cover how we used Python to adapt to a large institutional processing setup. We used Python to create the definitions, configuration files, and supplementary metadata for each of the weather radars we worked with. We used a variety of custom tools to interface with existing systems and processes that would have been infeasible to work with otherwise. We took advantage of one of Python’s greatest strengths: its flexibility. We used it to perform the bulk of our data processing with NumPy, created custom utility functions to encourage code reuse, and created custom scripts for interfacing with the institutional data processing framework we worked within.

Thinking like a Panda: Everything you need to know to use pandas the right way.

Hannah Stepanek
Friday 3:15 p.m.–4 p.m. in Grand Ballroom B

Using the pandas Python library requires a shift in thinking that is not always intuitive to those who use it. This talk will take a deep dive into the underlying data structure of pandas to explain why it performs the way it does under certain circumstances. It will explain why a MultiIndex DataFrame takes up less memory than its simple counterpart, why groupby should never be run on a non-MultiIndexed DataFrame, why the example documentation for the pandas apply function is an example of how not to use it, and how not taking the time to normalize data can affect performance.

Thoth - how to recommend the best possible libraries for your application

Fridolín Pokorný
Saturday 11:30 a.m.–noon in Atrium Ballroom AB

Having libraries in your Python project properly locked to a specific version is a well-known best practice. Dependency management tools in the Python ecosystem lock dependencies to the latest version available, but what if the latest version available is not the best fit for your application? The open source project Thoth is an advanced Python dependency resolver which recommends libraries for your project based on observations that are gathered for Python libraries for specific runtime environments. What do these recommendations look like? How are different observations, like performance characteristics of machine learning libraries for a particular hardware, gathered?

Time to take out the rubbish: garbage collector

Pablo Galindo Salgado
Saturday 1:40 p.m.–2:25 p.m. in Grand Ballroom A

One of the reasons why programming in Python is very straightforward and simple is that we do not have to worry about the lifetime of our objects. That is, once it ceases to be necessary, a variable disappears from memory "magically". The fact that this happens automatically can erroneously lead us to believe that we do not need to worry about what happens behind the scenes. Nothing is further from reality: knowing how Python manages memory is fundamental in specific scenarios, and not knowing what is happening can have consequences that are as significant as they are unpleasant. For example, if our programs manage a large amount of data at the same time or launch multiple processes in parallel, this ceases to be a theoretical issue and becomes something that we, logical minds, also care about. Although these concepts tend to be considered advanced and difficult to understand, we will see that this is not the case. This topic is not a purely theoretical matter, nor is it difficult to find practical applications for it. In this talk, we will explain why it is something that should matter to us, and we will talk about how to apply the knowledge we have gained to specific problems.
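
One situation where the behind-the-scenes machinery becomes visible is a reference cycle. The sketch below assumes CPython, where objects are normally freed by reference counting and a separate cyclic collector handles objects that refer to each other. The `Node` class is a made-up example.

```python
import gc
import weakref


class Node:
    """Minimal object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.partner = None


gc.disable()  # pause automatic collection so each step is easy to follow

a, b = Node("a"), Node("b")
a.partner, b.partner = b, a  # a and b now keep each other alive
probe = weakref.ref(a)       # lets us observe the object without owning it

del a, b
alive_before = probe() is not None  # the cycle keeps both objects in memory

gc.collect()                        # the cyclic collector finds the cycle
alive_after = probe() is not None

gc.enable()
print(alive_before, alive_after)
```

Deleting the names is not enough here, because each object still holds a reference to the other; only the cyclic collector can reclaim them.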

to GIL or not to GIL: the Future of Multi-Core (C)Python

Eric Snow
Friday 2:35 p.m.–3:05 p.m. in Grand Ballroom A

Why come to yet another talk about CPython's GIL? [1] Sure, we'll spend a little time on what it is, who it affects (and doesn't), and how to work around it. However, what you want to come hear is what the future holds for the GIL. We'll spend most of the time talking about life *after* the GIL! Come see what recent developments and ongoing work will allow us to either circumvent the GIL or get rid of it, unlocking true multi-core capability in Python code. [1] In case you don't know, the GIL is a global lock that prevents multi-core parallelism in pure Python code. It has a controversial place in the community. Look it up (or come to this talk)!

Type hinting (and mypy)

Bernat Gabor
Saturday 2:35 p.m.–3:05 p.m. in Grand Ballroom B

Type hinting for Python (as a linter tool) came out in September 2015 as part of Python 3.5 (and was championed by Guido himself). Since then, variable annotations (plus, more recently, protocols) improved its capabilities even further. Over the last two years, tools, such as mypy, could build on top of it. Slowly, these annotations have emerged from a proof of concept state (e.g., mypy API planning) to becoming a stable feature. In this presentation, I'll tell my story of using type hints for both adding type hinting and checking type correctness for a library supporting both Python 2 and 3 and reusing this information to insert type data into the generated Sphinx documentation automatically.

Understanding Python’s Debugging Internals

Liran Haimovitch
Friday 3:15 p.m.–4 p.m. in Room 26A/B/C

Knowing your enemies is as important as knowing your friends. Understanding your debugger is a little of both. Have you ever wondered how Python debugging looks on the inside? On our journey to building a Python debugger, we learned a lot about its internals, quirks and more. During this session, we’ll share how debugging actually works in Python. We’ll discuss the differences between CPython and PyPy interpreters, explain the underlying debugging mechanism and show you how to utilize this knowledge at work and up your watercooler talk game.
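
The abstract does not name the mechanism, but in CPython the hook that the standard `pdb` debugger builds on is `sys.settrace`. As a minimal sketch under that assumption, the example below records the events the interpreter reports while a small, made-up function runs.

```python
import sys

events = []


def tracer(frame, event, arg):
    """Record each trace event raised inside add_one."""
    if frame.f_code.co_name == "add_one":
        # Store the line offset relative to the start of the function.
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        events.append((event, offset))
    return tracer  # keep tracing inside this frame


def add_one(x):
    y = x + 1
    return y


previous = sys.gettrace()
sys.settrace(tracer)
try:
    result = add_one(41)
finally:
    sys.settrace(previous)  # always restore whatever was installed before

print(result, events)
```

A debugger built on this hook decides at each reported event whether to keep running or to stop and hand control to the user, for example when the current line matches a breakpoint.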

What is a PLC and how do I talk Python to it?

Jonas Neubert
Friday 12:10 p.m.–12:40 p.m. in Grand Ballroom A

Walk into any factory and you will see a Programmable Logic Controller (PLC). It's the small box that has a memory card and an Ethernet cable on one side, and lots of colorful wires connected to the other end. Inside runs the logic that turns inputs from sensors into outputs to robots, conveyor belts and other machinery. PLCs evolved from relay banks in the 1970s and have ruled the world of industrial automation since then. In the first half of this talk we will take a look at how they work, how to program them, and why a strange language called "ladder logic" is (still) the lingua franca for programming them. In a short on-stage demo I will write some PLC code to control a device on stage. It's 2019 now and just running a PLC isn't quite enough anymore. Everyone is talking about the "Industrial Internet of Things" and they have connected their PLCs to the company network. The second half of the talk will look at how we can connect to PLCs to read data and influence the running program with Python.

What's new in Python 3.7

Dustin Ingram
Saturday 1:55 p.m.–2:25 p.m. in Room 26A/B/C

Python 3.7 is here! In this talk, we’ll explore several new standard library modules, new syntax features and changes, and other performance improvements and implementation changes, and what they mean for us as Python developers. We’ll also briefly chat about what exciting things didn’t quite make it into 3.7, but we should expect to see in 3.8.

Wily Python: Writing simpler and more maintainable Python

Anthony Shaw
Friday 2:35 p.m.–3:05 p.m. in Room 26A/B/C

Everyone starts with the best intentions with their Python projects, "this time it's going to be clean, simple and maintainable". But code evolves over time, requirements change and codebases can get messy and complicated quickly. In this talk, you will learn how to use `wily` to measure and graph how complicated your Python code is and a series of practical techniques to simplify it. `wily` will show you which parts of your projects are becoming or have become hard to maintain and need a refactor. Once you know where the skeletons are, you will learn practical techniques for refactoring "complex" code and some resources to use to take your refactoring to the next level.

Working with Time Zones: Everything You Wish You Didn't Need to Know

Paul Ganssle
Sunday 2:30 p.m.–3 p.m. in Room 26A/B/C

Time zones are complicated, but they are a fact of engineering life. Time zones have [skipped entire days](http://www.bbc.com/news/world-asia-16351377) and repeated others. There are time zones that switch to [DST twice per year](https://www.timeanddate.com/time/zone/morocco/casablanca). But not necessarily every year. In Python it's even possible to create datetimes with non-transitive equality (`a == b`, `b == c`, `a != c`). In this talk you'll learn about Python's time zone model and other concepts critical to avoiding datetime troubles. Using `dateutil` and `pytz` as examples, this talk covers how to deal with ambiguous and imaginary times, datetime arithmetic around a Daylight Saving Time transition, and datetime's new `fold` attribute, introduced in Python 3.6 ([PEP 495](https://www.python.org/dev/peps/pep-0495/)).
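
To show what `fold` is for, the sketch below defines a deliberately simplified `tzinfo` subclass. `ToyEastern` is a hypothetical class that models a single transition, in which clocks fall back from 02:00 to 01:00 on 2018-11-04, so every wall-clock time from 01:00 to 01:59 happens twice. It is not how `dateutil` or `pytz` are implemented.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone with one fall-back transition on 2018-11-04."""

    _AMBIGUOUS_START = datetime(2018, 11, 4, 1)
    _AMBIGUOUS_END = datetime(2018, 11, 4, 2)

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        if naive < self._AMBIGUOUS_START:
            return timedelta(hours=-4)  # daylight time
        if naive >= self._AMBIGUOUS_END:
            return timedelta(hours=-5)  # standard time
        # 01:00 to 01:59 occurs twice; fold says which occurrence is meant.
        return timedelta(hours=-5 if dt.fold else -4)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


first = datetime(2018, 11, 4, 1, 30, tzinfo=ToyEastern())  # fold=0
second = first.replace(fold=1)

# Same wall-clock time, but one hour apart once converted to UTC.
print(first.astimezone(timezone.utc))
print(second.astimezone(timezone.utc))
```

Without `fold`, the two readings of 01:30 could not be told apart, which is the ambiguity PEP 495 set out to resolve.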