Talks

5 ways to deploy your Python web app in 2017

Andrew T. Baker
Friday 3:15 p.m.–3:45 p.m. in Portland Ballroom 252–253

You’ve built a fine Python web application and now you’re ready to share it with the world. But what’s the best way to deploy your app in 2017? This talk will demonstrate popular techniques for deploying Python web applications. We’ll start with a simple Flask application and expose it to the world five times over as we learn to use different tools and services available to the modern Python developer. Specific topics covered include:

* Exposing your local dev environment with [ngrok](https://ngrok.com/)
* Using a Platform-as-a-Service (PaaS) like [Heroku](https://www.heroku.com/)
* Going “serverless” with [AWS Lambda](https://aws.amazon.com/lambda/)
* Configuring your own VM with [Google Compute Engine](https://cloud.google.com/compute/)
* Thinking inside the box using [Docker](https://www.docker.com/)

We’ll also briefly touch on the pros and cons of each technique to help you figure out which one is right for your app. At the end of this talk you will have a basic understanding of how each of these techniques works and you’ll be ready to try them out yourself.

A gentle introduction to deep learning with TensorFlow

Michelle Fullwood
Friday 4:15 p.m.–5 p.m. in Oregon Ballroom 203–204

Deep learning's explosion of spectacular results over the past few years may make it appear esoteric and daunting, but in reality, if you are familiar with traditional machine learning, you're more than ready to start exploring deep learning. This talk aims to gently bridge the divide by demonstrating how deep learning operates on core machine learning concepts and getting attendees started coding deep neural networks using Google's TensorFlow library.

aiosmtpd - A better asyncio based SMTP server

Barry Warsaw
Sunday 2:30 p.m.–3 p.m. in Oregon Ballroom 203–204

smtpd.py has been in the standard library for many years. It's been a common tool for deploying SMTP and LMTP servers that handle email-based communication in Python, providing both basic protocol implementations and a fundamental module for higher level tools, such as lazr.smtptest for testing email clients. Based on asyncore and asynchat, smtpd.py is showing its age, and its API is unwieldy. Fortunately, there's a new alternative available. aiosmtpd is a modern reinvention based on asyncio, with all the improvements that come along with such a new implementation. It provides servers for both the SMTP and LMTP protocols, as well as a higher level "controller" API for testing SMTP and LMTP clients. It exposes a much better API for customization, allowing the user to associate a simple "handler" to process incoming messages without having to worry about the details of the protocols, and it provides some useful hooks for subclassing. This talk will describe the purpose and history of smtpd.py and aiosmtpd, show how users can extend the servers and implement specialized handlers, and show how applications can use the testing API for ensuring that their email sending applications do the right things. Examples will be taken from GNU Mailman 3, which uses aiosmtpd extensively.

Algorithmic Music Generation

Padmaja V Bhagwat
Friday 5:10 p.m.–5:40 p.m. in Portland Ballroom 251 & 258

Music is mainly an artistic act of inspired creation, unlike traditional math problems: it cannot be solved by a simple set of formulae. The most interesting and challenging part is producing unique music without infringing copyright. The generated music has to sound good, and what sounds good is very subjective and varies from culture to culture. Artificial neural networks and deep learning have a wide range of applications, such as image processing, natural language processing, and time series prediction. But what about their use in art? Could we use deep learning to create music? This talk is about how deep learning models were used to produce music, catering particularly to Bollywood. It will show how an exquisite piece of art, i.e. music, can be generated using a deep learning model that automates feature extraction. To automate music generation, the model must be able to remember learned features over a long period of time; this is achieved by a special type of Recurrent Neural Network (RNN) called an LSTM (Long Short-Term Memory) network. Implementing such a complex model is made much easier by Python libraries such as Keras with Theano as the backend, which allow for easy and fast prototyping. Packages like numpy and scipy are used for mathematical computation on input vectors and for reading and writing WAV files, respectively. The neural network is trained on a large number of music samples. After an adequate number of iterations and training time, the model generates music that is unique and original. This talk will discuss the steps involved in preprocessing the data, training the model, testing the model, and generating music from the trained model. It will also cover some of the challenges and tradeoffs made for algorithmic music generation.

An Introduction to Reinforcement Learning

Jessica Forde
Saturday 1:40 p.m.–2:25 p.m. in Portland Ballroom 252–253

Reinforcement learning (RL) is a subfield of machine learning focused on building agents: software that can robustly achieve a desired objective under varying states of the world. This introduction will provide you with an overview of RL and tools to build your own agents. In this talk, we will provide an overview of terminology in reinforcement learning and a Jupyter Notebook outlining basic algorithms to learn 'policies', strategies for an agent, and visualize them with numpy, pandas, and seaborn. Newer developments in reinforcement learning apply deep learning to improve performance. We will further discuss deep reinforcement learning and how to use deep learning libraries, such as TensorFlow or Theano, with the latest RL libraries: [OpenAI Gym](https://gym.openai.com), [OpenAI Universe](https://universe.openai.com), and [DeepMind Lab](https://github.com/deepmind/lab).

async/await and asyncio in Python 3.6 and beyond

Yury Selivanov
Sunday 1:10 p.m.–1:40 p.m. in Oregon Ballroom 203–204

This talk gives an overview of async/await, asynchronous generators, and asynchronous comprehensions in Python 3.6, and of the asyncio module. We'll discuss when and how asyncio should be used in modern applications and services, what uvloop is, and which asyncio frameworks and libraries one should use. I'll share our ideas about where asyncio is headed and what to expect in Python 3.7.
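
As a rough illustration of the asynchronous generators and comprehensions named in this abstract, here is a minimal sketch using only the standard library. It is not taken from the talk: the names `ticker` and `collect_squares` are hypothetical, and `asyncio.run` assumes Python 3.7 or later (on 3.6 you would drive the coroutine with an event loop's `run_until_complete` instead).

```python
import asyncio


async def ticker(limit):
    """Asynchronous generator: yields 0..limit-1, yielding to the loop each step."""
    for i in range(limit):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield i


async def collect_squares(limit):
    """Asynchronous comprehension that consumes the async generator."""
    return [i * i async for i in ticker(limit)]


if __name__ == "__main__":
    print(asyncio.run(collect_squares(4)))  # prints [0, 1, 4, 9]
```

The `async for` inside the comprehension suspends at each step, so other tasks on the same event loop can run between values.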

Asynchronous Python for the Complete Beginner

Miguel Grinberg
Sunday 1:50 p.m.–2:20 p.m. in Oregon Ballroom 203–204

With the introduction of the asyncio package in Python 3.4, you can hear lots of people talking about asynchronous programming, most in a favorable way, some not so much. In this talk, I will tell you what this async fever is about and what it can do for you that regular Python can't, not only with asyncio, but also with other frameworks that existed long before it.

Automate AWS With Python

Moshe Zadka
Saturday 10:50 a.m.–11:20 a.m. in Portland Ballroom 252–253

AWS is one of the best-known cloud vendors. Using the Web UI is fine when starting out, but automating cloud operations is important. Boto3 provides a great Pythonic API to AWS, but using it correctly can be subtle. The talk will cover how to automate AMI builds, build CloudFormation templates, and automate S3 bucket management.

Awesome Command Line Tools

Amjith Ramanujam
Saturday 11:30 a.m.–noon in Portland Ballroom 252–253

Designing a good command line tool is challenging. Command line tools look archaic compared to modern graphical interfaces, discoverability is a big issue, and proactive help is hard to implement. This talk will cover a set of specific techniques to help alleviate those challenges. How did [pgcli](http://pgcli.com) and [mycli](http://mycli.net) overcome these problems? We will cover specific examples where command line apps shine. The examples will be drawn from pgcli, mycli and [bpython](http://bpython-interpreter.org/). This talk will suggest libraries and show how they can help you implement a wonderful command line interface. The libraries covered in this talk include [python-prompt-toolkit](https://github.com/jonathanslenders/python-prompt-toolkit), [pygments](http://pygments.org/), [click](http://click.pocoo.org/), and [fuzzyfinder](https://github.com/amjith/fuzzyfinder). The goal of the talk is to distil the ideas that exist in successful command line applications into guidelines for building powerful command line applications.

Bayesian Statistical Analysis with Python

Eric J. Ma
Sunday 2:30 p.m.–3 p.m. in Portland Ballroom 251 & 258

You've got some data, and now you want to analyze it with Python. You're on your way to greatness! Now the problem comes: do I do the t-test? Chi-squared test? How do I decide? In this talk, inspired by many Pythonista Bayesians (@jakevdp, @allendowney, @twiecki, @fonnesbeck) before, I will show you how you can take common statistical decision problems, formulate them as a Bayesian analysis problem, and use PyMC3 as your workhorse tool for gaining insights. This talk will be math-light and code-heavy, and if you download the slides, you'll have a simple template for more complex Bayesian analysis down the road!

Big picture software testing: unit testing, Lean Startup, and everything in-between

Itamar Turner-Trauring
Friday 11:30 a.m.–noon in Portland Ballroom 252–253

There are many ways you can test your software: unit testing, manual testing, end-to-end testing, and so forth. Take a step back and you'll discover even more forms of testing, many of them very different in their goals: A/B testing, say, where you see which of two versions of your website results in more signups or ad clicks. How do these forms of testing differ, and how do they relate to each other? How do you choose which kind of testing to pursue, given limited time and resources? How do you deal with strongly held yet opposite views arguing either that a particular kind of testing is essential or that it's a waste of time? This talk will provide you with a model, a way to organize all forms of testing and understand what exactly they provide, and why. Once you understand the model you will be able to choose the right form of testing for *your* situation and goals.

Building A Gigaword Corpus: Lessons on Data Ingestion, Management, and Processing for NLP

Rebecca Bilbro
Friday 4:30 p.m.–5 p.m. in Portland Ballroom 251 & 258

As the applications we build are increasingly driven by text, doing data ingestion, management, loading, and preprocessing in a robust, organized, parallel, and memory-safe way can get tricky. This talk walks through the highs (a custom billion-word corpus!), the lows (segfaults, 400 errors, pesky mp3s), and the new Python libraries we built to ingest and preprocess text for machine learning. While applications like Siri, Cortana, and Alexa may still seem like novelties, language-aware applications are rapidly becoming the new norm. Under the hood, these applications take in text data as input, parse it into composite parts, compute upon those composites, and then recombine them to deliver a meaningful and tailored end result. The best applications use language models trained on _domain-specific corpora_ (collections of related documents containing natural language) that reduce ambiguity and prediction space to make results more intelligible. Here's the catch: these corpora are huge, generally consisting of at least hundreds of gigabytes of data inside of thousands of documents, and often more! In this talk, we'll see how working with text data is substantially different from working with numeric data, and show that ingesting a raw text corpus in a form that will support the construction of a data product is no trivial task. For instance, when dealing with a text corpus, you have to consider not only how the data comes in (e.g. respecting rate limits, terms of use, etc.), but also where to store the data and how to keep it organized. Because the data comes from the web, it's often unpredictable, containing not only text but audio files, ads, videos, and other kinds of web detritus. Since the datasets are large, you need to anticipate potential performance problems and ensure memory safety through streaming data loading and multiprocessing. 
Finally, in anticipation of the machine learning components, you have to establish a standardized method of transforming your raw ingested text into a corpus that's ready for computation and modeling. In this talk, we'll explore many of the challenges we experienced along the way and introduce two Python packages that make this work a bit easier: [Baleen](https://pypi.python.org/pypi/baleen/0.3.3) and [Minke](https://github.com/bbengfort/minke). Baleen is a package for ingesting formal natural language data from the discourse of professional and amateur writers, like bloggers and news outlets, in a categorized fashion. Minke extends Baleen with a library that performs parallel data loading, preprocessing, normalization, and keyphrase extraction to support machine learning on a large-scale custom corpus.

Building Stream Processing Applications

Amit Ramesh, Qui Nguyen
Sunday 2:30 p.m.–3 p.m. in Oregon Ballroom 201–202

Do you have a stream of data that you would like to process in real time? There are many components with Python APIs that you can put together to build a stream processing application. We will go through some common design patterns, tradeoffs and available components / frameworks for designing such systems. We will solve an example problem during the presentation to make these points concrete. Much of what will be presented is based on experience gained from building production pipelines for the real-time processing of ad streams at Yelp. This talk will cover topics such as consistency, availability, idempotency, scalability, etc.

Community powered packaging: conda-forge

Filipe Pires Alvarenga Fernandes
Friday 1:55 p.m.–2:25 p.m. in Oregon Ballroom 201–202

The Python scientific community always wanted a package manager that is cross-platform, does not require `sudo`, and lets Python be awesome! The conda package manager solved that problem, but created new ones... This talk is a tour disguised as a beginner tutorial to `conda-forge` packaging. We will try to discuss some myths and misconceptions about `conda` and `conda-forge`, as well as give a quick comparison with `pip` and `wheels`.

Constructive Code Review

Erik Rose
Friday 3:15 p.m.–4 p.m. in Portland Ballroom 254–255

“Your code is bad and you are bad. Have a bad day.” Too many code reviews feel like this, and it saps the enthusiasm that drives open source. Instead, let’s explore how to give reviews that are truthful but encouraging, boosting the skill level of contributors and the quality of the project. We’ll look at “tact hacks” that nudge communication in a friendly direction, antipatterns to avoid, the pesky human emotions that can tempt us into reviewing poorly, and techniques for leveling up newcomers without losing all your coding time.

Cython as a Game Changer for Efficiency

Alex Orlov
Saturday 5:10 p.m.–5:40 p.m. in Portland Ballroom 251 & 258

Are you running a Web application? Do you suffer from CPU bottlenecks that slow down your growth? There's a tool that can easily fix all that, and then some. C++ knowledge not required. Come learn how Instagram, the world's largest Django deployment with more than 600M active users, saved ~30% of global CPU by rewriting a handful of modules on the critical path in Cython. Learn to apply those techniques to your own projects with little effort and stop worrying about switching to other programming languages or rewriting stable components in C++.

Dask: A Pythonic Distributed Data Science Framework

Matthew Rocklin
Friday 4:15 p.m.–5 p.m. in Portland Ballroom 252–253

Dask is a general-purpose parallel computing system capable of Celery-like task scheduling, Spark-like big data computing, and NumPy/Pandas/Scikit-learn level complex algorithms, written in pure Python. Dask has been adopted by the PyData community as a Big Data solution. This talk focuses on the distributed task scheduler that powers Dask when running on a cluster. We'll focus on how we built a Big Data computing system using the Python networking stack (Tornado/AsyncIO) in service of its data science stack (NumPy/Pandas/Scikit-learn). Additionally we'll talk about the challenges of effective task scheduling in a data science context (data locality, resilience, load balancing) and how we manage this dynamically with aggressive measurement and dynamic scheduling heuristics.

Debugging in Python 3.6: Better, Faster, Stronger

Elizaveta Shashkova
Saturday 10:50 a.m.–11:20 a.m. in Oregon Ballroom 201–202

Python 3.6 was released in December 2016 and has a lot of cool new features. Some of them are easy to adopt: a developer can read about f-strings, for example, and start using them right away. But some features are less obvious, and the new frame evaluation API is one of them. The frame evaluation API was introduced to CPython in PEP 523, and it allows you to specify a per-interpreter function pointer to handle the evaluation of frames. It might not be evident how to use this new feature in everyday life, but it’s quite easy to understand how to build a fast debugger based on it. In this talk we will explain how the standard way of debugging in Python works and how the new frame evaluation API can be used to create a fast debugger. We will also consider why such fast debugging was not possible in previous versions of Python. If you haven’t yet made a final decision to move to Python 3.6, this talk will give you some new reasons to do so.
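
For context on the "standard way of debugging" mentioned above: the standard library debugger is built on `sys.settrace`, which has the interpreter call a Python function for events such as each executed line. The sketch below is only an illustration of that tracing hook, not the speaker's debugger; `record_lines` and `add` are hypothetical names introduced here.

```python
import sys


def record_lines(target, *args):
    """Run target(*args) and record which of its lines were executed.

    Line numbers are reported relative to the `def` line of `target`.
    """
    executed = []
    code = target.__code__

    def tracer(frame, event, arg):
        # Invoked by the interpreter for events such as "call" and "line".
        if frame.f_code is code and event == "line":
            executed.append(frame.f_lineno - code.co_firstlineno)
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = target(*args)
    finally:
        sys.settrace(previous)  # always restore the previous trace function
    return result, executed


def add(a, b):
    total = a + b
    return total
```

Calling `record_lines(add, 2, 3)` returns the function's result together with the relative line numbers that ran. Because the callback fires on every traced line, this style of debugging adds overhead to all code, whether or not a breakpoint is nearby.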

Decorators, unwrapped: How do they work?

Katie Silverio
Saturday 2:35 p.m.–3:05 p.m. in Oregon Ballroom 201–202

Decorators are a syntactically pleasing way of modifying the behavior of functions in Python. However, they can be highly opaque to Python beginners. It took me a while to learn how to write one, and even after I was confident writing my own decorators, they still felt magical. The goal of this talk is to demystify decorators by methodically stepping through how and why they work. Along the way we'll touch on closures, scopes, and how Python is compiled.
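
To make the closure idea concrete, here is a minimal sketch of a decorator written with the standard library only. The names `count_calls` and `greet` are hypothetical and are not taken from the talk.

```python
import functools


def count_calls(func):
    """Decorator that counts how many times `func` is called."""

    @functools.wraps(func)  # copy func's name and docstring onto the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)  # `func` is captured by the closure

    wrapper.calls = 0
    return wrapper


@count_calls  # equivalent to: greet = count_calls(greet)
def greet(name):
    return "Hello, " + name
```

After one call to `greet("PyCon")`, `greet.calls` is `1`. The `@` line is only syntax for rebinding the name `greet` to whatever `count_calls` returns.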

Designing secure APIs with state machines

Ashwini Oruganti, Mark Williams
Saturday 2:35 p.m.–3:05 p.m. in Portland Ballroom 251 & 258

Did you ever need to create an application whose behavior varies with its state, while still presenting a consistent interface to its callers? A good, layered design using state machines can help avoid the tedious 'if' checks for flags, and ensure that if your code runs at all, it will run with all the required values initialized. I will demonstrate this with examples, and talk about some available tools and libraries to build state machines in Python. I will also discuss how to effectively use the process of threat modeling to build secure web applications. Threat modeling is a computer security technique that helps you better understand the systems you create, identify attacks, and build defenses. I will talk about things that we, as software developers, can do to assess the security of our applications in the real world through this process.
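
A minimal sketch of the idea, assuming a hypothetical login session with two states: every allowed transition is listed in a single table, so there are no scattered `if` checks on flags, and an action that is invalid in the current state is rejected outright. The class, state, and action names are illustrative only and do not come from the talk.

```python
class InvalidTransition(Exception):
    """Raised when an action is not allowed in the current state."""


class Session:
    # (current state, action) -> next state; anything not listed is rejected.
    _TRANSITIONS = {
        ("anonymous", "login"): "authenticated",
        ("authenticated", "view_account"): "authenticated",
        ("authenticated", "logout"): "anonymous",
    }

    def __init__(self):
        self.state = "anonymous"

    def handle(self, action):
        """Apply `action`, returning the new state or raising InvalidTransition."""
        key = (self.state, action)
        if key not in self._TRANSITIONS:
            raise InvalidTransition(
                "cannot {!r} while {!r}".format(action, self.state)
            )
        self.state = self._TRANSITIONS[key]
        return self.state
```

With this layout, `Session().handle("view_account")` raises `InvalidTransition` because the session has not logged in, while the same call succeeds after `handle("login")`.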

Dial M For Mentor

Mariatta Wijaya
Friday 1:55 p.m.–2:25 p.m. in Portland Ballroom 254–255

One of the nicest things about the Python community is the availability of mentors willing to help you. Various mentors have helped me navigate the open source community and helped advance my skills. I realized finding a mentor is not as easy as it seems, and it takes a lot of courage to reach out in the first place. And then, there is impostor syndrome, where one may feel like they don’t deserve the help. In this talk, I will provide advice about working with a mentor. Asking for help is not a failure.

Dr. Microservices, Or How I Learned to Stop Worrying and Love the API

Ryan Anguiano
Friday 2:35 p.m.–3:05 p.m. in Portland Ballroom 252–253

Assuming that you already know how to build a monolithic app, you must be wondering how you can use all this "microservice" stuff that you keep hearing about. Well, a good word of advice is that you probably don't need it. If designed properly, a monolithic app should be able to scale and fit the needs of most businesses. Even so, you should keep your development as simple as possible until you have proven and solidified your business concepts. But if you do need to grow to Internet scale, then you have a long road ahead of you. Moving from a monolithic application to microservices is a natural evolution that is often born of necessity. There are several competing schools of thought that are still being battle-tested in these early days of microservice architecture. Among the competing paradigms, most of the requirements are agreed upon; the approaches differ mainly in the tools used to fulfill them. This talk will cover setting up the required infrastructure and demonstrate how to migrate a sample monolithic Django application to a microservices platform. The demo application will use the following technologies: Django, Flask, Fabric, Terraform, Ansible, CentOS, Docker, Mesos, Consul, Nginx, Pgbouncer, and Kafka.

Ending Py2/Py3 compatibility in a user friendly manner

Matthias Bussonnier, Min Ragan-Kelley, M Pacer, Thomas Kluyver
Saturday 3:15 p.m.–4 p.m. in Portland Ballroom 251 & 258

> "Four shalt thou not count, neither count thou two, excepting that thou then proceed to three."
>> Monty Python and the Holy Grail; Scene 33

Python 3 has been around for more than eight years, and much of the Python ecosystem is now available on both Python 2 and Python 3, often using a single code base. Nonetheless, this compatibility comes at a development cost, and some library authors are considering ending support for Python 2. These once-Python-2-compatible libraries are at risk of being upgraded on incompatible systems and causing user (and developer) frustration. While it may seem simple to cease support for Python 2, the challenge is not in ending support, but in doing so in a way that does not wreak havoc for users who stay on Python 2. And that is not only a communications problem but a technical one: until recently, it was impossible to tag a release as Python 3 only; today it is possible. Like any maintainers of a widely used library, we want to ensure that users who continue to use Python 2 continue to have functioning libraries, even after development proceeds in a way that does not support Python 2. One approach is to ensure easy installation of older versions and, if possible, avoid incompatible versions altogether. Users should not need to manually pin maximal version dependencies across their development environments and projects if all they want is to use the latest versions of libraries that are compatible with their system. Even if we did expect that of users, consider what would happen when a package they rely on converts to be only Python 3 compatible. If they were not tracking the complete dependency tree, they might discover, on upgrade, that their projects no longer work. To avert this they would need to pin those packages at the last version compatible with Python 2. Users who want to use older Python versions should not have to go through so much anguish to do so.
In order to solve this problem, and thereby make both users' and maintainers' lives easier, we ventured into the rabbit-hole called Packaging. Though we set off with a singular quest, our tale roves through many lands. We'll narrate the story of our amending PEPs, our efforts in building the ramparts of the pypa/Warehouse Castle, battles with the dragons of Pip, and errands in the "land of no unit tests" otherwise known as PyPI legacy. By the end of the above tale, audience members will know that the road to Python 3 only libraries once had hazards that are now easily avoidable, so long as users upgrade their package management tools.

Executing python functions in the linux kernel by transpiling to bpf

Alex Gartrell
Saturday 3:15 p.m.–4 p.m. in Portland Ballroom 254–255

`ebpf` is a Linux kernel bytecode which can be used for functionality ranging from tracing system calls with kprobe to routing packets with tc. This talk is about a pure-Python front-end for ebpf that allows users to write simple Python functions to be executed in the kernel. I'll first explain how this was made to work and then I'll show off some of the features/capabilities of this approach with working examples.

Experiment Assignment on the Web

Jessica Stringham
Friday 10:50 a.m.–11:20 a.m. in Oregon Ballroom 203–204

A popular way of improving websites is to run experiments on them. We split users into groups, show two or more variations of the site, measure how well each one does, and then show the best version to everyone. In this talk, I'll walk through a toy Python program that does the first step: splitting users into groups. A few interesting problems arise: grouping users, whitelists, and scaling. I'll share different ways to address them. I'll also give examples of things that can go terribly wrong when designing experiment assignment code.
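
One common way to split users into groups is deterministic hash-based bucketing; the sketch below shows that technique under stated assumptions and is not necessarily the approach the talk takes. The function name `assign_group`, the group labels, and the whitelist format (a dict mapping user IDs to forced groups) are all hypothetical.

```python
import hashlib


def assign_group(user_id, experiment, groups=("control", "variant"), whitelist=None):
    """Deterministically assign `user_id` to one of `groups` for `experiment`.

    `whitelist` (hypothetical format) maps user IDs to a forced group,
    e.g. so that staff can always see a particular variation.
    """
    if whitelist and user_id in whitelist:
        return whitelist[user_id]
    # Hash the experiment name together with the user ID so that the same
    # user can land in different groups across different experiments.
    key = "{}:{}".format(experiment, user_id).encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return groups[int(digest, 16) % len(groups)]
```

Because the result depends only on the inputs, a returning user always sees the same variation, and no assignment table has to be stored or shared between servers.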

Exploring Network Programmability with Python and YANG

Lisa N Roach
Saturday 12:10 p.m.–12:40 p.m. in Portland Ballroom 252–253

Recently, networking vendors and Silicon Valley giants have been putting forth a concerted effort to build standardized models for networking devices. These models allow for building reusable and versatile scripts with predictable, standardized data. Without such models, the wide variety of inputs and outputs required by different devices and vendors made scripting a tedious and challenging endeavor. The modeling language in use is called YANG, and a variety of standards have emerged. A vendor agnostic standard called OpenConfig has lately become stable enough to begin programming devices with it. Using Python, YANG is surprisingly easy to work with, and extremely powerful applications can be written with basic knowledge of JSON or XML and RPCs. The talk will start with use-cases for programming networking devices, and will detail a specific, trivial, use case that will be used in the talk. Next, we will discuss the ‘legacy’ way of programming devices (SSH and screenscraping), and highlight the challenges, such as complex regular expressions, slow responses, and lack of reusability between devices. From there we will dive into YANG, focusing on OpenConfig models. A YANG model is essentially a template, and JSON or XML can be mapped to the YANG template. This makes it perfect for Pythonic manipulation. In the use case there will be a GET RPC returning a YANG representation of the box’s state in JSON, which we will search for the relevant health indicator by drilling down in the JSON dictionary. A simple change to the dictionary will remediate the problem, and a PATCH RPC merges the new configuration onto the box. Since open, standard models are in use, this script could be run on many devices across a network to achieve the same effect with no changes needed. We will finish up with the pros and cons of YANG before opening the talk for Q&A.

Factory Automation with Python - Stories about Robots, Serial Ports, and Barcode Readers

Jonas Neubert
Friday 2:35 p.m.–3:05 p.m. in Oregon Ballroom 203–204

In industrial automation, _tried and tested_ always beats _latest and greatest_: The machines that make smartphones have a serial port and are configured with `.csv` files. But when your factory automates complex non-linear workflows and is jam-packed with sensors and robots, you are quickly faced with software engineering challenges that call for modern tools. Python turns out to be surprisingly versatile in this setting, whether for prototyping a single conveyor belt or taming a building full of robots. This talk explains how to use Python for interfacing with two common industrial automation devices: a barcode scanner and a Programmable Logic Controller (PLC). After a simple demo, you’ll hear about lessons learned from using Python packages like Celery and pandas, which weren’t written with robots in mind, with robots.

Fuzzy Search Algorithms: How and When to Use Them

Jiaqi Liu
Saturday 5:10 p.m.–5:40 p.m. in Oregon Ballroom 203–204

Fuzzy searching, or approximate string matching, is powerful because text data is often messy. For example, shorthand and abbreviated text are common in various data sets. In addition, outputs from OCR or voice-to-text conversions tend to be messy or imperfect. Thus, we want to be able to make the most of our data by extrapolating as much information as possible. In this talk, we will explore the various approaches used in fuzzy string matching and demonstrate how they can be used as a feature in a model or a component in your Python code. We will dive deep into the approaches of different algorithms such as Soundex, trigram/n-gram search, and Levenshtein distance, and what their best use cases are. We will also discuss situations where it’s important to take into account the meaning or intent of a word and demonstrate approaches for measuring semantic similarity using nltk and word2vec. Furthermore, we will demonstrate via live coding how to implement some of these fuzzy search algorithms using Python and/or built-in fuzzy search functions within PostgreSQL.
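
As one example of the algorithms listed above, Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The following is a minimal sketch of the standard dynamic-programming formulation, keeping only two rows of the table; it is an illustration, not code from the talk.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning `a` into `b`."""
    # previous[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]
```

For instance, `levenshtein("kitten", "sitting")` is `3`: two substitutions and one insertion.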

Gothic Colors: Using Python to understand color in nineteenth-century literature

Eleanor Stribling, Caroline Winter
Sunday 2:30 p.m.–3 p.m. in Portland Ballroom 254–255

Do you love literature and programming? Have you ever been curious about what the heck “Digital Humanities” are? Join us for a quick survey of what’s going on in this growing field and learn about a specific project, “Gothic Colors” where we set out to enumerate and analyze color references and mood in 19th century Gothic novels, using Python and a couple of popular libraries.

Grok the GIL: Write Fast And Thread-Safe Python

A. Jesse Jiryu Davis
Friday 12:10 p.m.–12:55 p.m. in Oregon Ballroom 201–202

I wrote Python for years while holding mistaken notions about the Global Interpreter Lock, and I've met others in the same boat. The GIL's effect is simply this: only one thread can execute Python code at a time, while N other threads sleep or await network I/O. Let's read CPython interpreter source and try some examples to grok the GIL, and learn to write fast and thread-safe Python.
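
One consequence worth illustrating: even though only one thread executes Python code at a time, a statement like `value += 1` is a read-modify-write that spans several bytecode instructions, and the interpreter may switch threads in between. The sketch below (hypothetical names `Counter` and `run_workers`, standard library only) guards the update with a lock; it is an illustration of thread safety under the GIL, not code from the talk.

```python
import threading


class Counter:
    """A counter whose increments are safe to call from multiple threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could both read the same old value
        # and one of the increments would be lost.
        with self._lock:
            self.value += 1


def run_workers(n_threads=4, n_increments=10_000):
    """Increment one shared counter from several threads; return the total."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With the lock in place, `run_workers()` always returns `n_threads * n_increments`.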

Hacking Cars with Python

Eric Evenchick
Sunday 2:30 p.m.–3 p.m. in Portland Ballroom 252–253

Modern cars are networks of computers, and a high end vehicle could have nearly 100 different computers inside. These devices control everything from the engine to the airbags. By understanding how these systems work, we can interface with vehicles to read data, perform diagnostics, and even modify operation. In this talk, we'll discuss pyvit, the Python Vehicle Interface Toolkit. This library, combined with some open source hardware, allows developers to talk to automotive controllers from Python. We will begin with an introduction to automotive networks, to provide a basis for understanding the tools. Next, we will look at the tools and show the basics of using them. Finally, we'll discuss real world applications of these tools, and how they're being used in the automotive world today.

Hacking Classic Nintendo Games with Python

Sam Agnew
Sunday 1:10 p.m.–1:40 p.m. in Portland Ballroom 252–253

Do you feel like using your superpowers as a developer to bring the games of your childhood into the future with the power of the Internet? In this live-coded journey, we'll build an SMS-powered "Game Genie" allowing the audience to send text messages to manipulate the Nintendo games being played in real time. This will involve working with Flask, the Twilio API, and the FCEUX NES emulator, and bridging them with quick Lua scripts.

How documentation works, and how to make it work for your project

Daniele Procida
Saturday 11:30 a.m.–noon in Portland Ballroom 254–255

Nearly everyone (especially in the Python community) agrees that good documentation is important to the success of software projects, and yet very few projects actually have good documentation. Often, it's _not for want of effort_ - the project's developers have worked hard on it - _nor for lack of documentation_ - the authors have produced a lot of it. _It simply turns out to be not very good_ - not helpful enough for the users who should be able to rely on it, and a depressing chore for the authors who have to maintain it. The good news is that both these problems can be solved by understanding _how documentation works_, and what its different functions are. Structuring documentation according to those distinct functions helps ensure that each of them is adequately served. It also makes it far easier to write and maintain. Using real-life examples I'll draw out the key functions of documentation, and how they map onto different ways of writing it. Putting this into practice is simple when armed with some basic guidelines. The benefits are huge, and available with a minimum of effort. I won't be discussing documentation tools or software or other topics that have been covered amply elsewhere, but some neglected aspects of software documentation that **will make your software projects more successful**.

How to make a good library API

Flávio Juvenal
Saturday 2:35 p.m.–3:05 p.m. in Oregon Ballroom 203–204

It's not easy to write libraries with great APIs. We're aware of that. However, it's not always clear how we can follow abstract ideals like elegance, simplicity, and extensibility to improve our APIs. That's why in this talk we'll discuss good and bad APIs with real-world examples. For each thing learned, we'll come up with a checklist to help us with practical advice for writing good APIs.

How to write a Python transpiler

Russell Keith-Magee
Saturday 1:55 p.m.–2:25 p.m. in Portland Ballroom 254–255

We all know Python is a powerful and expressive programming language. What you may not know is how much of the internals of Python itself is exposed for you to use and manipulate. In this talk, you'll be introduced to the tools and libraries Python provides to manipulate the compilation and execution of Python code. You will also see how you can use those tools to target execution environments other than the CPython virtual machine.
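
To make "manipulating the compilation of Python code" concrete, here is a minimal sketch using the standard library's `ast` and `dis` modules. It is not the talk's own tooling; the `SwapMultForAdd` transformer is a hypothetical toy that rewrites multiplication into addition before the code is compiled.

```python
import ast
import dis

source = "result = 6 * 7"

# Step 1: parse source text into an abstract syntax tree.
tree = ast.parse(source)


class SwapMultForAdd(ast.NodeTransformer):
    """Hypothetical toy transformer: rewrite every `a * b` into `a + b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if isinstance(node.op, ast.Mult):
            node.op = ast.Add()
        return node


# Step 2: transform the tree, then fill in any missing line/column info.
new_tree = ast.fix_missing_locations(SwapMultForAdd().visit(tree))

# Step 3: compile the modified tree to a code object and run it.
code = compile(new_tree, filename="<ast>", mode="exec")
namespace = {}
exec(code, namespace)

# The bytecode CPython would execute can be inspected with `dis`.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
```

A transpiler for another runtime would replace step 3 with its own code generator that walks the same tree (or the bytecode) and emits output for the target environment.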

Human-Machine Collaboration for Improved Analytical Processes

Tony Ojeda
Saturday 2:35 p.m.–3:05 p.m. in Portland Ballroom 252–253

Over the last several years, Python developers interested in data science and analytics have acquired a variety of tools and libraries that aim to facilitate analytical processes. Libraries such as Pandas, Statsmodels, Scikit-learn, Matplotlib, Seaborn, and Yellowbrick have made tasks such as data wrangling, statistical modeling, machine learning, and data visualization much quicker and easier. They have accomplished this by automating and abstracting away some of the more tedious, repetitive processes involved with analyzing and modeling data. Over the next few years, we are sure to witness the introduction of new tools that are increasingly intelligent and have the ability to automate more complex analytical processes. However, as we begin using these tools (and developing new ones), we should strongly consider the level of automation that is most appropriate for each case. Some analytical processes are technically difficult to automate, and therefore require large degrees of human steering. Others are relatively easy to automate but perhaps should not be due to the unpredictability of results or outputs requiring a level of compassionate decision-making that machines simply don’t possess. Such processes would benefit greatly from the collaboration between automated machine tasks and uniquely human ones. After all, it is often systems that utilize a combination of both human and machine intelligence that achieve better results than either could on their own. In this talk, we will discuss human-machine collaboration as it applies to analyzing data with Python. We will review a framework for exploratory data analysis with the goal of identifying which tasks should be automated, which tasks should not, and which tasks would benefit from a more interactive, symbiotic, and collaborative process between the human and the machine. We will explore Python libraries that we can use to build tools that allow us to perform different types of analysis. 
We’ll also introduce the Cultivar project, an example of a hybrid analytics tool that combines a Django framework with Javascript visualizations and Celery for task management to facilitate more efficient and effective human-machine systems for data analysis.

I Installed Python 3.6 on Windows and I Liked It

Steve Dower
Friday 3:15 p.m.–4 p.m. in Portland Ballroom 251 & 258

Python has a great reputation as a cross-platform language, which for many people means different varieties of Linux. But a huge number of Python users are running on Windows - a fundamentally different operating system where things do not always work the same. However, Python has always worked incredibly well across different platforms including Windows, going to great lengths to support and expose the platform without making development more complex. In this session, CPython core developer and Microsoft engineer Steve Dower will discuss some of the reasons why cross-platform support is not an accident, and how Python 3.6 makes it even easier to support both Windows and Linux.

Immutable Programming - Writing Functional Python

Calen Pennington
Friday 11:30 a.m.–noon in Portland Ballroom 254–255

The world of Haskell and functional programming may seem like a distant place to many working Python developers, but some of the techniques used there are remarkably useful when developing in Python. In this talk, I will cover some of the pitfalls of mutability that you may run into while writing Python programs, and some tools and techniques that Python has built in that will let you avoid them. You'll see namedtuples, enums and properties, and also some patterns for structuring immutable programs that will make them easier to build, extend, and test.
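
A small sketch of the built-in tools named above (namedtuples and enums), assuming a hypothetical `Ticket` record; it is an illustration, not code from the talk. Instead of mutating a value, each "change" returns a new one.

```python
from collections import namedtuple
from enum import Enum


class Status(Enum):
    OPEN = "open"
    CLOSED = "closed"


# A namedtuple is an immutable record: its fields cannot be reassigned.
Ticket = namedtuple("Ticket", ["title", "status"])


def close(ticket):
    """Return a new, closed Ticket and leave the argument untouched."""
    return ticket._replace(status=Status.CLOSED)


original = Ticket(title="Fix login", status=Status.OPEN)
closed = close(original)

# Attempting in-place mutation fails loudly instead of silently
# changing a value that other code may still be relying on.
try:
    original.status = Status.CLOSED
    mutated = True
except AttributeError:
    mutated = False
```

Because `original` can never change, it is safe to share between functions, caches and tests without defensive copying.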

Implementing Concurrency and Parallelism From The Ground Up

Amber Brown
Friday 1:40 p.m.–2:25 p.m. in Oregon Ballroom 203–204

When writing an application, it is common to want to do many things at once. For web servers, this means serving multiple web requests; for GUI applications, it means doing a background task whilst keeping the UI responsive. But how do we actually do that? This talk will go into how concurrency and parallelism work from the CPU, OS, and threads up, how state (data) is shared between them, how this interacts with the functions that you, the programmer, write, and how you can write properly behaving concurrent or parallel software.

In-Memory Event Resequencing: Realistic Testing For Impossible Bugs

Glyph
Friday 10:50 a.m.–11:20 a.m. in Portland Ballroom 252–253

As we all know, we should write testable code, and automated tests. But as we also know, no test plan survives contact with the real world. Complex, distributed systems fail in complex, distributed ways, and even the simplest web app today is a complex distributed system. So, as our code accrues little fixes to bugs that only show up in production, our test suites eventually either become slow integration testing monstrosities that are "realistic" but flaky and unreliable, or useless piles of mocks which are fast and deterministic but don't give you confidence. In this talk, we'll explore how to leverage event-driven programming, or "async I/O", to structure code in such a way that its tests are fast, realistic, and reliable, even in the face of horrible race-conditions you only discover in production.

Instagram Filters in 15 Lines of Python

Michele Pratusevich
Friday 2:35 p.m.–3:05 p.m. in Oregon Ballroom 201–202

Images tell stories, and we love Instagram filters because they give emotion to our images. Do you want to explore what makes up Instagram filters? In this talk, we will talk about the basic elements of Instagram filters and implement them in Python. The staple libraries we will use are scikit-image and numpy, with matplotlib and jupyter notebooks for plotting and interactivity. In the end, we will implement the (now-defunct) Gotham Instagram filter in 15 lines of Python (not including imports). Throughout the process, there will be many pretty pictures.

Introduction to Threat Modeling

Ying Li, David Lawrence
Friday 12:10 p.m.–12:55 p.m. in Portland Ballroom 254–255

Are you a website or application developer? Are you worried about security? Don’t know what you need to know, and what you can safely leave to the experts? Come learn about how to analyze your application’s design for potential security flaws, how to think like a security engineer, and see some of the most common pitfalls that programs fall victim to. In this talk we will work through the process of threat modeling - understanding how your system might get attacked, what its weak points are, and how to defend it.

It's time for datetime

Mario Corchero
Saturday 10:50 a.m.–11:20 a.m. in Oregon Ballroom 203–204

Working with time is not a trivial challenge. Python includes a native module in the standard library to work with it, but datetime, together with unicode, remains a common source of errors. This has led to the proliferation of many other libraries that attempt to ease the work of dealing with datetime. Datetime is one of those APIs that looks easy to use, but given the many concepts around time, it is easy to get burned if the developer does not have solid knowledge of them. In this talk we will review the main concepts of timestamps represented through datetime objects, the limitations of the standard library, and some simple steps to avoid the common mistakes that anyone can fall into. Naive datetimes (which the datetime API works with by default) are a great tool to represent calendar times, but when talking about timestamps (the focus of this talk) timezones are an essential part of the picture, and the datetime module can be tricky to use for those use cases. We will also speak about different standards of time, time zones, Daylight Saving Time, leap seconds, serialization and datetime arithmetic. The talk will focus on giving the foundations that everyone needs to understand and work efficiently, without making painful mistakes, when dealing with time-related algorithms.
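
The naive-versus-aware distinction can be sketched with the standard library alone. This is an illustration rather than the speaker's material; the fixed `-07:00` offset labelled "PDT" is an assumption made for the example and does not model daylight saving transitions.

```python
from datetime import datetime, timedelta, timezone

# A naive datetime carries no timezone: fine for "calendar time",
# ambiguous as a timestamp.
naive = datetime(2017, 5, 20, 10, 50)

# An aware datetime pins down a single instant.
aware_utc = datetime(2017, 5, 20, 17, 50, tzinfo=timezone.utc)

# Hypothetical fixed offset for the example (no DST rules attached).
pdt = timezone(timedelta(hours=-7), "PDT")
aware_pdt = aware_utc.astimezone(pdt)

# Same instant, different wall-clock representation.
same_instant = aware_utc == aware_pdt

# Ordering a naive datetime against an aware one is an error, not a guess.
try:
    naive < aware_utc
    comparable = True
except TypeError:
    comparable = False

# Serializing an aware datetime keeps the offset in the output.
serialized = aware_pdt.isoformat()
```

Keeping timestamps aware (commonly in UTC) and converting only at the edges avoids most of the mistakes the abstract alludes to.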

Know thy self: Methods and method binding

Thomas Ballinger
Saturday 1:55 p.m.–2:25 p.m. in Oregon Ballroom 201–202

Methods are like functions, but different. How? Why? And what's with having to type "self" all the time? We'll explore partial application of functions and review why it might be nice to start using classes. Then to clarify how method objects work we'll examine the result of accessing the method attributes of an object without calling them. Understanding the behavior we uncover here will require more attribute lookup experiments, which will lead us to discover the power of descriptors. Along the way we'll peek in at other languages' approaches to method binding, hopefully coming to appreciate the way Python does things enough to type "self" a few thousand more times.
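
The experiment described above (accessing a method attribute without calling it) might look like the following sketch. `Greeter` is a hypothetical class chosen for illustration, and the behavior shown is that of Python 3.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return f"{greeting}, {self.name}!"


g = Greeter("PyCon")

# Looking the attribute up on the instance gives a bound method:
# the function with `self` already filled in.
bound = g.greet

# Looking it up on the class gives the plain function, so `self`
# has to be passed explicitly.
plain = Greeter.greet

# Functions are descriptors: binding is what `__get__` does during
# attribute lookup, and it can be invoked by hand.
via_descriptor = Greeter.__dict__["greet"].__get__(g, Greeter)

results = [
    bound("Hello"),
    plain(g, "Hello"),
    via_descriptor("Hello"),
]
```

All three calls produce the same string; the only difference is who supplies `self`.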

Level up! Rethinking the Web API framework.

Tom Christie
Saturday 12:10 p.m.–12:55 p.m. in Portland Ballroom 254–255

Think there's nothing left to explore in how we design Web API frameworks? Think again. The author of Django REST framework walks through how we might approach designing a new Python-based API framework from scratch, and looks at how we can start building smarter, more productive API tooling as a result. You should come away from this talk with a better appreciation of: * How best to provide API client libraries and API documentation to your users. * How to build APIs that support both realtime and request/response interfaces. * How to build APIs that are web-browsable. * Why you might want to consider taking a schema-first approach to your API design.

Leveraging Serverless Architecture for Powerful Data Pipelines

Jason Myers
Friday 5:10 p.m.–5:40 p.m. in Portland Ballroom 252–253

Serverless Architectures that allow us to run python functions in the cloud in an event-driven parallel fashion can be used to create extremely dynamic and powerful data pipelines for use in ETL and data science. Join me for an exploration of how to build data pipelines on Amazon Web Services Lambda with python. We'll cover a simple introduction to event-driven programming. Then, we'll walk through building an example pipeline while discussing some of the frameworks and tools that can make building your pipeline easier. Finally, we'll discuss how to maintain observability on your pipeline to ensure proper performance and troubleshooting information.
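
For readers new to the event-driven model, a pipeline stage on AWS Lambda is essentially a function that receives an event and a context object. The sketch below is an assumption-laden illustration, not the talk's pipeline: the function name `handler`, the bucket and key values, and the returned summary are all hypothetical, and the event is a hand-built stand-in shaped like an S3 object-created notification.

```python
def handler(event, context):
    """Collect the (bucket, key) pairs named in an S3-style event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        objects.append((s3_info["bucket"]["name"], s3_info["object"]["key"]))
    # A real stage would read each object, transform it, and write the
    # result somewhere else; here we only report what would be processed.
    return {"processed": len(objects), "objects": objects}


# Hand-built stand-in for the event the platform would deliver.
fake_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "raw-data"},
                "object": {"key": "2017/05/19/events.csv"},
            }
        }
    ]
}

result = handler(fake_event, context=None)
```

Because the handler is a plain function, it can be exercised locally with fake events like this one before it is ever deployed.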

Library UX: Using abstraction towards friendlier APIs

Mali Akmanalp
Saturday 3:15 p.m.–3:45 p.m. in Oregon Ballroom 203–204

Complicated libraries can be a pain in the butt to use. It's not surprising that there are a lot of "X for humans" libraries out there, some of which are mostly wrappers around more frustrating interfaces. This is not a theoretical talk. I'll touch upon theory to give you context, but will then talk about what that means for you in practice so that you can write better libraries. I'll talk about why library UX matters, about abstraction as a general concept, about what happens when you over/under abstract, and about some useful tips to help build friendly APIs. Meanwhile, I'll show some positive examples from libraries we know and love (flask, SQLAlchemy, Requests, etc). Once you recognize these effects in play, you'll be able to apply them to your own code and make life better for everyone!

Lights, camera, action! Scraping a great dataset to predict Oscar winners

Deborah Hanus
Saturday 3:15 p.m.–3:45 p.m. in Portland Ballroom 252–253

Using Jupyter notebooks and scikit-learn, you’ll predict whether a movie is likely to [win an Oscar](http://oscarpredictor.github.io/) or be a box office hit. Together, we’ll step through the creation of an effective dataset: asking a question your data can answer, writing a web scraper, and answering those questions using nothing but Python libraries and data from the Internet.

Look mum no hands! From blinking LEDs to a bike speedometer with MicroPython

Tim Head
Sunday 1:50 p.m.–2:20 p.m. in Portland Ballroom 252–253

In this talk I will show you how to use a micro-controller to build a wifi enabled speedometer for your bike, using MicroPython. And some hardware. And a bike (maybe). I will introduce you to the world of MicroPython: a python distribution that runs on micro-controllers. Micro-controllers are small computers that are all around us: in cars, TVs, and your internet connected fridge. We will start with making LEDs blink, then serve webpages, build an interrupt handler and finally put it all together to make a wifi enabled speedometer for a bike.

Looping Like a Pro in Python

David "DB" Baumgold
Friday 5:10 p.m.–5:40 p.m. in Oregon Ballroom 201–202

The humble loop: it's hard to write a program without it. Whether it's processing numbers in a sequence, lines in a text file, users in a database, or any other list of things, you use loops all the time. But did you know that Python has a lot of different ways to write loops? Reaching for the right looping tool can make your code cleaner, more readable, easier to test, and it can even make it run faster! By the end of this talk, you'll be looping like a pro, and your code will be better for it.
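
A few of the looping tools Python offers, as a quick sketch; the sample data is made up and the selection is not necessarily the one covered in the talk.

```python
names = ["ada", "grace", "guido"]
scores = [90, 85, 77]

# enumerate() replaces manual index bookkeeping.
numbered = [f"{i}. {name}" for i, name in enumerate(names, start=1)]

# zip() walks several sequences in lockstep.
paired = dict(zip(names, scores))

# A generator expression feeds an aggregate without building a list.
total = sum(score for score in scores)

# reversed() iterates backwards without copying or index arithmetic.
last_to_first = list(reversed(names))
```

Each form states the intent of the loop directly, which tends to be easier to read and test than a hand-rolled `for i in range(len(...))`.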

Magic Method, on the wall, who, now, is the `fairest` one of all?

Sep Dehpour
Saturday 1:55 p.m.–2:25 p.m. in Portland Ballroom 251 & 258

Magic methods are a very powerful feature of Python and can open a whole new door for you. However, with great power comes great responsibility. In this talk we explore magic methods' capabilities by first designing new interfaces in a series of fun experiments. Secondly, we play with creating undeletable objects and learn about the mighty Garbage Collector in CPython and how a single magic method can overturn the fate of the object. Lastly, we create a lazy Redis client to illustrate a practical application of magic methods and learn about lazy loading. Once you see what magic methods can bring to the table, the limit is only your imagination!
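
Lazy loading via magic methods can be sketched without Redis at all. The `LazyValue` wrapper and `fake_fetch` loader below are hypothetical stand-ins, not the client built in the talk: the wrapper postpones an expensive call until the result is first used.

```python
class LazyValue:
    """Defer an expensive computation until the result is first needed."""

    def __init__(self, loader):
        self._loader = loader
        self._loaded = False
        self._value = None

    def _resolve(self):
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value

    def __getattr__(self, name):
        # Only called when normal lookup fails, so the wrapper's own
        # attributes are found first and everything else is delegated.
        return getattr(self._resolve(), name)

    def __len__(self):
        # Special methods are looked up on the type, so they need
        # explicit definitions rather than relying on __getattr__.
        return len(self._resolve())


calls = []


def fake_fetch():
    """Stand-in for a slow network call; records each invocation."""
    calls.append("fetch")
    return ["a", "b", "c"]


lazy = LazyValue(fake_fetch)
calls_before = len(calls)   # nothing has been fetched yet
size = len(lazy)            # first use triggers the load
count_a = lazy.count("a")   # delegated to the underlying list
calls_after = len(calls)    # the loader ran exactly once
```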

Modern Python Dictionaries -- A confluence of a dozen great ideas

Raymond Hettinger
Saturday 12:10 p.m.–12:55 p.m. in Portland Ballroom 251 & 258

Python's dictionaries are stunningly good. Over the years, many great ideas have combined together to produce the modern implementation in Python 3.6. This fun talk uses pictures and little bits of pure python code to explain all of the key ideas and how they evolved over time. Includes newer features such as key-sharing, compaction, and versioning.

Next Level Testing

James Saryerwinnie
Friday 12:10 p.m.–12:40 p.m. in Portland Ballroom 252–253

Unit, functional, and integration tests are great first steps towards improving the quality of your python project. Ever wonder if there’s even more you can do? Maybe you've heard of property-based testing, fuzzing, and mutation testing but you're unsure exactly how they can help you. In this talk we’ll cover additional types of tests that can help improve the quality and robustness of your python projects: property-based testing, fuzz testing, stress testing, long term reliability testing, and mutation testing. We’ll also go beyond just covering what these tests are. For each of the test types above, I’ll give you real world examples from open source software that I maintain that show you the types of bugs each test type can find. I’ll also show you how you can integrate these tests into your Travis CI and/or Jenkins environment.

No More Sad Pandas: Optimizing Pandas Code for Speed and Efficiency

Sofia Heisler
Saturday 4:30 p.m.–5 p.m. in Oregon Ballroom 201–202

When I first began working with the Python Pandas library, I was told by an experienced Python engineer: "Pandas is fine for prototyping a bit of calculations, but it's too slow for any time-sensitive applications." Over multiple years of working with the Pandas library, I have realized that this was only true if not enough care is put into identifying proper ways to optimize the code's performance. This talk will review some of the most common beginner pitfalls that can cause otherwise perfectly good Pandas code to grind to a screeching halt, and walk through a set of tips and tricks to avoid them. Using a series of examples, we will review the process for identifying the elements of the code that may be causing a slowdown, and discuss a series of optimizations, ranging from good practices of input data storage and reading, to the best methods for avoiding inefficient iterations, to using the power of vectorization to optimize functions for Pandas dataframes.

One Data Pipeline to Rule Them All

Sam Kitajima-Kimbrel
Sunday 1:50 p.m.–2:20 p.m. in Oregon Ballroom 201–202

There are myriad data storage systems available for every use case imaginable, but letting application teams choose storage engines independently can lead to duplicated efforts and wheel reinvention. This talk will explore how to build a reusable data pipeline based on Kafka to support multiple applications, datasets, and use cases including archival, warehousing and analytics, stream and batch processing, and low-latency "hot" storage.

Optimizations which made Python 3.6 faster than Python 3.5

Victor Stinner
Friday 10:50 a.m.–11:20 a.m. in Portland Ballroom 251 & 258

Various optimizations made Python 3.6 faster than Python 3.5. Let's see in detail what was done and how. Python 3.6 is faster than any other Python version on many benchmarks. We will see results of the Python benchmark suite on Python 2.7, 3.5 and 3.6. The bytecode format and instructions to call functions were redesigned to run bytecode faster. A new C calling convention, called "fast call", was introduced to avoid temporary tuple and dict. The way Python parses arguments was also optimized using a new internal cache. Operations on bytes and encodings like UTF-8 were optimized a lot thanks to a new API to create bytes objects. The API allows very efficient optimizations and reduces memory reallocations. Some parts of asyncio were rewritten in C to speed up code by up to 25%. The PyMem_Malloc() function now also uses the fast pymalloc allocator, giving a tiny speedup for free. Finally, we will see optimization projects for Python 3.7: use fast calls in more cases, speed up method calls, a cache on opcodes, a cache on global variables.

Packaging Let’s Encrypt: Lessons learned shipping Python code to hundreds of thousands of users

Noah Swartz
Friday 1:55 p.m.–2:25 p.m. in Portland Ballroom 251 & 258

Let's Encrypt launched on April 12th 2016, for the first time allowing anyone access to free SSL certificates that could be automatically fetched and renewed. The demand was massive, and so was the need for a client to fetch these certificates for all of those users. This client is called Certbot, and it's written entirely in Python. Unfortunately for the sanity of Certbot developers, these users of Let's Encrypt can't decide on a single operating system to use! This requires us to ship our software, and all of its dependencies, to a variety of systems all with different web servers, Python versions, package managers, and underlying packages. Learn how we got through this mess!

Passing Exceptions 101: Paradigms in Error Handling

Amandine Lee
Friday 11:30 a.m.–noon in Oregon Ballroom 201–202

Exception handling in Python can sometimes feel like a Wild West. If you have a `send_email` function, and the caller inputs an invalid email address, should it: A) Return `None` or some other special return value, B) Let the underlying exception it might cause bubble up, C) Check via a regex and type checking and raise a `ValueError` immediately, or D) Make a custom `EmailException` subclass and raise that? What if there is a network error while the email was sending? Or what if the function calls a helper `_format_email` that returns an integer (clearly wrong!), or raises a `TypeError` itself? Should it crash the program or prompt a retry? This talk will introduce the concept of an exception, explain the built-in Python exception hierarchy and the utility of custom subclasses, demonstrate try/except/finally/else syntax, and then explore different design patterns for exception control flow and their tradeoffs using examples. It will also make comparisons to error handling philosophy in other languages, like Eiffel and Go.
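
One possible combination of options C and D, sketched with try/except/else/finally. This is only an illustration of the syntax and trade-offs, not the talk's recommendation: `InvalidAddress`, the `transport` callable, the `attempts` log and the simplistic `"@"` check are all assumptions made for the example.

```python
class EmailException(Exception):
    """Base class for everything send_email can raise on purpose."""


class InvalidAddress(EmailException, ValueError):
    """Bad input; also a ValueError so generic callers can catch it."""


attempts = []  # stand-in for real logging


def send_email(address, body, transport):
    # Validate early, before any work is attempted.
    if "@" not in address:
        raise InvalidAddress(f"not an email address: {address!r}")
    try:
        transport(address, body)
    except ConnectionError as exc:
        # Translate a low-level failure, keeping the original as __cause__.
        raise EmailException("network failure while sending") from exc
    else:
        return True  # runs only if the try block raised nothing
    finally:
        attempts.append(address)  # runs on success and on failure


def ok_transport(address, body):
    pass


def broken_transport(address, body):
    raise ConnectionError("connection refused")


outcomes = []
for address, transport in [
    ("a@example.com", ok_transport),
    ("not-an-address", ok_transport),
    ("a@example.com", broken_transport),
]:
    try:
        send_email(address, "hi", transport)
    except InvalidAddress:
        outcomes.append("invalid")  # the subclass must be caught first
    except EmailException:
        outcomes.append("failed")
    else:
        outcomes.append("sent")
```

Callers that care can distinguish bad input from a transient failure (and perhaps retry the latter), while callers that do not can catch `EmailException` alone.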

Piecing it Together: A beginner's guide to application configuration

Mary Nagle
Sunday 1:10 p.m.–1:40 p.m. in Portland Ballroom 254–255

Assembling all the necessary setup for an application you’re building can often be more frustrating than writing the app itself. Learning to do this well is difficult, especially for those who are new to Python and might not know where to begin or what questions to ask. While there is no “right way” to set up a development environment or application, understanding the components involved and how they interact can empower you to customize your setup to best suit your needs. This talk will dive into what happens when setting up a database, the purpose and configuration of an isolated environment, how Python packages are installed, and finally, how each of these components interact with each other and the application itself; in particular, how an application's structure facilitates said interactions.

Prehistoric Patterns in Python

Lennart Regebro
Friday 11:30 a.m.–noon in Portland Ballroom 251 & 258

Why does some code use dictionaries that have None for all values? Is it true that you shouldn't concatenate strings with +? Will Python optimize constant calculations? This talk will go through some patterns that used to be common in Python, but which now are regarded as outdated and see if they really are outdated and why. The results surprised me, maybe they'll surprise you.

Probabilistic Programming with PyMC3

Christopher Fonnesbeck
Sunday 1:10 p.m.–1:40 p.m. in Portland Ballroom 251 & 258

Bayesian statistics offers robust and flexible methods for data analysis that, because they are based on probability models, have the added benefit of being readily interpretable by non-statisticians. Until recently, however, the implementation of Bayesian models has been prohibitively complex for use by most analysts. But the advent of probabilistic programming has served to abstract the complexity of Bayesian statistics, making such methods more broadly available. PyMC3 is an open-source Python module for probabilistic programming that implements several modern, computationally-intensive statistical algorithms for fitting Bayesian models, including Hamiltonian Monte Carlo (HMC) and variational inference. PyMC3’s intuitive syntax is helpful for new users, and the reliance on Theano for much of the computational work has allowed developers to keep the code base simple, making it easy to extend the software to meet analytic needs. PyMC3 itself extends Python's powerful "scientific stack" of development tools, which provide fast and efficient data structures, parallel processing, and interfaces for describing statistical models.

Python for mathematical visualization: a four-dimensional case study

David Dumas
Saturday 5:10 p.m.–5:40 p.m. in Portland Ballroom 252–253

This is a talk about creating pictures of a mathematical object---specifically, a 4-dimensional fractal "dust" that has been the subject of mathematical research in hyperbolic geometry since the 1980s. In the end this is accomplished using a little algebra, a little geometry, and a healthy dose of Python. That is, I will present a case study of using Python in several aspects of a mathematical visualization project, from the computation itself, to transforming and converting data, and finally for scripting the process of generating the images. Along the way I'll explain how Python's convenient idioms and containers (e.g. sets and set comprehensions) are a good fit for some of the algebraic and geometric questions that come up, how Scipy and Numpy enable fast numerical calculations, and how Python's strength as a language for scripting and automation allows easy orchestration of rendering of still images and frames of animations. The mathematical visualization project we describe is a collaboration with François Guéritaud (Université de Lille).

Python from Space: Analyzing Open Satellite Imagery Using the Python Ecosystem

Katherine Scott
Friday 3:15 p.m.–4 p.m. in Oregon Ballroom 201–202

Earth imaging satellites, just like our computers, are shrinking and becoming more ubiquitous than ever before. It is now possible to obtain open satellite data on a daily if not weekly basis and for this data to be put to work, helping us better understand our planet and quickly respond to disaster situations. In this talk we will work through a jupyter notebook that covers the satellite data ecosystem and the python tools that can be used to sift through and analyze that data. Topics include python tools for using OpenStreetMap data, the Geospatial Data Abstraction Library (GDAL), and OpenCV and NumPy for image processing. This talk is intended for novice and intermediate python developers who are interested in using data science and satellite imagery for social good and fundamental scientific research.

Python in The Serverless Era

Benny Bauer
Sunday 1:50 p.m.–2:20 p.m. in Portland Ballroom 254–255

Serverless is the latest phase in the evolution of cloud development. Its building blocks are functions, a bunch of stateless “nano-services”, that can scale automatically and are charged only when used. It enables teams to focus more on development while relying on fully managed servers. In this talk I'll cover the Serverless Architectures practices, use cases, tooling and the role python plays in it.

Rants and Ruminations From A Job Applicant After 💯 CS Job Interviews in Silicon Valley

Susan Tan
Friday 2:35 p.m.–3:05 p.m. in Portland Ballroom 254–255

What is it like to interview at 1 technology company? Stressful and tiring. What is it like to interview at 100 technology companies? I have done that. In late August 2016, I quit an uninspiring full-time software job and talked to 100 employers in the San Francisco Bay Area to find the best fit. The hiring process reflects the company culture and its values. Listen to my rants and ruminations of interviewing at tiny seed-stage startups to large technology companies in Silicon Valley. Learn how to reform your own hiring process to be more considerate and thoughtful. Learn how to prepare for interviews efficiently.

Readability Counts

Trey Hunner
Saturday 11:30 a.m.–noon in Oregon Ballroom 201–202

Have you found unreadable PEP8-compliant code and wondered how to fix it? Have you ever seen code that was simply a pleasure to read? If you've ever wondered what makes code easy to read, this talk is for you. During this talk we'll learn a number of techniques for refactoring code to improve readability and maintainability. We'll discuss: - whitespace - self-documenting code - modularity - expectation management We'll end with a checklist for improving the readability of your own code.

Re-Programming the Human Genome with Python

Riley Doyle
Friday 4:30 p.m.–5 p.m. in Portland Ballroom 254–255

Modern genome editing techniques such as CRISPR-Cas9 are revolutionizing the way we discover and treat the root genetic causes of disease. Many of the most popular tools and libraries in this cutting edge application are written in Python. This talk will provide a general, software-centric introduction to the exciting new area of genome editing, describe the central string search, machine learning, and data management problems involved, and review how Python frameworks and libraries are used today to solve these problems in production in order to benefit human health. This talk assumes no prior lab experience: only a proficiency with Python and curiosity!

Requests Under The Hood

Cory Benfield
Friday 10:50 a.m.–11:20 a.m. in Oregon Ballroom 201–202

Requests is widely acknowledged as a library that saves users an enormous amount of time, effort, and pain through its intuitive and clear API. For this reason, most people who have never looked at the code assume that its code is as intuitive, well-structured, and clear as the API. Of course, the truth is more complex than that. Real software that deals with real problems is rarely ideal: there are edge cases, terrible hacks, and awkward workarounds for problems. Often in the software industry we pretend that these imperfections in our software don’t exist, or we try to hide them. These imperfections frequently cause people to reinvent wheels in order to simplify the code, which has benefits for understandability but frequently has downsides for resilience. When people talk about “battle-tested” code, they mean code that has been dirtied up over time from its original Platonic ideal implementation to something that is just as complex and warty as real life. In this talk, one of the Requests and urllib3 core maintainers lays bare all of the worst and hackiest corners of the codebases of these two libraries. The goal is to help expose all of the invisible work done in mature codebases to tolerate edge cases and misbehaviour, as well as to try to remind us all that the perfect is the enemy of the good.

Share Your Code! Python Packaging Without Complication

Dave Forgac
Sunday 1:10 p.m.–1:40 p.m. in Oregon Ballroom 201–202

If you want people to use your code you should package it! You may have heard that packaging is hard but the Python packaging ecosystem has evolved a lot over the years. Taking your beautiful code and sharing it with the world is complex but it doesn't have to be complicated. In this talk you will learn how to take advantage of modern tooling and practices so you can get boring stuff out of the way, publish quickly and frequently, and focus on your code. This talk will cover: - A (brief) history of Python packaging - Python Packaging User Guide recommendations - Distribution formats - Anatomy of a package - Automating package creation - Adding: - Testing - CI - Documentation - Testing package installation - Releasing to PyPI This talk is for you if you're new to Python packaging and would like to learn how to share your code or if you've worked with Python for a while and just aren't up-to-date with the latest packaging practices.

Slot or not: higher performance custom objects in pure Python

Aaron Hall
Saturday 4:30 p.m.–5 p.m. in Portland Ballroom 254–255

`__slots__` are versatile for certain kinds of uses and users, if you know how they work. At first glance, they seem like a free lunch, with improvements in both time and space. At second glance, they seem to have so many caveats as to make them not worth using. This talk is a deep dive into how `__slots__` work, how to wring every benefit out, as well as the actual caveats and alternatives, with recommendations for writers of core libraries as well as end users.
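As a minimal sketch of the trade-off described above (the class names are hypothetical and not taken from the talk), a slotted class drops the per-instance `__dict__`, which is where the space saving comes from, and in exchange refuses attributes that were not declared:

```python
class PointDict:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """Slotted class: attributes live in fixed slots, with no instance __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PointDict(1, 2)
slotted = PointSlots(1, 2)

has_dict_plain = hasattr(plain, "__dict__")
has_dict_slotted = hasattr(slotted, "__dict__")

# One of the caveats: attributes not listed in __slots__ cannot be added.
try:
    slotted.z = 3
    extra_attribute_allowed = True
except AttributeError:
    extra_attribute_allowed = False
```

How much memory or time this saves depends on the interpreter version and the number of instances, so it is worth measuring before committing to it.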

Snakes on a Hyperplane: Python Machine Learning in Production

Jessica Lundin
Friday 5:10 p.m.–5:40 p.m. in Oregon Ballroom 203–204

Companies with an artificial-intelligence plan have a differentiating strategy in the intelligence economy; however, implementing robust machine-learning in production is nontrivial, often requiring a close collaboration between data scientists and developers, and retooling the production stack and workflows to develop and maintain accurate models. Machine learning in production involves model application, handling missing data, data artifacts, and data outside of the training calibration. A rigorous evaluation framework draws upon logging to determine characteristics of model coverage, model performance, auditing, and run-time performance. Model coverage includes the number of times the model produced sensible output relative to number of times it is called. Model coverage is reduced if the model does not converge or model criteria are not met. Model performance is evaluated with a suite of metrics (accuracy, AUC, FPR, TPR, RMSE, MAPE, etc.), which assist in determining the most appropriate model to use in the production scenario and the validity of the model training. Regularly performing manual audits for spot checks is important for debugging and ensuring the model passes sanity checks. Run-time performance includes run times and profiling model pieces, ensuring performance is within specified requirements and refactoring otherwise. In the AI renaissance, where ML is a critical piece of intelligent products, seamlessly integrating model evaluation into workflows is an important component of making robust products and building a satisfying customer experience. Python is a great language to build intelligent products with its abundance of ML libraries and wrappers contributed as open-source software in addition to rich full-stack capabilities.
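The coverage ratio and two of the listed metrics can be sketched in a few lines of plain Python. This is only an illustration: the function names are hypothetical, and treating "sensible output" as "not `None` and finite" is an assumption, since the abstract does not define it.

```python
import math


def model_coverage(outputs):
    """Fraction of calls that returned a usable value.

    Assumption: "usable" means not None and a finite number.
    """
    usable = [o for o in outputs if o is not None and math.isfinite(o)]
    return len(usable) / len(outputs)


def rmse(actual, predicted):
    """Root-mean-square error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


coverage = model_coverage([1.0, None, float("nan"), 2.0])
error_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
error_mape = mape([100.0, 200.0], [110.0, 180.0])
```

In production these numbers would typically be computed from logged calls rather than in-memory lists.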

Snek in the Browser

Katie McLaughlin
Friday 2:35 p.m.–3:05 p.m. in Portland Ballroom 251 & 258

Python is a decades-strong language with a large community, and it has a solid foundation on the server, but it doesn't have a good user story in the browser... until now. The BeeWare project aims to bring Python natively, everywhere. Using a combination of the Batavia and Toga projects, we can develop an entirely native web experience in Python, no JavaScript required. During this talk, you will learn about how the BeeWare project has built Batavia, a Python virtual machine in JavaScript; and Toga, a multi-platform native API wrapper; a combination of which can be used to build an entire web platform in Python only.

Solid Snakes or: How to Take 5 Weeks of Vacation

Hynek Schlawack
Friday 1:40 p.m.–2:25 p.m. in Portland Ballroom 252–253

No matter whether you run a web app, search for gravitational waves, or maintain a backup script: being responsible for a piece of software or infrastructure means that you either get a pager right away, or that you get angry calls from people affected by outages. Being paged at 4am in everyday life is bad enough. Having to fix problems from hotel rooms while your travel buddies go for brunch is even worse. And while incidents can’t be prevented completely, there are ways to make your systems more reliable and minimize the need for (your!) manual intervention. This talk will help you to get calm nights and relaxing vacations by teaching you some of them.

Static Types for Python

Jukka Lehtosalo, David Fisher
Saturday 12:10 p.m.–12:55 p.m. in Oregon Ballroom 201–202

Over the past year and a half, Dropbox has been investing in the development of mypy, a static type checker for Python, as a way to make our multimillion-line Python codebase easier to understand, navigate, and maintain. In this talk, we will discuss the benefits of type annotations, explain how to use them, and give a peek into how mypy works behind the scenes. Mypy is an open-source type-checker for Python which supports the PEP 484 standard for gradual typing. Originally created by Jukka Lehtosalo as part of his PhD thesis in 2013, it is now under active development by a small team at Dropbox which includes David Fisher, Greg Price, and Guido van Rossum. It supports Python 3.2 and higher, as well as Python 2.7 (via type comments).
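As a minimal sketch of what PEP 484 annotations look like (the function names are hypothetical, not from the talk), the first function below uses Python 3 annotation syntax and the second uses the type-comment form mentioned for Python 2.7. A checker such as mypy reads both; neither changes run-time behaviour.

```python
from typing import List, Optional


def find_longest(words: List[str]) -> Optional[str]:
    """Return the longest word, or None for an empty list."""
    if not words:
        return None
    return max(words, key=len)


def total_length(words):
    # type: (List[str]) -> int
    """Same idea, annotated with a PEP 484 type comment."""
    return sum(len(word) for word in words)


longest = find_longest(["pdb", "mypy", "typing"])
nothing = find_longest([])
length = total_length(["pdb", "mypy"])
```

Because the typing is gradual, unannotated functions elsewhere in a codebase can stay as they are while annotations are added piece by piece.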

Temporal Data Structures with SQLAlchemy and Postgres

Joseph Leingang
Saturday 11:30 a.m.–noon in Oregon Ballroom 203–204

SQLAlchemy ([http://www.sqlalchemy.org](http://www.sqlalchemy.org/)) and Postgres ([https://www.postgresql.org](https://www.postgresql.org/)) provide several useful tools that allow us to build and query records through time: _temporal models_. Combining a need to have robust auditing, as well as feature development on per-property history, we can turn “regulatory overhead” into an exciting technical challenge. At Clover Health we have built a small library to automate the task of decorating a model and making it “temporal.” This talk aims to demonstrate the underlying data model and interface for building this system.

Text is More Complicated Than You Think: Comparing and Sorting Unicode

Morgan Wahl
Saturday 1:40 p.m.–2:25 p.m. in Oregon Ballroom 203–204

Few people realize just how complicated text can be. Did you know sorting and even case-folding can depend on a user's locale? That different strings of characters can be semantically completely equivalent? That there are over a thousand Latin letters? Legacy text encodings like ASCII made a lot of simplifying assumptions about how written languages work, and we all put up with them because it was cool to even have computers in the first place. Unicode removes many of those assumptions and provides the tools we need to write software that can just do the right thing regardless of what text users throw at it. Even if you don't translate your UI, getting the details of string comparison, sorting, and searching right can eliminate annoying surprises for you and your users.
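Two of the points above can be illustrated with the standard library alone. This is a small sketch, not material from the talk: it shows canonically equivalent strings that compare unequal until normalized, and a caseless comparison where `lower()` is not enough. Locale-aware sorting needs a collation library and is not shown here.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# The two strings render identically but differ code point by code point.
raw_equal = composed == decomposed
normalized_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

# Caseless matching: lower() keeps the German sharp s, casefold() expands it.
street_lower = "Stra\u00dfe".lower() == "STRASSE".lower()
street_folded = "Stra\u00dfe".casefold() == "STRASSE".casefold()
```

Normalizing text once at the input boundary usually avoids having to normalize at every comparison.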

The Dictionary Even Mightier

Brandon Rhodes
Saturday 3:15 p.m.–4 p.m. in Oregon Ballroom 201–202

Since my “Mighty Dictionary” talk at PyCon 2010, the Python dictionary has evolved dramatically. Come learn about all of the improvements, up to and including the re-architecture that has just landed with Python 3.6! The talk will discuss iterable views, the dictionary’s dedicated comprehension syntax, random key ordering, the special key-sharing dictionary designed to underlie object collections, and, most famously of all, the new “compact dictionary” that cuts dictionary storage substantially, and carries a fascinating side-effect. Each new feature that the talk discusses will be motivated by considering the trade-offs inherent in hash table data structure design, and followed up with hints about how you can now use the dictionary even more effectively in your own code!
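A short sketch of three of the features listed above. The abstract does not name the side-effect; this example assumes it refers to insertion-order preservation, which is an implementation detail of CPython 3.6 and a language guarantee from Python 3.7 onward.

```python
# Dictionary comprehension syntax.
squares = {n: n * n for n in (3, 1, 2)}

# Views are live: they reflect later changes to the dictionary.
keys_view = squares.keys()
squares[4] = 16
view_sees_new_key = 4 in keys_view

# On Python 3.7+ iteration follows insertion order.
key_order = list(keys_view)
```

Code that must run on older interpreters should still use `collections.OrderedDict` when it relies on ordering.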

The Fastest FizzBuzz in the West: Make Your Own Language with RPLY and RPython

Dustin Ingram
Saturday 2:35 p.m.–3:05 p.m. in Portland Ballroom 254–255

In this talk, you'll learn how I built DIVSPL (Dustin Ingram's Very Special Programming Language), a tongue-in-cheek domain-specific language, which is particularly good for implementing FizzBuzz -- as quickly as possible. We'll build DIVSPL with RPLY, an implementation of David Beazley's PLY (but with a "cooler" API) and make it compatible with RPython, a restricted subset of the Python programming language. Along the way, you'll learn about lexers, parsers, and grammars, and in the end, you'll know how to build your own language.

The Gilectomy: How's It Going?

Larry Hastings
Friday 12:10 p.m.–12:55 p.m. in Portland Ballroom 251 & 258

One of the most interesting projects in Python today is Larry Hastings' "Gilectomy" project: the removal of Python's Global Interpreter Lock, or "GIL". Come for an up-to-the-minute status report: what's been tried, what has and hasn't worked, and what performance is like now.

The Glory of pdb's set_trace

Nicole Zuckerman
Friday 4:30 p.m.–5 p.m. in Oregon Ballroom 201–202

Everyone needs to debug code, and it can take up a non-trivial portion of our time to wait for code to complete execution and write print messages to stdout. There’s one function in particular in the Python debugger (pdb) library that can give you a much clearer understanding of what’s going on in your code, much more quickly: pdb.set_trace(). In this talk, we’ll identify the most useful things you can do when you use set_trace that can make debugging exponentially more efficient and enjoyable.

The Memory Chronicles: A Tale of Two Pythons

Kavya Joshi
Saturday 10:50 a.m.–11:20 a.m. in Portland Ballroom 251 & 258

MicroPython is the leanest, meanest full Python implementation. Designed for microcontrollers, this variant of Python runs in less than 300KB of memory, _and_ retains support for all your favorite Python features. So what does it take to make the smallest Python? Put differently, why does CPython have a large memory footprint? This talk will explore the internals of MicroPython and contrast it with CPython, focusing on the aspects that relate to memory use. We will delve into the Python object models in each and the machinery for managing them. We will touch upon how the designs of the bytecode compiler and interpreter of each differ and why that matters.

The Next Step: Finding Model Parameters With Random Walks

Christine Waigl
Sunday 1:50 p.m.–2:20 p.m. in Portland Ballroom 251 & 258

The statistician John Tukey -- who designed the box plot and coined the term "bit" -- wrote: "An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem". Python has become one of the major languages for statistical data analysis, not least because of the expressiveness of the language itself and the availability of tools like Jupyter Notebooks, which enable iterative reasoning about a problem and its solutions. This talk takes one step beyond an introduction to statistics with Python and aims to familiarize the audience with two concepts: a class of problems (so-called inverse problems), and a powerful statistical tool (the random walk, or more formally Markov-Chain Monte Carlo (MCMC) sampling with the Metropolis algorithm). In inverse problems, model parameters are estimated from observational data. Both model and data are expected to be affected by error. The objective is not only to find parameters that best describe the observations, but also to figure out how good, or how possibly bad, a solution might be. Inverse problems are extremely common in many fields and crop up each time we attempt to reconstruct a reality from sensor, radar, scattering or imaging data. The Metropolis-Hastings algorithm offers a solution via random sampling of a Bayesian posterior distribution. Even though listed as one of the 20th century's top 10 algorithms by the journal _Computing in Science & Engineering_, the Metropolis algorithm is easy to understand and implement, and a fun and instructive way to explore even complicated multi-variate probability distributions.
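To make the random-walk idea concrete, here is a minimal sketch of a Metropolis sampler applied to a toy inverse problem. Everything specific in it is an assumption made for illustration: the data are synthetic, the model is a single unknown mean `mu` with Gaussian errors of known width, the prior is flat, and the function names are hypothetical.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Log posterior of mu up to a constant: flat prior, Gaussian errors."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def metropolis(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    samples = []
    current = start
    current_lp = log_target(current)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


rng = random.Random(42)
true_mu = 3.0
data = [rng.gauss(true_mu, 1.0) for _ in range(50)]

chain = metropolis(lambda mu: log_posterior(mu, data), 0.0, 20000, 0.5, rng)
kept = chain[2000:]  # discard the burn-in part of the walk
estimate = sum(kept) / len(kept)
spread = math.sqrt(sum((s - estimate) ** 2 for s in kept) / len(kept))
```

The mean of the kept samples estimates the parameter, and their spread indicates how good or bad that estimate might be, which is the second objective the abstract describes.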

The Python Visualization Landscape

Jake VanderPlas
Saturday 4:30 p.m.–5 p.m. in Portland Ballroom 252–253

So you want to visualize some data in Python: which library do you choose? From Matplotlib to Seaborn to Bokeh to Plotly, Python has a range of mature tools to create beautiful visualizations, each with their own strengths and weaknesses. In this talk I’ll give an overview of the landscape of dataviz tools in Python, as well as some deeper dives into a few, so that you can intelligently choose which library to turn to for any given visualization task.

The trends in choosing licenses in Python ecosystem

Anwesha Das
Saturday 10:50 a.m.–11:20 a.m. in Portland Ballroom 254–255

Software licenses are permissions over copyrighted software. The permission and/or grant includes the right to use, to redistribute, to prepare derivative works, etc. Software licenses also set forth the limitations on these rights; in essence, they mark the boundary for the usage of the code. It is therefore very important for developers to choose the license for their code wisely and correctly. PyPI, the Python Package Index, is a repository of software for the Python programming language. There are currently 80000+ packages there. This talk will go through the licenses of the top 2500 packages. We will see the trend of choosing a license for these top Python projects. We will discuss the licenses individually and compare them with each other, covering the advantages and disadvantages of each. We will further explain why certain licenses are favored by developers.

The Wild West of Data Wrangling

Sarah Guido
Friday 5:10 p.m.–5:40 p.m. in Portland Ballroom 254–255

Data science introductory courses might give you the impression that dealing with data is neat, tidy, and simple. They present you with a simplistic dataset and the scikit-learn or Pandas documentation, and a day or so later, you're done! Piece of cake, right? The real world of data isn't that easy! As a data scientist who has worked in the industry for several years, I have had a lot of experience dealing with messy, inaccurate, incomplete data, and I want to share those experiences with you. I'll talk my way through three real-world situations where I've had to analyze and build models on untidy and complex data, going through how I've preprocessed the data and prepared it for modeling. You'll leave with an understanding of how a data scientist thinks about data and what she does when the data is complicated.

Title Available On Request: An Introduction to Lazy Evaluation

Joe Jevnik
Friday 10:50 a.m.–11:20 a.m. in Portland Ballroom 254–255

Lazy evaluation, also known as "call by need", is an evaluation strategy where values are produced only when needed. Lazy evaluation is the opposite of eager evaluation, Python's normal evaluation model, where functions are executed as seen and values are produced immediately. In this talk we will define lazy evaluation and contrast it with eager evaluation. We will discuss tools that exist in Python for using lazy evaluation and show how we can build on the primitives to better represent computations. We will introduce common vocabulary for discussing evaluation models, and compare different systems for implementing lazy evaluation. Finally, we will discuss optimizations that can be applied to lazily evaluated expressions.
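Generators are one of the lazy primitives that Python already provides. The sketch below (names are illustrative, not from the talk) records which values have actually been computed, showing that creating the generator does no work and that only the requested items are ever produced, even though the sequence is conceptually infinite.

```python
import itertools

evaluated = []  # records which inputs were actually processed


def squares():
    """Lazily yield 1, 4, 9, ... without ever building the full sequence."""
    for n in itertools.count(1):
        evaluated.append(n)
        yield n * n


lazy = squares()                  # nothing has been computed yet
evaluated_before = list(evaluated)

first_three = list(itertools.islice(lazy, 3))
evaluated_after = list(evaluated)
```

An eager version of `squares` could not return at all, because it would try to build the whole infinite list first.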

Tracing, Fast and Slow: Digging into and improving your web service’s performance

Lynn Root
Saturday 5:10 p.m.–5:40 p.m. in Oregon Ballroom 201–202

Do you maintain a [Rube Goldberg](https://s-media-cache-ak0.pinimg.com/564x/92/27/a6/9227a66f6028bd19d418c4fb3a55b379.jpg)-like service? Perhaps it’s highly distributed? Or you recently walked onto a team with an unfamiliar codebase? Have you noticed your service responds slower than molasses? This talk will walk you through how to pinpoint bottlenecks, approaches and tools to make improvements, and make you seem like the hero! All in a day’s work. The talk will describe various types of tracing a web service, including black & white box tracing, tracing distributed systems, as well as various tools and external services available to measure performance. I’ll also present a few different rabbit holes to dive into when trying to improve your service’s performance.

Type uWSGI; press enter; what happens?

Asheesh Laroia, Philip James
Friday 11:30 a.m.–noon in Oregon Ballroom 203–204

You're a pretty knowledgeable Python web application developer, but how does that web application get served to the world? For many of us, uWSGI is the magic that makes our application available, and in this talk we'll look at how uWSGI works with the OS and the networking stack to make the magic happen.

Unicode: what is the big deal?

Łukasz Langa
Saturday 5:10 p.m.–5:40 p.m. in Portland Ballroom 254–255

Ever wondered why people complain that text processing is a hard problem? Or why Python 3 would introduce such a big backward incompatibility with switching to Unicode? Wonder no more, this talk is for you. In 30 minutes I'm going to demonstrate real world text processing problems and how Python 3 helps solve them. The talk is going to explain how you should split your text from binary data in your application, what are sensible defaults and what are possible gotchas. All this sprinkled with a healthy dose of frustration by a guy whose first name starts with Ł.
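The split between text and binary data can be sketched as follows. This is an illustration rather than the speaker's material, and the choice of UTF-8 is an assumption: bytes are decoded once at the boundary, handled as `str` inside the program, and encoded again on the way out.

```python
name = "\u0141ukasz"                  # text: "Łukasz" as Unicode code points
encoded = name.encode("utf-8")        # binary data, as sent over a socket or written to disk
decoded = encoded.decode("utf-8")     # decode once, at the input boundary

text_length = len(name)               # counts code points
byte_length = len(encoded)            # counts bytes; the first letter takes two in UTF-8

# A possible gotcha: assuming ASCII fails as soon as real names appear.
try:
    encoded.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

Python 3 keeps the two types apart, so mixing `str` and `bytes` raises an error instead of silently producing garbled text.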

Web identity: OAuth2 and OpenIDConnect

Brendan McCollam
Friday 12:10 p.m.–12:40 p.m. in Oregon Ballroom 203–204

Interested in adding single sign-on to your application, but confused about the variety of different web authentication methods out there? OAuth, OAuth2, OpenID, OpenIDConnect, SAML, Facebook Connect? This talk will clarify the different protocols, examining OAuth2 and OpenIDConnect in greater detail. It will demonstrate a basic client implementation using FLOSS libraries, and briefly touch on some of the issues involved in server implementation.

What's in your pip toolbox?

Jon Banafato
Friday 3:15 p.m.–3:45 p.m. in Oregon Ballroom 203–204

`pip` is a great tool, but dependency management doesn't stop there. I'll explore several tools that work with `pip` to make managing your dependencies easier, faster, and safer. I'll cover generating dependencies a better way, maintaining your `requirements.txt` for the long-term, and exploring existing Python environments. Afterward, you'll never want to `pip freeze > requirements.txt` again.

What's new in Python 3.6

Brett Cannon
Saturday 12:10 p.m.–12:40 p.m. in Oregon Ballroom 203–204

Python 3.6 has turned out to be quite the release! With [16 Python Enhancement Proposals](https://www.python.org/dev/peps/pep-0494/) incorporated into the version, Python 3.6 is only surpassed by Python 3.0 for having more [PEPs](https://www.python.org/dev/peps/) included in a single release. This talk will be an overview of those 16 PEPs and other changes outlined in the [What's New](https://docs.python.org/3.6/whatsnew/3.6.html) document for Python 3.6.

When the abyss gazes back: staring down Python's surprising internals

David Wolever
Saturday 11:30 a.m.–noon in Portland Ballroom 251 & 258

Python's fantastic until it isn't. This talk dives into some of the surprising implementation details of CPython, then explains exactly how they could be discovered from first principles. Attendees will leave with some dangerous Python trivia, and the tools they'll need to uncover their own trivia when surprises strike. The talk takes a deep dive into a StackOverflow question asking why `"x" in ("x", )` is faster than `"x" == "x"` (http://stackoverflow.com/questions/28885132/why-is-x-in-x-faster-than-x-x/28885213#28885213), including a discussion of `dis.disassemble`, the Python stack machine, and reading the CPython source. If time permits, there will be other fun examples, a whirlwind tour of debugging, and a couple of homework assignments.
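A first step in this kind of investigation is to look at the bytecode for both expressions. The sketch below uses `dis.get_instructions` to collect opcode names; the helper name is hypothetical, and no expected output is shown because the exact opcodes differ between CPython versions.

```python
import dis


def opnames(source):
    """Compile an expression and return the names of its bytecode instructions."""
    code = compile(source, "<example>", "eval")
    return [instruction.opname for instruction in dis.get_instructions(code)]


membership_source = '"x" in ("x",)'
equality_source = '"x" == "x"'

membership_ops = opnames(membership_source)
equality_ops = opnames(equality_source)

# Both expressions evaluate to True; only the path taken inside CPython differs.
membership_result = eval(membership_source)
equality_result = eval(equality_source)
```

Printing the two lists side by side, or calling `dis.dis` on each source string, shows which comparison opcode each expression uses on the interpreter at hand.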

Writing a C Python extension in 2017

Jean-Baptiste Aviat
Saturday 4:30 p.m.–5 p.m. in Portland Ballroom 251 & 258

This talk describes the build of a C Python extension, with prebuilt binaries, in 2017, where modern packaging standards, as well as Docker, have been a game changer in the Python extensions world. Most examples come from our experience building [PyMiniRacer](https://github.com/sqreen/PyMiniRacer), an embedded Python / JavaScript bridge used in production across hundreds of companies. We will describe the different aspects of building a binary extension, including: - using the modern manylinux wheel type in order to ship a built binary, usable in most Linux distributions; - the choices offered to developers when building an extension: the Python public C API, cffi, ...; - testing of a binary module across various platforms; - troubleshooting & debugging an extension: the basics you need to tackle most common issues.

Yes, It's Time to Learn Regular Expressions

Al Sweigart
Saturday 4:30 p.m.–5 p.m. in Oregon Ballroom 203–204

Regular expressions have a reputation as opaque and inscrutable. However, the basic concepts behind "regex" and text pattern recognition are simple to grasp. This talk is for any programmer who isn't familiar with Python's re module and its best practices. Stop putting it off, it's time to learn regular expressions!
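For a taste of the `re` module, here is a small sketch. The phone-number pattern and sample text are made up for illustration and are not taken from the talk.

```python
import re

# Three digits, a hyphen, three digits, a hyphen, four digits.
# The parentheses create groups that can be pulled out individually.
phone_pattern = re.compile(r"(\d{3})-(\d{3})-(\d{4})")

text = "Call 415-555-1011 today, or 415-555-9999 after hours."

first_match = phone_pattern.search(text)
area_code = first_match.group(1)

all_numbers = [match.group(0) for match in phone_pattern.finditer(text)]
```

Writing the pattern as a raw string (the `r` prefix) keeps the backslashes from being interpreted by Python before the regex engine sees them.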