Enabling High Throughput Immunobiology by Integrating Django, numpy, matplotlib, and SQLAlchemy

Jacob Rothenbuhler

Audience level:
Intermediate
Category:
Best Practices/Patterns

Description

Nodality is applying a novel technology, Single Cell Network Profiling (SCNP), to reveal disease biology and predict clinical outcomes. We face unique engineering challenges in supporting lab workflows, mining complex data, and presenting that data in compelling interactive visuals. We will share design and implementation considerations for integrating the heterogeneous software tools needed to meet these challenges.

Abstract

Nodality is a biotechnology startup applying a novel flow cytometry-based technology, Single Cell Network Profiling (SCNP), in oncology and autoimmunity to reveal underlying disease biology. We use SCNP to characterize individual patients with the aim of selecting optimal, individualized treatment strategies. For every patient sample, we collect the response of every cell, across multiple cell types, to a variety of molecular stimuli; these responses are measured on multiple readouts representing key biological signaling pathways. We have developed two categories of applications to support these efforts: Python scripts that let computational scientists analyze data from the command line, and Django-based web applications that give laboratory scientists tools to expedite their workflows and visualize their results. Here we show how these two application sets have been integrated through a shared codebase of core domain logic, which makes the code more reusable, testable, and flexible.
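To make the idea of shared core domain logic concrete, here is a minimal sketch, not Nodality's actual code. It assumes a hypothetical metric (the log2 ratio of median signal between stimulated and unstimulated cells) and hypothetical function names; the point is only that a command-line script and a web view can both call the same function.

```python
"""Hypothetical sketch: one core function shared by a CLI script and a web view."""
import argparse
import json
import math
from statistics import median


def log2_fold_change(unstimulated, stimulated):
    """Core domain logic (assumed metric): log2 ratio of median signal."""
    return math.log2(median(stimulated) / median(unstimulated))


def cli_main(argv=None):
    """Command-line front end: parse comma-separated intensities, print the metric."""
    parser = argparse.ArgumentParser(description="Report log2 fold change.")
    parser.add_argument("--unstimulated", required=True)
    parser.add_argument("--stimulated", required=True)
    args = parser.parse_args(argv)
    unstim = [float(x) for x in args.unstimulated.split(",")]
    stim = [float(x) for x in args.stimulated.split(",")]
    print(f"{log2_fold_change(unstim, stim):.3f}")


def view_payload(unstimulated, stimulated):
    """Web front end: the JSON body a Django view could return to a plot widget."""
    value = log2_fold_change(unstimulated, stimulated)
    return json.dumps({"log2_fold_change": value})
```

Because both front ends delegate to `log2_fold_change`, the metric can be unit-tested once, without a web server or a shell.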

Nodality faces a number of unique software engineering challenges. SCNP is a complex experiment that generates very complex data sets, and powerful tools are needed to analyze and visualize these data. Command-line scripts use libraries such as NumPy, pandas, PyMongo, and SQLAlchemy to mine information and develop models critical to enhancing our understanding of disease. Django-based web applications simplify the design and execution of complicated laboratory workflows. We also use Django to deploy interactive visualizations that aid in interpreting results. Making all of these technologies work together is difficult. We have created packages that abstract numerous dependencies into a vocabulary suitable for SCNP. Extracting SCNP-related logic into self-contained objects has produced a robust codebase that allows developers to stay ahead of the rapid pace of a biotechnology startup.
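One way to abstract storage dependencies into a domain vocabulary is sketched below. It is an illustration under stated assumptions, not the packages described in this talk: `Readout`, `ReadoutStore`, `InMemoryReadoutStore`, and `mean_by_stimulus` are hypothetical names, and the in-memory backend stands in for one that might wrap SQLAlchemy or PyMongo.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Readout:
    """One measured signaling readout for one cell population (hypothetical model)."""
    sample_id: str
    population: str
    stimulus: str
    value: float


class ReadoutStore(Protocol):
    """Domain-level interface: analysis code depends on this, not on a driver."""

    def readouts_for(self, sample_id: str) -> List[Readout]: ...


class InMemoryReadoutStore:
    """Stand-in backend for tests; a real one might wrap a database session."""

    def __init__(self, readouts: List[Readout]) -> None:
        self._by_sample: Dict[str, List[Readout]] = {}
        for r in readouts:
            self._by_sample.setdefault(r.sample_id, []).append(r)

    def readouts_for(self, sample_id: str) -> List[Readout]:
        return list(self._by_sample.get(sample_id, []))


def mean_by_stimulus(store: ReadoutStore, sample_id: str) -> Dict[str, float]:
    """Average readout value per stimulus, using only the domain interface."""
    grouped: Dict[str, List[float]] = {}
    for r in store.readouts_for(sample_id):
        grouped.setdefault(r.stimulus, []).append(r.value)
    return {stim: sum(vals) / len(vals) for stim, vals in grouped.items()}


# Illustrative values only; sample IDs, populations, and stimuli are made up.
store = InMemoryReadoutStore([
    Readout("S1", "T cells", "IL-6", 1.5),
    Readout("S1", "B cells", "IL-6", 2.5),
    Readout("S1", "T cells", "none", 1.0),
])
summary = mean_by_stimulus(store, "S1")
```

Since `mean_by_stimulus` sees only `ReadoutStore`, the same analysis function could run in a script or behind a web view, and swapping the backend would not require changing it.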

Python makes it possible to create the diverse tools needed to support the SCNP workflow, from experiment design through the laboratory to data analysis. Integrating these tools can be challenging, but the benefits of doing so extend beyond code reuse. We will share design and implementation considerations for bringing together these heterogeneous software elements and provide examples of best practices we have developed.