Posters
Building low-carbon websites with Wagtail & Django
How much do you know about websites’ carbon footprint? The web’s environmental sustainability is an important consideration for some, and an afterthought for others. But everyone needs a better understanding of exactly what moves the needle – and simple things that can reduce one’s footprint.
Wagtail is an open source content management system built with Python and Django. In recent years, the Wagtail project has invested heavily in lowering websites' carbon footprint. We will show some of the background research, the major sustainability improvements made to Wagtail, and how to leverage these capabilities to reduce your own website's carbon footprint, including:
- Automated checks that are available to identify improvements
- Images: beyond file size, what moves the needle
- Tools to measure the energy consumption of Python code specifically
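As one concrete example of the last point, the open-source codecarbon package estimates the energy use and CO₂ emissions of a block of Python code (named here only as an illustration; the poster may cover different tools):

```python
from codecarbon import EmissionsTracker

# estimate the emissions of a block of Python code
tracker = EmissionsTracker()
tracker.start()
total = sum(i * i for i in range(10_000_000))  # workload to measure
emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent
print(f"{emissions_kg:.6f} kg CO2eq")
```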
Weather Data along the International Space Station Orbits
We present a Python workflow that performs several web scraping tasks to gather future positions of the International Space Station (ISS) and weather forecast data (surface temperature, surface pressure, cloud coverage, etc.) at those positions. We then perform reverse geocoding to identify the countries and oceans of the individual locations. The collected data are combined to create interactive maps on which, at any location along the ISS's future orbits, we can pinpoint the weather conditions and the country (with its associated flag) or the ocean. We also plot the major world cities along the orbits. This work can be used by anyone who wants to know when the ISS will fly over their city and what the weather will be.
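As a rough illustration of the data-gathering step, the public Open Notify API returns the ISS's current position (the poster's workflow goes further, scraping predicted future positions and weather forecasts):

```python
import requests

# fetch the ISS's current position from the public Open Notify API
resp = requests.get("http://api.open-notify.org/iss-now.json", timeout=10)
pos = resp.json()["iss_position"]
print(float(pos["latitude"]), float(pos["longitude"]))
```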
Solving the Python-JavaScript dilemma with Reflex, a modern OSS Python web framework
Reflex is an open-source framework (Apache 2.0, 20k+ GitHub stars) empowering Python developers to build internal data, AI, and web apps faster, with no JavaScript required. Build both your frontend and backend in a single language, Python (pip install reflex).
Under the hood, we transpile your app into a React frontend and a FastAPI backend, with all of your app logic running server-side in Python. Reflex automatically syncs your UI with your app's state through a WebSocket connection. Build and deploy apps faster with 60+ UI components, AI features, and single-command deployment.
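For a flavor of the single-language model, here is a minimal counter app adapted from Reflex's introductory examples (run it with `reflex run`):

```python
import reflex as rx

class State(rx.State):
    count: int = 0

    def increment(self):
        # event handler runs server-side; the UI syncs over WebSocket
        self.count += 1

def index():
    return rx.vstack(
        rx.heading(State.count),
        rx.button("Increment", on_click=State.increment),
    )

app = rx.App()
app.add_page(index)
```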
Atom to Innovation: Machine Learning for Materials Science
Materials science is traditionally limited by time-consuming experimental methods for predicting new material properties. Our poster demonstrates how machine learning can dramatically accelerate materials discovery through computational techniques. Using Python libraries like PyTorch, DScribe, and ASE, we simulate and predict material properties before the materials are physically created.
Imagine designing materials for solar panels, electronic devices, or medical implants through a computational model. Our approach uses neural networks to transform atomic structures into predictive insights, identifying patterns that human researchers might miss. We can predict critical material characteristics, such as mechanical strength and electrical conductivity, with unprecedented speed and accuracy.
The poster will help connect computational methods with practical scientific research, providing an interactive demonstration of how machine learning can enhance materials design.
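As a sketch of the featurization step, ASE can build a crystal structure and DScribe can turn it into a machine-learnable descriptor (the parameter values below are illustrative):

```python
from ase.build import bulk
from dscribe.descriptors import SOAP

# build a copper crystal and compute a SOAP descriptor for it
atoms = bulk("Cu", "fcc", a=3.6)
soap = SOAP(species=["Cu"], r_cut=5.0, n_max=8, l_max=6, periodic=True)
features = soap.create(atoms)  # array that could feed a PyTorch model
print(features.shape)
```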
PyFMS: accessing Fortran-based methods for Python climate and weather models
A software productivity gap exists between legacy climate and weather models and the capabilities of high performance computing infrastructures. Hardware-independent, Python-based climate and weather models are in development to close this gap. To ease data generation and management for these models, and to build on existing capabilities, legacy methods are leveraged through interfaces to Python built-in tools and through language interoperability methods. Specifically, we describe the creation and use of a Fortran-C-Python interface that exposes methods from the Geophysical Fluid Dynamics Laboratory's (GFDL) Fortran-based Flexible Modeling System (FMS) in Pace, a climate and weather model written in a Python domain-specific language.
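A generic sketch of the Fortran-C-Python pattern: a Fortran routine exposed through iso_c_binding can be loaded from Python with ctypes (the library and symbol names below are hypothetical, not PyFMS's actual interface):

```python
import ctypes

# load a shared library built from Fortran sources with iso_c_binding wrappers
lib = ctypes.CDLL("libfms_wrappers.so")  # hypothetical library name

# declare the C signature of a hypothetical wrapped Fortran routine
lib.fms_init.argtypes = []
lib.fms_init.restype = None
lib.fms_init()
```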
Visual Prototyping for Woven Bands
How do you create a custom design for a woven band in an unfamiliar weaving style? Weaving relies on repeating patterns, so it can be easy to describe how a project should look, but actually picturing it is a different matter.
My goal was to design a band with diamonds, mountains, and other southwestern elements to be woven on an inkle loom. Plain weave on an inkle loom only uses repeating A-B patterns, which doesn't allow for shapes like a diamond. I wanted to weave in a style where the pattern is built in when you set up the loom with the warp, rather than manipulated by hand for each row, which is more error-prone. I'd heard of an A-B-A-C pattern style known as Turned Krokbragd, which could work. This weaving style originally comes from Norway, so finding existing patterns for my southwestern USA theme would be unlikely.
This Python project let me visualize my own patterns, try out different colors, and add additional design elements like a border around a central pattern. The prototyping tool creates a more realistic visual by accounting for the physical spacing of the warp threads.
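A minimal sketch of the idea, drawing each warp thread from a repeating A-B-A-C sequence as a colored strip (colors and spacing below are illustrative):

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

pattern = ["A", "B", "A", "C"] * 8  # repeating A-B-A-C warp order
colors = {"A": "#8b0000", "B": "#f5deb3", "C": "#1f3d5c"}  # illustrative palette

fig, ax = plt.subplots(figsize=(8, 2))
for i, thread in enumerate(pattern):
    # each warp thread is a narrow strip; the gap models physical thread spacing
    ax.add_patch(Rectangle((i * 0.12, 0), 0.10, 1, color=colors[thread]))
ax.set_xlim(0, len(pattern) * 0.12)
ax.set_ylim(0, 1)
ax.axis("off")
plt.show()
```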
Improving in Chess using Python
Chess, one of the most iconic board games in history, has a global community nearing a billion players. Powerful chess engines have democratized access to advanced chess analysis tools - tools that even top grandmasters use to analyze and improve their games.
The good news is that these top chess engines are free, and as Python programmers, we can utilize them to enhance our own chess skills. Chess positions and moves are denoted using standard notations like PGN and FEN, making it easier to analyze and visualize them programmatically.
This poster will explore how to use Python to:
- Parse chess games using the python-chess library.
- Analyze games with the help of chess engines to identify mistakes and learn from them.
- Visualize games and positions using the chess-board library.
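As a sketch of the first two steps, python-chess can read a PGN file and query a UCI engine (this assumes a Stockfish binary on your PATH and a hypothetical game.pgn file):

```python
import chess.pgn
import chess.engine

with open("game.pgn") as f:  # hypothetical PGN file
    game = chess.pgn.read_game(f)

engine = chess.engine.SimpleEngine.popen_uci("stockfish")  # assumes Stockfish on PATH
board = game.board()
for move in game.mainline_moves():
    board.push(move)
    # evaluate each position to a fixed depth and report the score from White's view
    info = engine.analyse(board, chess.engine.Limit(depth=12))
    print(board.fullmove_number, move.uci(), info["score"].white())
engine.quit()
```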
From Monolithic to Mosaic: Collaborative SLM Ecosystems for Cost-Efficient, Scalable, and Edge-Ready Solutions
Large Language Models (LLMs) excel at tasks like filtering, summarization, and code generation, but their high computational requirements lead to steep costs and limited scalability. This poster proposes a novel approach that breaks from the constraints of monolithic LLMs, moving toward a lightweight ecosystem of open-source Small Language Models (SLMs) coordinated by a central Master Agent.
In this architecture, the Master Agent dynamically assigns requests to specialized Worker Agents, each running an SLM (Phi-3, Orca Mini, etc.) tuned for a particular function. By distributing tasks among smaller, focused models, we reduce resource consumption and cost. Scaling up or adapting to new tasks becomes as simple as adding or swapping Worker Agents, avoiding the complexity and overhead of deploying an all-encompassing large model.
Beyond cost and scalability, this approach addresses the growing demand for edge-compatible solutions. Compact SLMs integrate seamlessly into edge devices ranging from IoT sensors to mobile apps, enabling low-latency, privacy-preserving, and even offline language processing. This setup empowers developers to bring advanced language tasks directly to the user, regardless of infrastructure or connectivity constraints.
Our implementation is written primarily in Python and supported by open-source frameworks such as LangChain and Hugging Face to foster community-driven enhancements. The poster will showcase how this multi-agent framework optimizes resource utilization, simplifies maintenance through modular specialization, and ensures robust failover for uninterrupted performance. Attendees will learn how to integrate these components into their own projects, benefiting from a flexible, future-proof platform that's both affordable and adaptable.
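A minimal, framework-free sketch of the routing idea (the model names and the worker call are illustrative stand-ins for real SLM inference):

```python
# Master Agent routing requests to specialized Worker Agents
WORKERS = {
    "summarize": "phi3",      # illustrative task-to-model assignments
    "code": "orca-mini",
}

def run_worker(model: str, prompt: str) -> str:
    # stand-in for a real SLM call, e.g. to a local inference server
    return f"[{model}] response to: {prompt}"

def master_agent(task_type: str, prompt: str) -> str:
    model = WORKERS.get(task_type)
    if model is None:
        raise ValueError(f"no worker registered for task {task_type!r}")
    return run_worker(model, prompt)

print(master_agent("summarize", "Summarize this incident report..."))
```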
Migrating from legacy to pythonic toolsets to create an ecosystem of earth-system modeling tools
The Geophysical Fluid Dynamics Laboratory (GFDL) of the National Oceanic and Atmospheric Administration (NOAA) strives to understand the Earth's climate through high resolution computer models. These models are tested for reproducibility, run, post-processed, and analyzed using tools created in the Flexible Modeling System (FMS) Runtime Environment (FRE). The previous release, FRE Bronx, is written in Perl and encompasses one monolithic workflow. While FRE Bronx has proven to work, it is not without its limitations. Functionality of the workflow cannot be extended to other workflows. Additionally, most steps of the workflow are not easily accessible to the user, obscuring what each step actually does. This monolithic design makes it difficult to build a flexible workflow that can support legacy systems in addition to modern methodologies.
In order to make NOAA GFDL models more accessible and easier to develop, the next generation of FRE tools is being rewritten on foundations of portability, flexibility, and simplicity through a Pythonic infrastructure. This new ecosystem deconstructs the functionality of the monolithic tools into a “tool subtool” structure through a re-envisioned user interface called FRE-cli. Rewriting the FRE tools in Python modernizes the code and increases modularity, maintainability, and extensibility. The tool infrastructure is CI-tested with pytest and strives to conform to Python best practices, ultimately improving the code's quality. The FRE rewrite project is in active development as more functionality and tools are added to FRE.
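As an illustration only (not FRE-cli's actual commands), a tool/subtool layout is straightforward to express with click:

```python
import click

@click.group()
def fre():
    """Hypothetical FRE-style entry point grouping tools and subtools."""

@fre.group()
def pp():
    """Tool: post-processing."""

@pp.command()
@click.option("--experiment", required=True)
def configure(experiment):
    """Subtool: configure post-processing for an experiment."""
    click.echo(f"configuring post-processing for {experiment}")

if __name__ == "__main__":
    fre()
```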
Data About Your Data: Open Source Python Tools for Scientific Metadata Management
Within many scientific domains, such as computational biology, data generation continues to be explosive. Along with this newly generated data comes its associated metadata. This additional metadata is beneficial for scientific researchers because it adds context and enhances downstream data analysis.
However, with this surge of metadata come new challenges in sharing, storing, writing, and integrating metadata into sample analysis, due to the lack of standardized metadata handling. This issue arises primarily because more focus has been given to the actual data than to the data and its metadata together. These challenges affect numerous scientific fields, including genomics, astronomy, and climate science.
To solve this problem, we present an open source, Python-based toolkit (PEPkit) for organizing large-scale, sample-intensive scientific research projects and their associated metadata. We cover the entire life cycle of data and its metadata, including metadata capture, storage, standardization, validation, interoperability, and shareability. We also discuss reusable data processing with integrated metadata using metadata-aware workflow management systems (WFMS).
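A small sketch of loading a PEP-style project with peppy, one component of the toolkit (the config file name is assumed, and this presumes peppy's Project API):

```python
import peppy

# load a PEP project configuration describing samples and their metadata
project = peppy.Project("project_config.yaml")  # hypothetical config file
for sample in project.samples:
    print(sample.sample_name)
```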
Beautiful and Balanced: Using Color Theory in Data Visualization
When presenting data visually, it's important to choose color palettes which do not skew the viewer's perception of the data relationships you're illustrating. Our brains interpret color contextually: proximity to other colors changes the way we perceive a particular color, making it appear darker or lighter, more prominent or more demure. With some understanding of these color relationships, you can make palette choices which do not inadvertently introduce bias to the data being displayed.
In this poster presentation, we'll take a peek at some basic color theory based on the work of Josef Albers, look at what color weight is, and see how you can use it to evaluate your palette choices. We'll go over some strategies for choosing color palettes that keep your data presentation both unbiased and visually pleasing, and list some resources for learning more.
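As one concrete way to compare the visual weight of palette entries, you can rank colors by their WCAG relative luminance (the palette below is illustrative):

```python
def relative_luminance(hex_color: str) -> float:
    """WCAG relative luminance of an sRGB hex color like '#1f77b4'."""
    def linearize(c: int) -> float:
        c /= 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (1, 3, 5))
    return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)

palette = ["#1f77b4", "#ff7f0e", "#2ca02c"]  # illustrative palette
for color in sorted(palette, key=relative_luminance):  # darkest to lightest
    print(color, round(relative_luminance(color), 3))
```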
ExecExam: Streamlining Python Assessments with Automation and Personalized Feedback
Assessing Python executable examinations at scale is challenging for educators who want to provide timely and actionable feedback to their students. Executable examinations, where participants write, modify, or execute Python code to solve specific problems, are widely used in educational environments to foster learning and evaluate student progress. However, delivering accurate and constructive feedback on these assessments at scale remains difficult. This poster introduces ExecExam, a tool designed to streamline and automate the assessment of Python programming tasks through Pytest integration, enhancing student learning and the evaluation process.
ExecExam leverages Pytest to automatically verify student solutions against pre-defined test cases but extends its functionality through advanced features such as tailored feedback reports, a user-friendly interface, and integration with large language models for enhanced programming advice. These additions make ExecExam more than just a Pytest test runner, offering educators and students a comprehensive assessment and learning tool. The system generates detailed reports summarizing test outcomes, highlighting successes as well as areas needing improvement.
ExecExam also offers optional integration with large language models (LLMs), which provide students with detailed, step-by-step recommendations on how to fix their code. With the integration of LiteLLM, a unified API, the tool provides sophisticated, context-aware feedback for failing tests, guiding students toward effective resolution of errors while fostering a deeper understanding of programming concepts. Unlike simply receiving a list of failed tests, students can obtain specific suggestions for code fixes or alternative approaches that will pass the tests. This method cultivates a student's programming education by giving them concrete steps for improvement.
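As a hedged sketch of that step, LiteLLM's unified completion API can be called like this (the model name and failure text are illustrative):

```python
from litellm import completion

failure_report = "FAILED test_sum.py::test_add - AssertionError: assert 3 == 4"
response = completion(
    model="gpt-4o-mini",  # any LiteLLM-supported model identifier
    messages=[{
        "role": "user",
        "content": f"Explain this failing test and suggest a fix:\n{failure_report}",
    }],
)
print(response.choices[0].message.content)
```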
ExecExam helps instructors identify areas for improvement and empowers students to acknowledge their mistakes and enhance their skills. Currently used by college students and instructors, this tool proves valuable in computer science education, particularly for Python learners.
AmpliconFinder: A Python-Based Bioinformatics Pipeline for Early Stage Cancer Diagnosis
Early diagnosis is vital for significantly improving treatment outcomes for cancer patients, as the chances of successful treatment and survival greatly increase when the disease is caught before it spreads. A promising area of research in early cancer detection focuses on tiny, circular pieces of DNA called extrachromosomal circular DNAs (ecDNAs) or amplicons. These DNA circles are produced by tumor cells and can be identified in the blood or tissue samples of patients with various types of cancer. Recent advancements in DNA sequencing technology, capable of generating individual sequence reads as long as 100 kb, have made detecting these ecDNAs feasible. As sequencing costs continue to decrease, ecDNAs are now being sequenced from thousands of patients, creating vast datasets that require sophisticated computational tools for analysis.
To address this need, we developed a bioinformatics tool named AmpliconFinder, which integrates a convolutional neural network (CNN) trained on long-read sequencing data from cancer patients with a robust bioinformatics pipeline—all implemented in Python. The tool leverages powerful libraries such as scikit-learn and TensorFlow for CNN development, and pandas, numpy, and matplotlib for data processing and visualization. Circular DNA amplicons, which serve as early-stage cancer biomarkers, can be studied using AmpliconFinder, providing a pathway for doctors to harness cost-effective sequencing data for simple, non-invasive early cancer detection, ultimately leading to more effective treatments.
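A simplified sketch of what a 1-D CNN for one-hot-encoded reads might look like in TensorFlow (the architecture shown is illustrative, not AmpliconFinder's actual network):

```python
import tensorflow as tf

# classify reads (length x 4 one-hot bases) as ecDNA-derived or not
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10_000, 4)),  # assumed read length and encoding
    tf.keras.layers.Conv1D(32, 15, activation="relu"),
    tf.keras.layers.MaxPooling1D(4),
    tf.keras.layers.Conv1D(64, 15, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # ecDNA vs. linear read
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```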
A visual exploration of vectors
Vector embeddings are a way to encode a text or image as an array of floating-point numbers, and they make it possible to perform similarity search on many kinds of content. This poster will take you on a journey through vectors with as many diagrams and graphs as we can fit, to help you understand the differences between embedding models, distance metrics, quantization schemes, and input modalities. All of the visuals are generated using open-source tools like matplotlib and pandas, and all the code is available in a repo for you to try yourself. ↖ Come on a vector voyage! ↗
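For a taste of the distance metrics the poster compares, here are cosine similarity and Euclidean distance between two toy vectors:

```python
import numpy as np

# toy stand-ins for embedding-model output
a = np.array([0.1, 0.3, 0.8])
b = np.array([0.2, 0.1, 0.9])

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
euclidean = np.linalg.norm(a - b)
print(f"cosine similarity: {cosine:.3f}, euclidean distance: {euclidean:.3f}")
```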
Painting with Python (poster)
Caleb is a mathy lenticular-hologram artist, here to share the basics of making art with Python. Each piece is a self-contained Python script that uses as few dependencies as possible (mostly just the Python Imaging Library). No GPU acceleration. No AI. Just good old-fashioned vanilla Python. I'll be open-sourcing (for the first time) a few of the pieces of art I've done. For a preview of what mathy lenticular holograms look like, check out https://gods.art.
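In that spirit, a minimal self-contained Pillow script (the pattern here is a toy illustration, not one of Caleb's pieces):

```python
import math
from PIL import Image

size = 400
img = Image.new("RGB", (size, size))
for x in range(size):
    for y in range(size):
        # simple trigonometric field; real pieces use far mathier functions
        v = int(128 + 127 * math.sin(x * y / 2000))
        img.putpixel((x, y), (v, 64, 255 - v))
img.save("piece.png")
```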
Datasette: an ecosystem of tools for finding stories in data
Datasette is an open source Python web application for exploring, analyzing and publishing data, built on top of SQLite.
In the seven years since its first release, Datasette has grown a plugin system with over 150 plugins, enabling it to be applied to a continually growing set of data analysis, visualization and manipulation challenges.
Datasette's sister project, LLM, provides a Python abstraction over a large number of different Large Language Models. My recent focus has been bringing these two worlds together, building plugins for Datasette that take advantage of LLMs to provide features like structured data extraction from text and images, text-to-SQL query assistance and data cleanup operations driven by prompts.
Come and learn more about Datasette, its plugin ecosystem and ways it can be applied to fields such as data journalism, exploratory data analysis, historical archives and more.
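A hedged sketch of LLM's Python API (the model name is illustrative, and this assumes the corresponding plugin and API key are configured):

```python
import llm

# ask a model for text-to-SQL help, the kind of assistance the Datasette plugins build on
model = llm.get_model("gpt-4o-mini")
response = model.prompt(
    "Write a SQL query counting rows per year in a 'crimes' table."
)
print(response.text())
```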
Conformal Prediction: Quantifying Confidence in ML Model Predictions
Machine learning models are powerful and helpful, but their value increases when we understand how confident we can be in their predictions. Enter conformal prediction. Using Python libraries like MAPIE, we can create upper and lower bounds of a prediction interval for point estimates. For classification tasks, we can use the underlying probabilities of algorithms like logistic regression to understand how confident the model is in choosing a given outcome.
From a practical standpoint, obtaining a clear measurement of confidence like this allows practitioners to make better-informed business decisions. For example, a subscription service may decline to launch a new feature if its underlying model cannot classify user types with high confidence. Similarly, a model that predicts monetary amounts won't be very helpful if the prediction intervals are especially wide and include extra noise.
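A small sketch of prediction intervals with MAPIE, assuming its pre-1.0 MapieRegressor API (data and alpha are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from mapie.regression import MapieRegressor

rng = np.random.default_rng(0)
X = np.arange(100).reshape(-1, 1)
y = 3 * X.ravel() + rng.normal(0, 5, 100)  # noisy linear toy data

mapie = MapieRegressor(LinearRegression())
mapie.fit(X, y)
y_pred, y_pis = mapie.predict(X, alpha=0.1)  # 90% prediction intervals
print(y_pred[:3], y_pis[:3, :, 0])           # point estimates and lower/upper bounds
```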
ClimateML: Machine Learning for Climate Model Downscaling
Global climate models provide crucial insights but lack local precision, limiting their practical application for regional planning and adaptation strategies. This poster presents a Python-based framework for climate model downscaling, using deep learning to bridge the gap between global and local predictions.
The approach leverages PyTorch's neural network capabilities and Python's scientific computing stack for processing complex climate data. The framework combines xarray for multi-dimensional data handling, dask for distributed computing, and custom PyTorch layers optimized for atmospheric pattern recognition. Through MLflow's experiment tracking and Panel's interactive visualizations, we demonstrate how different neural architectures affect prediction accuracy across various regions and climate variables. This downscaling approach significantly improves prediction accuracy, reducing the Root Mean Square Error (RMSE) by 40-65% compared to traditional statistical methods, with coefficients of determination (r²) improving markedly for temperature predictions in complex terrain regions.
By integrating domain-specific libraries like MetPy with NumPy's computational capabilities, we showcase how Python enables scalable climate science. Visitors to our poster can explore live examples showing how machine learning improves climate predictions. The implementation demonstrates how Python makes it possible to process large amounts of climate data efficiently and create accurate local forecasts.
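A small sketch of the data-handling layer: xarray with dask-backed chunks for lazy, out-of-core processing (the file and variable names below are assumed):

```python
import xarray as xr

# lazily open a gridded climate dataset with dask-backed chunks
ds = xr.open_dataset("era5_temperature.nc", chunks={"time": 365})  # hypothetical file
monthly_mean = ds["t2m"].groupby("time.month").mean("time")        # variable name assumed
print(monthly_mean)
```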
Putting a Face to the Pattern: Applying Time-Series Classification & Clustering to Transmission Load Patterns
In recent times, time-series-based machine learning has become increasingly relevant to the utilities industry. As demand for electricity grows and new forms of electricity usage emerge, so does the necessity to identify load growth patterns in real time. While Time Series Classification and Clustering is still an emerging field of study in machine learning, it has become one of the most relevant topics in the utilities industry. Efforts to apply Time Series Classification/Clustering in the utilities space, specifically pertaining to load, are ongoing, with a mix of successes and failures. Aided by packages such as aeon and NumPy, experiments have included classifying existing load patterns with models such as HIVECOTEv2 and DTW+KNN at various granularities (yearly, monthly, etc.). Other experiments have included transforming load data into the frequency domain through the discrete Fourier transform and applying similar models. Challenges have included classifying classes with low class priors (few cases in the training data), identifying partial time series, and identifying new load patterns in conjunction with existing loads.
This poster intends to highlight ongoing experiments, existing challenges, and the packages used.
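For a flavor of the tooling, here is a DTW-based nearest-neighbor classifier from aeon on toy daily load profiles (API usage assumed from aeon's distance-based classifiers; the data are synthetic):

```python
import numpy as np
from aeon.classification.distance_based import KNeighborsTimeSeriesClassifier

rng = np.random.default_rng(0)
X = rng.random((20, 1, 96))  # 20 daily load profiles, 96 quarter-hour readings
y = np.array([0, 1] * 10)    # toy load-pattern labels

clf = KNeighborsTimeSeriesClassifier(distance="dtw", n_neighbors=1)
clf.fit(X, y)
print(clf.predict(X[:3]))
```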
Audience – Machine Learning Practitioners/Enthusiasts, People in Utilities
Level – Some Experience