Posters

Putting a Face to the Pattern: Applying Time-Series Classification & Clustering to Transmission Load Patterns

Sunday 10 a.m.–1 p.m. in Hall A

In recent times, Time Series-Based Machine Learning has been increasingly relevant to the utilities industry. As demand in electricity grows and new forms of electricity usage emerges, so does the necessity to identify load growth patterns in real-time. As such, while Time Series Classification and Clustering is still an emerging field of study in the Machine Learning space, it has been one of the most relevant topics in the utilities industry. Efforts into applying Time Series Classification/Clustering in the utilities space, specifically pertaining to load, are ongoing, with a mix of both successes and failures. Aided by packages such as Aeon and Numpy, experimentation have included classification of existing load patterns through various models such as HIVECOTEv2, DTW+KNN, etc. at various granularities (yearly, monthly, etc.). Other experiments have included transforming load data into frequencies through discrete fourier transform and applying similar models. Some of the challenges faced have included classifying classes with low class priors (few cases in training data), identifying partial time series, and identifying new load patterns in conjunction with existing loads.

This poster intends to highlight ongoing experiments, existing challenges, and the packages used.

Audience – Machine Learning Practitioners/Enthusiasts, People in Utilities Level – Some Experience

How to refactor with confidence

Phillip Schanely

Sunday 10 a.m.–1 p.m. in Hall A

Refactoring is a crucial part of the development process. But refactoring can be scary: how do you know that your changes haven't introduced subtle bugs? Perhaps your refactored code now raises an exception in some cases, or has an edge case that returns a slightly different value. Unit tests can help, but what can you do about unexpected differences in behavior?

This poster will show you how to use CrossHair's diffbehavior command, your secret weapon for confident refactoring. CrossHair uses a constraint solver to efficiently find inputs that reveal discrepancies between your original code and the refactored version.

Stop by to learn more about CrossHair's diffbehavior command and start refactoring with confidence!

Conformal Prediction: Quantifying Confidence in ML Model Predictions

Alexander Olden

Sunday 10 a.m.–1 p.m. in Hall A

Machine learning models are powerful and helpful, but their value increases when we understand how confident we can be in their predictions. Enter conformal prediction. Using Python libraries like MAPIE, we can create upper and lower bounds of a prediction interval for point estimates. For classification tasks, we can use the underlying probabilities of algorithms like logistic regression to understand how confident the model is in choosing a given outcome.

From a practical standpoint, obtaining a clear measurement of confidence like this allows practitioners to make relevant business decisions, for example. A subscription service may decline to launch a new feature if its underlying model cannot classify user types with high confidence. Similarly, a model that predicts monetary amounts won't be very helpful if the prediction intervals are especially wide and include extra noise.

Datasette: an ecosystem of tools for finding stories in data

Simon Willison

Sunday 10 a.m.–1 p.m. in Hall A

Datasette is an open source Python web application for exploring, analyzing and publishing data, built on top of SQLite.

In the seven years since its first release Datasette has grown a plugin system with over 150 plugins, enabling it to be applied to a continually growing set of data analysis, visualization and manipulation challenges.

Datasette's sister project, LLM, provides a Python abstraction over a large number of different Large Language Models. My recent focus has been bringing these two worlds together, building plugins for Datasette that take advantage of LLMs to provide features like structured data extraction from text and images, text-to-SQL query assistance and data cleanup operations driven by prompts.

Come and learn more about Datasette, its plugin ecosystem and ways it can be applied to fields such as data journalism, exploratory data analysis, historical archives and more.

From Monolithic to Mosaic: Collaborative SLMs Ecosystems for Cost-Efficient, Scalable, and Edge-Ready solutions

Sunday 10 a.m.–1 p.m. in Hall A

Large Language Models (LLMs) excel at tasks like filtering, summarization, and code generation, but their high computational requirements lead to steep costs and limited scalability. This poster proposes a novel approach that breaks from the constraints of monolithic LLMs, moving toward a lightweight ecosystem of open-source Small Language Models (SLMs) coordinated by a central Master Agent.

In this architecture, the Master Agent dynamically assigns requests to specialized Worker Agents, each running an SLMs (Phi3, Orca-mini, etc) tuned for a particular function. By distributing tasks among smaller, focused models, we reduce resource consumption and cost. Scaling up or adapting to new tasks becomes as simple as adding or swapping Worker Agents, avoiding the complexity and overhead of deploying an all-encompassing large model.

Beyond cost and scalability, this approach addresses the growing demand for edge-compatible solutions. Compact SLMs integrate seamlessly into edge devices ranging from IoT sensors to mobile apps which enables low-latency, privacy-preserving, and even offline language processing. This setup empowers developers to bring advanced language tasks directly to the user, regardless of infrastructure or connectivity constraints.

Our implementation, primarily in Python and supported by open-source frameworks like Langchain, Hugging Face LLMs, etc. to foster community-driven enhancements. The poster will showcase how this multi-agent framework optimizes resource utilization, simplifies maintenance through modular specialization, and ensures robust failover for uninterrupted performance. Attendees will learn how to integrate these components into their own projects, benefiting from a flexible, future-proof platform that’s both affordable and adaptable.

AmpliconFinder: A Python Based Bioinformatics Pipeline for Early Stage Cancer Diagnosis

Sunday 10 a.m.–1 p.m. in Hall A

Early diagnosis is vital for significantly improving treatment outcomes for cancer patients, as the chances of successful treatment and survival greatly increase when the disease is caught before it spreads. A promising area of research in early cancer detection focuses on tiny, circular pieces of DNA called extrachromosomal circular DNAs (ecDNAs) or amplicons. These DNA circles are produced by tumor cells and can be identified in the blood or tissue samples of patients with various types of cancer. Recent advancements in DNA sequencing technology, capable of generating individual sequence reads as long as 100 kb, have made detecting these ecDNAs feasible. As sequencing costs continue to decrease, ecDNAs are now being sequenced from thousands of patients, creating vast datasets that require sophisticated computational tools for analysis.

To address this need, we developed a bioinformatics tool named AmpliconFinder, which integrates a convolutional neural network (CNN) trained on long-read sequencing data from cancer patients with a robust bioinformatics pipeline—all implemented in Python. The tool leverages powerful libraries such as scikit-learn and TensorFlow for CNN development, and pandas, numpy, and matplotlib for data processing and visualization. Circular DNA amplicons, which serve as early-stage cancer biomarkers, can be studied using AmpliconFinder, providing a pathway for doctors to harness cost-effective sequencing data for simple, non-invasive early cancer detection, ultimately leading to more effective treatments.

ClimateML: Machine Learning for Climate Model Downscaling

Neeraj Pandey

Sunday 10 a.m.–1 p.m. in Hall A

Global climate models provide crucial insights but lack local precision, limiting their practical application for regional planning and adaptation strategies. This poster presents a Python-based framework for climate model downscaling, using deep learning to bridge the gap between global and local predictions.

The approach leverages PyTorch's neural network capabilities and Python's scientific computing stack for processing complex climate data. The framework combines xarray for multi-dimensional data handling, dask for distributed computing, and custom PyTorch layers optimized for atmospheric pattern recognition. Through MLflow's experiment tracking and Panel's interactive visualizations, we demonstrate how different neural architectures affect prediction accuracy across various regions and climate variables. This downscaling approach significantly improves prediction accuracy, reducing the Root Mean Square Error (RMSE) by 40-65% compared to traditional statistical methods, with correlation coefficients (r²) improving significantly for temperature predictions in complex terrain regions.

By integrating domain-specific libraries like metpy with NumPy's computational capabilities, we showcase how Python enables scalable climate science .Visitors to our poster can explore live examples showing how machine learning improves climate predictions. The implementation demonstrates how Python makes it possible to process large amounts of climate data efficiently and create accurate local forecasts.

Infomaid -- An Offline AI Text-Generative Tool To Support Ethical Learning With Textual Assistance

Oliver Bonham-Carter

Sunday 10 a.m.–1 p.m. in Hall A

Generative AI technologies are gaining popularity as tools to assist users in managing and completing common textual tasks. However, web-hosted AI tools create privacy concerns since users cannot safely experiment with their generative systems without leaving behind pieces of personal data.

To address this issue, we propose Infomaid – an innovative textual AI assistant driven by the Ollama AI framework. This offline tool serves as both an AI teaching tool and a textual assistant that is free from Internet dependency. By allowing users to submit prompts and receive organized output reports in Markdown format, Infomaid facilitates safe experimentation with AI-driven text processing, generation, and assistance.

Written in Python, Infomaid's modular design encourages users to experiment, extend, and customize its capabilities, fostering a hands-on and diverse learning experience. To facilitate the archival (or publication) of results, one of Infomaid’s key features is the ability to document outputs in neatly structured Markdown files, ensuring clarity and organization. This format enables easy integration into documentation workflows or collaborative projects.

The Infomaid project, hosted on GitHub as an open source project, is designed for educators, students, and professionals who want to explore and learn about ethical AI-driven text processing, generation, and assistance in a safe, private, and offline context. Whether summarizing documents, generating reports, drafting content, or organizing information, Infomaid serves to simplify complex tasks and workflows into seamless operations, making it a helpful resource for areas with limited Internet access or users prioritizing data privacy and security.

Data About Your Data: Open Source Python Tools for Scientific Metadata Management

Sunday 10 a.m.–1 p.m. in Hall A

Within many scientific domains, such as computational biology, there continues to be explosive data generation. Along with this newly generated data, also comes its associated metadata. This additional metadata is beneficial for scientific researchers, because it adds context and enhances downstream data analysis.

However, with the surge of metadata, there are now new challenges with sharing, storing, writing, and integrating metadata into sample analysis due to the lack of metadata handling and standardization. This issue arises primarily because more focus has been given to the actual data rather than both the data and its metadata. These challenges affect numerous scientific fields including genomics, astronomy, and climate science.

To solve this problem, we present an open source python-based toolkit (PEPkit) for organizing large-scale, sample-intensive scientific research projects and their associated metadata. We cover the entire life cycle of data and its metadata including: metadata capture, storage, standardization, validation, interoperability, and shareability. We also discuss reusable data processing with its integrated metadata using workflow management systems (WFMS) that are metadata aware.

ExecExam: Streamlining Python Assessments with Automation and Personalized Feedback

Sunday 10 a.m.–1 p.m. in Hall A

Assessing Python executable examinations at scale is challenging for educators who want to provide timely and actionable feedback to their students. Executable examinations, where participants write, modify, or execute Python code to solve specific problems, are widely used in educational environments to foster learning and evaluate student progress. However, delivering accurate and constructive feedback on these assessments at scale remains difficult. This poster introduces ExecExam, a tool designed to streamline and automate the assessment of Python programming tasks through Pytest integration, enhancing student learning and the evaluation process .

ExecExam leverages Pytest to automatically verify student solutions against pre-defined test cases but extends its functionality through advanced features such as tailored feedback reports, a user-friendly interface, and integration with large language models for enhanced programming advice. These additions make ExecExam more than just a Pytest test runner, offering educators and students a comprehensive assessment and learning tool. The system generates detailed reports summarizing test outcomes, highlighting successes as well as areas needing improvement.

ExecExam also has an optional inclusion of large language models (LLMs), which provide students with detailed, step-by-step recommendations on how to fix their code. With the integration of LiteLLM, a unified API, this tool provides sophisticated, context-aware feedback for failing tests, guiding students toward effective resolution of errors while fostering a deeper understanding of programming concepts. Unlike simply receiving a list of failed tests, students can obtain specific suggestions for code fixes or alternative approaches that will pass the tests. This method cultivates a student’s programming education by giving them steps on how to improve.

ExecExam helps instructors identify areas for improvement and empowers students to acknowledge their mistakes and enhance their skills. Currently used by college students and instructors, this tool proves valuable in computer science education, particularly for Python learners.

Solving the Python-Javascript dilemma with Reflex--a modern OSS Python web framework

elvis kahoro

Reflex is the open-source (Apache 2.0 & 20k+ GitHub stars) framework empowering Python developers to build internal (data, AI, and web) apps faster - with no JavaScript required. Build both your frontend and backend in a single language, Python. (pip install reflex)

Under the hood we transpile your app into a React frontend and FastAPI backend--with all of your app logic running server-side in Python. Reflex automatically syncs your UI with your app's state through a WebSocket connection. Build and deploy apps faster with 60+ UI components, AI, and single command deployment.

A visual exploration of vectors

Pamela Fox

Sunday 10 a.m.–1 p.m. in Hall A

Vector embeddings are a way to encode a text or image as an array of floating-point numbers, and they make it possible to perform similarity search on many kinds of content. This poster will take you on a journey through vectors with as many diagrams and graphs as we can fit, to help you understand the differences between embedding models, distance metrics, quantization schemes, and input modalities. All of the visuals are generated using open-source tools like matplotlib and pandas, and all the code is available in a repo for you to try yourself. ↖ Come on a vector voyage! ↗

Beautiful and Balanced: Using Color Theory in Data Visualization

Laura Fisher

Sunday 10 a.m.–1 p.m. in Hall A

When presenting data visually, it's important to choose color palettes which do not skew the viewer's perception of the data relationships you're illustrating. Our brains interpret color contextually: proximity to other colors changes the way we perceive a particular color, making it appear darker or lighter, more prominent or more demure. With some understanding of these color relationships, you can make palette choices which do not inadvertently introduce bias to the data being displayed.

In this poster presentation, we'll take a peek at some basic color theory based on the work of Josef Albers, look at what color weight is, and how you can use it to evaluate your palette choices. We'll go over some strategies for choosing color palettes that keep your data presentation both unbiased and visually pleasing, and list some resources for learning more.

Visual Prototyping for Woven Bands

Kathryn Born

Sunday 10 a.m.–1 p.m. in Hall A

How do you create a custom design for a woven band in an unfamiliar weaving style? Weaving relies on repeating patterns, so it can be easy to describe how a project should look, but actually picturing it is a different matter.

My goal was to design a band with diamonds, mountains, and other southwestern elements to be woven on an inkle loom. Plain weave on an inkle loom only uses repeating A-B patterns, which doesn’t allow for shapes like a diamond. I wanted to weave with a style where the pattern is built-in when you set up the loom with the warp, rather than manipulated by hand for each row, which is more error-prone. I'd heard of an A-B-A-C pattern style known as Turned Krokbragd, which could work. This weaving style is originally from Sweden, so finding existing patterns for my southwestern USA theme would be unlikely.

This Python project let me visualize my own patterns, try out different colors, and add additional design elements like a border around a central pattern. The prototyping tool creates a more realistic visual by accounting for the physical spacing of the warp threads.

Weather Data along the International Space Station Orbits

Sunday 10 a.m.–1 p.m. in Hall A

We write a Python workflow that performs several web scraping tasks to gather future positions of the International Space Station (ISS) and weather forecast data (surface temperature, surface pressure, cloud coverage, etc.) on those positions. We then perform reverse geocoding to identify the countries and oceans of the individual locations. The collected data are combined to create interactive maps where along ISS future orbits, we can pinpoint at any location, the weather conditions and the country (and associated flag) or the ocean. We also plot on the maps the major world cities along orbits. This work can be used by individuals who want to know when ISS will fly over their cities and what will be the weather.

Atom to Innovation: Machine Learning for Materials Science

Manoj Pandey

Sunday 10 a.m.–1 p.m. in Hall A

Materials science is traditionally limited by time-consuming experimental methods for predicting new material properties. Our poster demonstrates how machine learning can dramatically accelerate materials discovery through computational techniques. By using Python libraries like PyTorch, DScribe, and ASE, we will simulates and predict material properties before they're physically created.

Imagine designing materials for solar panels, electronic devices, or medical implants through a computational model. Our approach uses neural network to transform atomic structures into predictive insights, identifying patterns that human researchers might miss. We can predict critical material characteristics like mechanical strength, electrical conductivity with unprecedented speed and accuracy.

The poster will help connect computational methods with practical scientific research, providing an interactive demonstration of how machine learning can enhance materials design.

Flet: cross-platform apps in Python with Flutter UI

Feodor Fitsner

Sunday 10 a.m.–1 p.m. in Hall A

Flet is an open-source application framework for Python with Flutter UI to build adaptive cross-platform applications from a single codebase.

The poster will introduce Flet framework to the visitors.

The poster will include: - A visualization of how the same Python program can be packaged and deployed to iOS, Android, macOS, Windows, Linux and web. - Project statistics such as GitHub stars, Discord users, # of contributors, # of PyPI downloads. - Visualization of extensibility model and connection to Flutter/Dart ecosystem. - The cloud of available binary packages (wheels) for iOS and Android (NumPy, Pydantic, Pandas and others).

Building low-carbon websites with Wagtail & Django

Vince Salvino

Sunday 10 a.m.–1 p.m. in Hall A

How much do you know about websites’ carbon footprint? The web’s environmental sustainability is an important consideration for some, and an afterthought for others. But everyone needs a better understanding of exactly what moves the needle – and simple things that can reduce one’s footprint.

Wagtail is an open source content management system built with Python and Django. In recent years, Wagtail has made huge investments in lowering websites’ carbon footprint. We will show some of the background research, major sustainability improvements in Wagtail, and how to leverage these capabilities in Wagtail to reduce your own website's carbon footprint, including: - Automated checks that are available to identify improvements - Images: beyond file size, what moves the needle - Tools to measure the energy consumption of Python code specifically

Painting with Python (poster)

Caleb Madrigal

Sunday 10 a.m.–1 p.m. in Hall A

Caleb is a mathy lenticular hologram artist, here to share the basics of doing art with Python. Each piece is a self-contained Python script that uses as few dependencies as possible (mostly just Python Image Library). No GPU acceleration. No AI. Just good old fashion vanilla Python. I'll be open-sourcing (for the first time) a few of the pieces of art I've done. For a preview of what mathy lenticular holograms look like, check out https://gods.art.

Improving in Chess using Python

Adarsh Divakaran

Sunday 10 a.m.–1 p.m. in Hall A

Chess, one of the most iconic board games in history, has a global community nearing a billion players. With the help of powerful chess engines, computers have democratized access to advanced chess analysis tools - tools that even top grandmasters use to analyze and improve their games.

The good news is that these top chess engines are free, and as Python programmers, we can utilize them to enhance our own chess skills. Chess positions and moves are denoted using standard notations like PGN and FEN, making it easier to analyze and visualize them programmatically.

This poster will explore how to use Python to: - Parse chess games using the python-chess library.
- Analyze games with the help of chess engines to identify mistakes and learn from them.
- Visualize games and positions using the chess-board library.