PyCon 2019 in Cleveland, Ohio

Posters

Amazing world of Animation - powered by Python

Sreenivas Alapati
Sunday 10 a.m.–1 p.m. in Expo Hall

Did you know that your favourite superheroes in Avengers, the cute characters of Kung Fu Panda, and the epic series Game of Thrones are brought to the screen with the help of Python? If you are into gaming, you can thank Python for the characters you have played and the worlds you have explored. Even next-generation technologies like AR and VR use Python to deliver their magic in new formats. It would not be an overstatement to say that "**Python is the backbone of the animation industry**". We will go behind the scenes to see how our favourite programming language is used in the animation industry, why it plays such a huge role, and what kinds of applications are built with it.

A Proposed Method to Generate Images for Abstract Technology Terms

Itay Livni
Sunday 10 a.m.–1 p.m. in Expo Hall

Pictures are a key component in learning. In education, when available, images are widely used to teach and discuss concepts. For example, images that combine the sun, a plant, and water to describe the process of photosynthesis are widely available. However, as learning progresses and concepts become more abstract and complex (e.g., quantum computing), images describing these concepts are less concrete and not necessarily universally understood. One reason for the lack of educationally purposed images is that drawing abstract concepts requires both a deep understanding of the concept and visual arts skills. We propose to build an automated algorithm that creates meaningful images for difficult concepts in technology. Advancements in natural language processing (NLP) and text to image generation algorithms (TIG) could allow us to create images for educational activities, thus allowing students to build upon complicated concepts at an accelerated pace. This experimental approach focuses on the quality and mix of data fed into the TIGs. One advantage of this method is that the training data for the TIG is specific and clean, meaning that the machine already has a strong body of knowledge with which to produce relevant results. The proposed algorithm, written in Python, has two principal components: a data feed handler comprised of a [Learning Map][1], and various [text to image generation algorithms (TIG)][2]. The poster will demonstrate the outcome of generating an image for a term (e.g., quantum) using this approach, and the tools we used to build the model.
[1]: https://patents.google.com/patent/US20180102062A1/en?assignee=itay+livni&oq=itay+livni
[2]: https://github.com/taoxugit/AttnGAN

Association Rules Mining Using Python Generators and Pandas to Handle Large Datasets

Srivignessh Pacham Sri Srinivasan
Sunday 10 a.m.–1 p.m. in Expo Hall

Association rule mining with the Apriori algorithm is a standard approach for deriving association rules. Basic implementations of the algorithm with pandas, which split the data into multiple subsets, are not suitable for handling large datasets because they use excessive RAM, so the algorithm fails to execute. However, a Python generator makes it possible to process one value at a time, discard it when finished, and move on to the next value. This feature makes generators perfect for creating item pairs, counting their frequency of co-occurrence and determining the association rules. A generator is a special type of function that produces a sequence of items lazily, unlike a regular function, which returns all of its values at once (e.g., returning all the elements of a list). A generator yields one value at a time. To get the next value, we must ask for it, either explicitly by calling the built-in "next" function on the generator, or implicitly via a for loop. This is a great property of generators because it means that we don't have to store all of the values in memory at once. This efficient implementation is tested on a market basket analysis dataset for various minimum support thresholds.
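The generator idea described above can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the function names `iter_item_pairs` and `frequent_pairs` and the sample baskets are assumptions made for the example, and only item pairs (not full Apriori candidate generation) are covered.

```python
from collections import Counter
from itertools import combinations


def iter_item_pairs(transactions):
    """Yield unordered item pairs one at a time, transaction by transaction."""
    for items in transactions:
        # Sort and de-duplicate so ("milk", "bread") and ("bread", "milk")
        # are counted as the same pair.
        for pair in combinations(sorted(set(items)), 2):
            yield pair


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for pairs seen in at least min_support of transactions."""
    total = len(transactions)
    # Counter consumes the generator lazily; the full list of pairs is never built.
    counts = Counter(iter_item_pairs(transactions))
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical sample data for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]
print(frequent_pairs(baskets, min_support=0.5))
```

Only the pair counts are held in memory; each pair itself is produced, counted and discarded, which is the property the abstract relies on.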

Automatic Detection of Pseudo-tested Methods using Python and Pytest

Nicholas Tocci, Gregory M. Kapfhammer
Sunday 10 a.m.–1 p.m. in Expo Hall

Test suites are a critical part of the development process. They help developers ensure that the behaviors they have coded work the way they intended. Test suites can help guide development towards a working system if implemented in the early stages of development, for example through test-driven development. But if test suites measure how well a system is working, how can you tell whether the test suite itself is working? One solution is to look at coverage to see how much of the system is executed during the tests. However, one major problem can prevent this type of benchmark from accurately reflecting the fault-detection effectiveness of the test suite: the pseudo-tested method. A pseudo-tested method is a method that is called in a test case that will never fail. It is "pseudo-tested" because the developer believes the method is being tested, since it appears in a test case, but the test does not actually detect errors and passes every time it is run. Pseudo-tested methods are very difficult to detect precisely because they always pass, which raises the question, "Why investigate something that is not broken?" The only real way to find one is manually, which happens in one of two ways: a developer goes looking for a problem in the test suite, or a developer knows they made an error but notices that the test never fails. There is a way to detect pseudo-tested methods automatically for the Java programming language, but none for Python. Python also has the potential to contain more pseudo-tested methods because of its dynamically typed nature. For this reason, pseudo-tested methods could be present in many test suites, including ones with a high coverage percentage. This led to the creation of Function-Fiasco.
Function-Fiasco is an automatic detection system that finds pseudo-tested methods in Python-based systems that are tested with the Pytest framework. It finds the methods in the system under observation and checks each one's return type. It then substitutes a randomized output that matches the return type to see whether the test cases that use the method still pass. If the tests still pass, the method is considered pseudo-tested. Pseudo-testing is an issue that can be found in a vast number of systems, both large and small. Function-Fiasco was created in Python, for Python, to help mitigate this risk.
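The detection idea can be illustrated with a small standard-library sketch. This is not Function-Fiasco's code or API: `random_value`, `is_pseudo_tested`, the `calc` namespace and both tests are hypothetical names invented here, and the sketch assumes the function under test carries a return type annotation.

```python
import random
import types


def random_value(return_type):
    """Produce an arbitrary value for a few supported return types."""
    if return_type is bool:
        return random.choice([True, False])
    if return_type is int:
        return random.randint(-10**6, 10**6)
    if return_type is float:
        return random.uniform(-1e6, 1e6)
    if return_type is str:
        return "".join(random.choice("abcdefghij") for _ in range(8))
    raise TypeError(f"unsupported return type: {return_type!r}")


def is_pseudo_tested(namespace, func_name, test, trials=20):
    """Return True if `test` still passes when the function returns random values."""
    original = getattr(namespace, func_name)
    return_type = original.__annotations__["return"]

    def stub(*args, **kwargs):
        # Ignore the real logic and return a random value of the same type.
        return random_value(return_type)

    setattr(namespace, func_name, stub)
    try:
        for _ in range(trials):
            try:
                test()
            except AssertionError:
                return False  # the test noticed the wrong output
        return True
    finally:
        setattr(namespace, func_name, original)  # always restore the real function


def add(a: int, b: int) -> int:
    return a + b


# A stand-in for a module under test (assumption for this example).
calc = types.SimpleNamespace(add=add)


def weak_test():
    # Only checks the type, so any integer passes.
    assert isinstance(calc.add(2, 3), int)


def strong_test():
    assert calc.add(2, 3) == 5


print(is_pseudo_tested(calc, "add", weak_test))
print(is_pseudo_tested(calc, "add", strong_test))
```

Repeating the substitution several times reduces the chance that a random value happens to match the expected result.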

Chirps: A Twitter Bot Framework written in Python

Saurabh Chaturvedi, Parth Shandilya
Sunday 10 a.m.–1 p.m. in Expo Hall

Twitter bots are a powerful way to up your social media game and to extract information from the microblogging network. By leveraging Twitter’s versatile APIs, a bot can do a lot of things: tweet, retweet, “fav-tweet”, follow, reply automatically and much more! When done in the right way, the combination of these actions can be of great utility. Even though some bots abuse their power and give other users a negative experience, research shows that people view Twitter bots as a credible source of information. For example, a bot can keep your followers engaged with content even when you’re not online. Some bots even provide critical and helpful information (e.g., @EarthquakesSF). It is estimated that bots account for about 24% of all tweets on Twitter. While there are lots of Twitter bots out there executing different tasks, from tracking politically motivated Wikipedia edits to giving AI-driven answers, most bots are based on certain rudimentary actions. For example, every bot (obviously) tweets. Many also reply to celebrities. The majority of them mine useful information from the rest of the web and tweet it out. Thus, there is a need for a tool that allows us to quickly build such bots (instead of coding them from scratch) and deploy them with ease. This was our motivation behind developing Chirps, a flexible and open source Twitter bot building framework written in Python 3. In this poster presentation, we will demonstrate how you can set up your personalized Twitter bot with Chirps. We will first go through the process of obtaining the Twitter API credentials, then feeding information to Chirps to customize your bot’s behavior, and even some basic monetization (yes, a Twitter bot can even buy you a cup of coffee)! After that, we’ll dive into some interesting implementation details of our framework.
Later we’ll talk about how we combined multi-threading, Python’s generator functions, web scraping, some SQL and even a bit of NLP to build this efficient, robust and (of course) Pythonic Twitter bot framework, aka Chirps. We’ll also give ideas about possible extensions to Chirps so that you can come up with your own (and perhaps better) bot framework and deploy your own army of amazing bots 🙂. We’ll also mention certain important responsibilities that come with owning Twitter bots (also referred to as “botiquette”). Finally, we’ll discuss some ideas for using bots and describe their prospective implementation with Python and Chirps, as well as some popular bot deployment methodologies. After all, when it comes to building Twitter bots, the sky is the limit!

CPython memory structure

Janis Lesinskis
Sunday 10 a.m.–1 p.m. in Expo Hall

This poster gives an overview of how CPython memory management works. We will show you where objects are stored and what those objects look like in memory. This will help explain the underpinnings of equality and identity checks in CPython, and how Python objects are stored in RAM.
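As a small illustration of the difference between equality and identity mentioned above, the following sketch compares two equal lists. The helper `describe` is a name made up for this example; the fact that `id()` returns the object's memory address is a CPython implementation detail.

```python
import sys

a = [1, 2, 3]
b = [1, 2, 3]   # a second list object with the same contents
c = a           # a second name bound to the same object as `a`


def describe(obj):
    """Summarise where an object lives and how big its own header/storage is."""
    return {
        "address": hex(id(obj)),          # memory address in CPython
        "type": type(obj).__name__,
        "size_bytes": sys.getsizeof(obj),  # shallow size, excludes referenced items
    }


print(a == b, a is b)   # equality compares contents, identity compares objects
print(a == c, a is c)
print(describe(a))
print(describe(b))
```

Here `==` asks the objects to compare their values, while `is` only checks whether both names refer to the same object in memory.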

Designing Robust Machine Learning Algorithm to Detect Rare Events Using PyTorch

Haque Ishfaq, Atia Amin, Hassan Saad Ifti, Hassan Sami Adnan, Samara Sharmeen
Sunday 10 a.m.–1 p.m. in Expo Hall

Imagine you are an oncologist specialized in breast cancer. After hearing how recent advances in artificial neural networks and machine learning are revolutionizing medical diagnostics, you decide to try out a machine learning system that can tell which type of breast cancer a patient has just by looking at her mammogram images. When you encounter a new patient, you excitedly use the new machine learning system to see which type of breast cancer she has. The algorithm says it's a benign case. Out of curiosity you take a second look at the mammogram image yourself, and your years of medical training instantly tell you that the algorithm must be wrong. It cannot be benign, but at the same time you are not sure which type of cancer it is. After some digging, you realize that the machine learning algorithm was trained using only the 10 most common types of breast cancer, and your patient turns out to have a very rare type that you seldom encounter. Will you ever trust a machine learning system to diagnose cancer again? For high-stakes applications like this, the usual classification-based machine learning algorithms are not enough. Instead, we need a method that can learn a high-quality, low-dimensional representation of the data in which we can accurately cluster different classes, including classes for which we do not have any training data. This way, the rare type of breast cancer mentioned earlier would form its own cluster in the learned representation space, and we would automatically be able to differentiate it from the other, common types of cancer. To achieve this, in this project we develop a generative model that learns a latent representation space in which points from the same class are near each other and points from different classes are far apart. We develop a novel loss function for training Variational Autoencoder (VAE) based generative models.
The novel loss function exploits ideas from the metric learning literature, where instead of maximizing classification accuracy, neural networks are trained to map images from the same class to the same regions of the learned latent representation space. Using our new VAE model, we can learn a low-dimensional latent representation of complex data that captures intra-class variance and inter-class similarities. The ability to learn such a high-quality, low-dimensional representation for any data would reduce any complex classification problem to a simple clustering problem. All our experiments in this project were carried out using Python and its libraries. In particular, we make extensive use of PyTorch, a Python-based deep learning framework. We believe that our approach can benefit diverse communities attending PyCon who are looking for ways to integrate machine learning algorithms to solve tasks similar to those our approach is designed to tackle. In our poster, we will showcase the relevant Python tools one could use to reproduce our experiments and tackle similar tasks in their own domains.
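To give a feel for the metric learning idea, here is a sketch of a standard pairwise contrastive loss in plain Python. It is not the authors' novel loss function and does not involve a VAE or PyTorch; the function names, the margin value and the sample points are assumptions chosen only to show how "same class near, different class far" can be turned into a number to minimize.

```python
import math


def contrastive_loss(z_a, z_b, same_class, margin=1.0):
    """Pairwise contrastive loss on two latent points.

    Same-class pairs are penalised by their squared distance; different-class
    pairs are penalised only when they are closer than `margin`.
    """
    distance = math.dist(z_a, z_b)
    if same_class:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def batch_loss(latents, labels, margin=1.0):
    """Average the pairwise loss over every pair in a small batch."""
    total, pairs = 0.0, 0
    for i in range(len(latents)):
        for j in range(i + 1, len(latents)):
            total += contrastive_loss(
                latents[i], latents[j], labels[i] == labels[j], margin
            )
            pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical 2-D latent points: classes already well separated ...
print(batch_loss([(0.0, 0.0), (0.0, 0.0), (3.0, 4.0)], [0, 0, 1]))
# ... versus the same points with labels that mix the clusters.
print(batch_loss([(0.0, 0.0), (3.0, 4.0), (0.0, 0.0)], [0, 0, 1]))
```

A lower value means the latent points are arranged more like clean per-class clusters, which is the property that would let an unseen class form its own cluster.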

Discovering the nuclear reactor’s stability with SLEPc and Python

Javier Jorge
Sunday 10 a.m.–1 p.m. in Expo Hall

Engineering applications such as physics simulations (for example, fluid simulation) and information retrieval (for example, document retrieval) rely on modelling and solving large-scale sparse eigenvalue problems. SLEPc (http://slepc.upv.es) is a software library for solving this kind of algebraic problem on distributed computers. We can use it from Python through slepc4py to solve these computationally expensive problems, using parallelization with different schemes. In this poster we introduce slepc4py (https://bitbucket.org/slepc/slepc4py), a Python wrapper for SLEPc, along with the problem of determining a nuclear reactor’s stability, which is modeled as obtaining the eigenvalues of certain large, sparse matrices. We introduce the techniques implemented in SLEPc for solving problems with these characteristics.
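To make the notion of a sparse eigenvalue problem concrete, here is a toy sketch using plain power iteration on a matrix stored as a dictionary of non-zero entries. This is a much simpler technique swapped in for illustration; it is not SLEPc, does not use the slepc4py API, and runs on a single process. The function names and the 3x3 test matrix are assumptions.

```python
import math


def sparse_matvec(entries, vector):
    """Multiply a sparse matrix, stored as {(row, col): value}, by a dense vector."""
    result = [0.0] * len(vector)
    for (row, col), value in entries.items():
        result[row] += value * vector[col]
    return result


def dominant_eigenvalue(entries, size, iterations=200):
    """Estimate the largest-magnitude eigenvalue of a symmetric matrix by power iteration."""
    vector = [1.0 / math.sqrt(size)] * size  # arbitrary normalised starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        product = sparse_matvec(entries, vector)
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0
        vector = [x / norm for x in product]
        # Rayleigh quotient v.(A v) for the unit vector v.
        eigenvalue = sum(
            v * w for v, w in zip(vector, sparse_matvec(entries, vector))
        )
    return eigenvalue


# A small tridiagonal matrix; only the non-zero entries are stored.
tridiagonal = {
    (0, 0): 2.0, (0, 1): -1.0,
    (1, 0): -1.0, (1, 1): 2.0, (1, 2): -1.0,
    (2, 1): -1.0, (2, 2): 2.0,
}
print(dominant_eigenvalue(tridiagonal, size=3))
```

Storing only non-zero entries keeps each matrix-vector product proportional to the number of non-zeros, which is what makes very large sparse problems tractable at all.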

Don't Hear Music! See It!

Maha Mdini
Sunday 10 a.m.–1 p.m. in Expo Hall

The songs we hear today are very complex and rich: multiple instruments, sound effects and voices. As we hear a song, we may enjoy the overall sound; however, we are not aware of every audio component (words, tones, etc.) composing it. But what if we were able to identify the single instrument or artificial effect that made us love the song? What if we were able to find that particular effect or frequency that bothers our ears? Or perhaps create a mathematical model of musical taste? In this poster, we aim to analyze songs using Python (machine learning, signal processing and audio libraries). The idea is to use machine learning techniques and signal processing tools to get a better view of the songs we hear. We will show how to analyze music in both the time and frequency domains. We will visualize music as we do with any other type of data, and decompose it into its building components. Then we will apply machine learning to find similarities between songs, model musical tastes and study the evolution of music characteristics over time (for example, popular frequencies in each decade). There will also be a live demo during the session.
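Moving from the time domain to the frequency domain can be sketched with a naive discrete Fourier transform written with the standard library only. This is an illustration of the concept, not the presenter's pipeline; a real analysis would use an FFT from a numerical library. The function names, the sample rate and the synthetic two-tone signal are assumptions.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 up to the Nyquist bin (naive O(n^2) DFT)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(samples, sample_rate):
    """Frequency in Hz of the strongest non-DC bin."""
    magnitudes = dft_magnitudes(samples)
    peak_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return peak_bin * sample_rate / len(samples)


# Synthetic signal: a 100 Hz tone plus a quieter 200 Hz overtone.
SAMPLE_RATE = 800
signal = [
    math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
    + 0.3 * math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    for t in range(200)
]
print(dominant_frequency(signal, SAMPLE_RATE))
```

The list returned by `dft_magnitudes` is exactly what a spectrum plot would display, with one bar per frequency bin.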

Escape from auto-manual testing with Hypothesis!

Zac Hatfield-Dodds
Sunday 10 a.m.–1 p.m. in Expo Hall

If we knew all of the bugs we needed to write tests for, wouldn't we just... not write the bugs? So how can testing find bugs that nobody would think of? The answer is to have a computer write your tests for you! You declare what kind of input should work - from 'an integer' to 'matching this regex' to 'this Django model' and write a test which should always pass... then [Hypothesis](https://hypothesis.readthedocs.io) searches for the smallest inputs that cause an error. If you’ve ever written tests that didn't find all your bugs, find me at the poster. It covers the theory of property-based testing, a worked example, and a whirlwind tour of the library: how to use, define, compose, and infer strategies for input; properties and testing tactics for your code; and how to debug your tests if everything seems to go wrong.
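The core loop of property-based testing (generate inputs, check a property, shrink a failing input) can be sketched with the standard library alone. This is a hand-rolled illustration and deliberately does not use or imitate the Hypothesis API, which is far more capable; every name below, including the intentionally broken `buggy_max`, is an assumption made for the example.

```python
import random


def buggy_max(xs):
    """Deliberately wrong: only looks at the first two items."""
    return max(xs[:2], default=None)


def prop_matches_builtin(xs):
    """The property under test: buggy_max should agree with max for every list."""
    return buggy_max(xs) == max(xs, default=None)


def random_int_list(rng, max_len=8):
    """Input description: a list of up to max_len small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]


def simpler_candidates(xs):
    """Yield slightly simpler lists: one item removed, or one item moved toward 0."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        if x != 0:
            yield xs[:i] + [int(x / 2)] + xs[i + 1:]


def find_counterexample(prop, trials=500, seed=0):
    """Search random inputs for a failure, then greedily shrink it."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = random_int_list(rng)
        if prop(xs):
            continue
        shrunk = True
        while shrunk:
            shrunk = False
            for candidate in simpler_candidates(xs):
                if not prop(candidate):
                    xs, shrunk = candidate, True
                    break
        return xs
    return None


print(find_counterexample(prop_matches_builtin))
```

The shrinking step is what turns a noisy random failure into a small input that is easy to read in a bug report.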

Everything End-user Computing with Python and JupyterHub

Lucas Durand
Sunday 10 a.m.–1 p.m. in Expo Hall

Providing an end-user computing environment for on-demand Data Science capabilities in a heavily regulated industry comes with inherent challenges. Not only are we trying to appeal to industry professionals from different backgrounds, from software developers to ML & Analytics experts, we also want to appeal to first-time users who might not have the permissions or know-how to set up a local Python installation. All of them are important members of the growing Python community with different goals and needs. We think the answer to this is to **give everyone Python**, no questions asked. Regulatory concerns are handled with a custom implementation of JupyterHub, which we also extend to form the backbone of a complete solution delivery pipeline. **Highlights include:** custom authenticators, integrating with logging/monitoring stacks, police bots, collaboration tools, and a real pipeline to get from ideation to production.

Exploring Scientific Databases with Python

Andrey Smelter
Sunday 10 a.m.–1 p.m. in Expo Hall

Some existing scientific databases provide scholarly data deposited in a specialized file format. There are various reasons for this: for example, the database and file format were developed before modern open data serialization formats and languages existed, or poor design practices were followed (the "not invented here" principle). As a result, end users cannot easily access the scientific information or make full use of the valuable data. Data reusability for downstream analysis and knowledge integration by the scientific community is therefore hindered. The poster will discuss:
- The issues of scientific data reusability and reproducibility.
- Examples of scientific databases that use a specialized file format for data distribution.
- Examples of open source Python libraries designed to work with databases that use a specialized file format to distribute scientific data.
- Examples of how this data can be converted (serialized) and potentially validated using modern open data serialization formats and Python libraries designed for schema validation.
- Examples of using Jupyter, pandas and matplotlib for data exploration, data quality assurance, and data visualization.

Framework For Lossless Data Compression Using Python

Manas Malik, Adarsh Parakh, Arshad
Sunday 10 a.m.–1 p.m. in Expo Hall

A lot has been done in the field of data compression, yet we don’t have a proper application for compressing everyday files. There are suitable and very specific tools online that let you compress and save files, but for the content we stream, be it a Netflix video or a gaming theater play, the amount of data consumed is beyond what a user can keep track of. Back-end developers know all about this; as developers we have acknowledged the problem but have not yet made compression easy to provide. Since users will never be concerned with compression themselves, developers can take the initiative to build compression into an application up front. We have decided to create a framework that provides all the functionality a developer needs to add this feature, and Python makes this possible. We are big fans of Python, mostly because it has a vibrant developer community that has helped produce an unparalleled collection of libraries for adding features to applications quickly. The Python zlib library provides a Python interface to the zlib C library, which is a higher-level abstraction for the DEFLATE lossless compression algorithm. There is a lot to handle, including the audio, video and subtitles of a file. We also make use of the fabulous ffmpy library, a Python library that provides access to the ffmpeg command-line utility. ffmpeg can perform several different kinds of transformations on video files, including video compression, which is its most commonly requested feature. Frame rate and audio synchronization are a few other parameters to look at closely. This is an ongoing project and a few implementation aspects remain open; data compression remains a concern when it comes to design. Together with the Python community, we intend to solve this issue.
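The lossless part of this can be sketched with Python's standard `zlib` module. This is a minimal illustration, not the authors' framework: the helper `compress_stream` and the sample subtitle data are assumptions, and the ffmpy/ffmpeg video side is not shown.

```python
import zlib


def compress_stream(chunks, level=6):
    """Compress an iterable of byte chunks without holding the whole input in memory."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:
            yield out
    yield compressor.flush()  # emit whatever is still buffered


# Hypothetical, highly repetitive subtitle data for illustration.
subtitle_lines = [b"00:00:01 --> 00:00:02 Hello, PyCon!\n"] * 1000
original = b"".join(subtitle_lines)

packed = b"".join(compress_stream(subtitle_lines))
restored = zlib.decompress(packed)

print(len(original), len(packed))
print(restored == original)  # lossless: the round trip gives back the same bytes
```

Feeding the compressor chunk by chunk is the usual pattern for files too large to load at once; for small payloads a single `zlib.compress(data)` call does the same job.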

Griode: the musical instrument for not-yet-musicians

Jérôme Petazzoni
Sunday 10 a.m.–1 p.m. in Expo Hall

In 2017, I started playing music with a [LaunchPad](https://novationmusic.com/launch/launchpad). A LaunchPad is a MIDI controller, which for all intents and purposes is basically an 8x8 grid of buttons which can also light up in different colors, thanks to individual LEDs. Since it doesn't have any sound-producing ability on its own, I would connect it to my laptop, and run something like Ableton or Bitwig (two fine, but proprietary, pieces of software). This would give me a portable, versatile musical instrument. In 2018, I started to think that it would be nice to replace the laptop with something smaller (like a Raspberry Pi). It would also be cheaper, and more suitable for use by small children. There was just one little problem: the software I was using could not run on a Raspberry Pi, so I wrote my own. The result is an Open Source project called "Griode." [It's on GitHub](https://github.com/jpetazzo/griode), and it serves at least three kinds of users: 1. People like me, who want to play music on the go with something rugged enough to withstand frequent travel, and small enough to fit in a regular backpack. 2. People like my nephews (5 and 8 years old at the time), who can learn to play tunes on it using a [Simon](https://en.wikipedia.org/wiki/Simon_(game%29)-style interface. 3. People who don't know chords and scales, but who want to improvise along with other musicians. After you punch in the note and scale that you want to use, Griode lights up the buttons corresponding to the notes in that scale, making it easy for you to play them even if you don't know anything about music theory. Under the hood, Griode makes use of many existing Open Source components like mido (for MIDI interfacing) and FluidSynth (to generate sounds). It is built around an event loop, and allows multiple controllers to be connected at the same time. The UX is provided by "gridgets" (widgets on a grid!) 
arranged by something that looks like the distant cousin of a compositing window manager. Come see and hear and play with Griode; I will have at least two controllers for folks willing to try it out. If you're still wondering what this could look (and sound) like, I made a [handful of short videos](https://www.youtube.com/playlist?list=PLBAFXs0YjviK9PzKnr3MDsRU6YAJgeH1K) (2 minutes each) to showcase its main features.
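The "light up the notes of the chosen scale" behaviour can be sketched as a small calculation. This is not Griode's code: the grid layout (one semitone per column, five semitones per row, lowest button at MIDI note 36) and all names below are assumptions made for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets from the root note for a few common scales.
SCALES = {
    "major": [0, 2, 4, 5, 7, 9, 11],
    "minor": [0, 2, 3, 5, 7, 8, 10],
    "minor_pentatonic": [0, 3, 5, 7, 10],
}


def scale_pitch_classes(root, scale):
    """Return the set of pitch classes (0 = C ... 11 = B) in the given scale."""
    root_index = NOTE_NAMES.index(root)
    return {(root_index + step) % 12 for step in SCALES[scale]}


def lit_buttons(root, scale, lowest_note=36, rows=8, cols=8, row_offset=5):
    """Return the (row, col) positions whose note belongs to the scale.

    Assumed layout: moving one column right raises the pitch by a semitone,
    moving one row up raises it by `row_offset` semitones.
    """
    in_scale = scale_pitch_classes(root, scale)
    return {
        (row, col)
        for row in range(rows)
        for col in range(cols)
        if (lowest_note + row * row_offset + col) % 12 in in_scale
    }


buttons = lit_buttons("C", "major")
for row in reversed(range(8)):  # print the top row first, like the physical grid
    print("".join("o" if (row, col) in buttons else "." for col in range(8)))
```

Because MIDI note numbers wrap every 12 semitones, a single modulo operation is enough to decide whether a button should be lit.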

Implementing a Chatbot for Positive Reinforcement in Young Learners

Francisca Onaolapo Oladipo
Sunday 10 a.m.–1 p.m. in Expo Hall

As a result of the Boko Haram insurgency, several families have been displaced and many children no longer have access to formal classroom-based education. They are therefore exposed to predators who influence them negatively and corrupt their innocent minds through wrong teachings. The aim of this research is to stop the use of children as suicide bombers by Boko Haram terrorists in Northern Nigeria through de-radicalization and game-based learning. Children love games (computer games, mobile games, and so on) because they are very responsive, so games can be used to teach good behavior by stimulating learners’ involvement. So why not build games using Python? In this poster, the author will discuss the development of an interactive chatbot trained on corpora of three local languages (Hausa, Fulfulde, and Kanuri) and English (with translations both ways) to stimulate conversations and deliver tailored content to users, thereby aiding in the detection of radicalization giveaways in learners through data analysis of their game moves and vocabulary. The presentation will show how the chatbot can estimate the degree of radicalization in an individual and tailor the content to that user's needs. The work leverages the affordances of mobile devices in Nigeria to build conversational agents that interact with children living in Internally Displaced Persons camps in North East Nigeria. The chatbot is also being used for security communications and as a natural communication framework for teaching local languages to non-native humanitarian aid workers.

Making Python libraries machine accessible

Zebulun Arendsee, Andrew Wilkey, Jennifer Chang
Sunday 10 a.m.–1 p.m. in Expo Hall

In a future with strong AI, what will be the role of the programmer? We believe programmers should work with machines by giving them the knowledge they need to reason about our programs. Natural language documentation is imperfect even for humans, since it can be ambiguous and can fall out of sync with the code base. In this poster, we will show our approach to layering an elegant, knowledge representation-based type system on Python libraries without touching the Python source. We will show how this semantic information can be used to
* formalize documentation and make it machine and human searchable
* automatically generate runtime assertions
* seamlessly integrate with other languages
* and lots more
Let humans do the fun work, describing problems and building algorithms, and let machines handle the details.

Moving Neuroscience Forward with Python

Emily Irvine
Sunday 10 a.m.–1 p.m. in Expo Hall

Neuroscientists have relied on Excel for simple analyses and MATLAB for their programming needs for decades. Other data intensive domains like astrophysics and machine learning have embraced Python but many neuroscientists are hesitant to make the switch and continue to use legacy code in MATLAB. When I started doing neuroscientific data analysis in 2012 I started with Python because of the mature scientific Python stack and its community that encourages transparency and reproducible pipelines. Since then I have worked on projects ranging from analyzing how rats behave in conditioning experiments to detailed electrical recordings of individual neurons. Python has lent itself to elegant solutions for these two extremes and many domains in between. Some parts of the analysis pipeline come up frequently in all types of neuroscientific experiments, whether it's high level behavior or low level recordings and I have assembled a core set of these utilities in a Python package called Nept (NeuroElectroPhysiological Tools). My goal is to provide a well documented and tested base to which others can contribute and help grow the community of Pythonistas in neuroscience.

Part-of-Speech Tagging Using Machine Learning

Ryan Baxley
Sunday 10 a.m.–1 p.m. in Expo Hall

One of the big problems with AI is finding a way to communicate with machines. Natural Language Understanding (NLU) is the challenge of granting reading comprehension to machines. One of the primary challenges in NLU is part-of-speech tagging. Identifying the appropriate part of speech for each word in a sentence provides requisite context for extracting useful information from text. In this project, we built a part-of-speech tagger using Keras and TensorFlow in Python, and we explain the significance of our machine learning model architecture choices along the way.

Prospects of an impulse sensoring mechanism in structural composites.

Shah Rukh Shahbaz, Dr. Omer Berk BERKALP
Sunday 10 a.m.–1 p.m. in Expo Hall

The world is moving towards demand for long-lived, reliable components for structural needs in construction, aircraft and renewable energy, a demand that researchers and industry working in this field are trying to meet. The strength of a material may be adequate at the start of its service life, but after that it needs to be inspected on a schedule, which causes production losses and limits mobility. Exposure to an adverse environment can also degrade its mechanical stability suddenly during its service life. Suppose a strong impact damages an offshore wind turbine blade, or increased pressure causes a leak in an underground water supply pipeline: how can we immediately detect the location and type of damage that occurred? Usually it only comes to light during a scheduled inspection, or once it has grown enough to reveal itself on a bigger scale. Since bigger, more reliable and more remote structures are increasingly in demand, real-time data acquisition on the remaining strength of, and damage to, structural components is a prime requirement, so that structures can inform the observer before complete failure or any major loss. In the field of composites, some multifunctional self-sensing structural components are available and are gaining demand for smart applications. Such approaches, along with the development of a suitable interface, can conveniently fulfill the needs of the future composites industry.

Python Boot Camp: Introduction of efforts to spread Python all over Japan

Takanori Suzuki, Manabu TERADA
Sunday 10 a.m.–1 p.m. in Expo Hall

[Python Boot Camp](https://www.pycon.jp/support/bootcamp.html) (#pycamp) is a Python beginner course held all over Japan. It is organized by the **PyCon JP Committee** and has been held 30 times in 28 regions since 2016. The purpose of the event is to expand the Python community throughout Japan, and as a result several local communities have been established. In this poster session I will share Python Boot Camp's achievements, how it is organized and how we devised it. I would also like to discuss how to extend this idea beyond Japan, how to expand further within Japan, and what the next steps could be.

Real-time voice-to-musical-instrument translation using Python

John Carelli
Sunday 10 a.m.–1 p.m. in Expo Hall

Is it possible to sing a melody line and have a recognizable musical instrument, such as a trumpet or a saxophone, reproduce that melody in real time? Further, could it be done expressively enough to be used in a live musical performance? Those are the goals of the project demonstrated in this poster session. The software detects the sung notes and drives a software-based player, loaded with a user-selected musical instrument, to play those pitches in real time as they are sung. It contains specially developed heuristics for pitch stabilization and onset detection to overcome eccentricities of the singing voice such as vibrato, inaccurate attack, inaccurate pitch (sharp or flat), and pitch drift. Unlike “pitch-to-midi” software, the distinguishing feature of this project is its focus on live performance rather than on generating MIDI pitch information, which affects how detected pitches are used to drive instrumentation. Python is uniquely suited to this effort with its extensive capabilities and wide variety of supporting libraries. Internally, this project uses supplemental libraries for streaming audio (pyaudio), basic pitch recognition (aubio), and MIDI (mido), threading (to manage latency in pitch detection and playback), and a number of other basic libraries. Features and capabilities: <ul> <li>A GUI that controls overall functionality. It is implemented using the Kivy application development library for Python.</li> <li>A built-in instrument player with several pre-programmed instruments.</li> <li>An interface to an external instrument player. Pitch information is sent using the MIDI protocol. </li> <li>An Arduino-based, programmable hardware controller designed specifically for the singer to use in performance. 
It uses the Python serial library to interface with an Arduino device and gives the singer control over a variety of aspects of musical expression, including volume, octave shifts, sustain, and pitch interpretation. Additionally, the programmability allows the singer to adjust controls to taste, or to control other aspects of musical expression, such as sending MIDI control messages to an external player, if one is used.</li> </ul> The app will be available both for demonstration and for visitors to try for themselves.
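To make the pitch-stabilization idea concrete, here is a minimal sketch, not the project's actual heuristics. It assumes a pitch detector (such as aubio, named in the abstract) has already produced one frequency estimate in Hz per audio frame. The `NoteStabilizer` class, its `hold` parameter, and the 440 Hz reference tuning are illustrative assumptions.

```python
import math

A4_HZ = 440.0   # reference tuning (an assumption for this sketch)
A4_MIDI = 69    # MIDI note number of A4


def hz_to_midi(freq_hz):
    """Return the fractional MIDI note number for a frequency in Hz."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


class NoteStabilizer:
    """Hypothetical smoother: change notes only after `hold` consecutive frames agree.

    Rounding to the nearest semitone absorbs small vibrato and sharp/flat
    singing; the hold counter suppresses brief glitches during the attack.
    """

    def __init__(self, hold=3):
        self.hold = hold
        self.current = None      # note currently being played
        self._candidate = None   # note waiting to be confirmed
        self._count = 0          # consecutive frames that matched the candidate

    def update(self, freq_hz):
        note = round(hz_to_midi(freq_hz))
        if note == self.current:
            # Still on the playing note: discard any pending candidate.
            self._candidate, self._count = None, 0
            return self.current
        if note == self._candidate:
            self._count += 1
        else:
            self._candidate, self._count = note, 1
        if self._count >= self.hold:
            self.current = note
            self._candidate, self._count = None, 0
        return self.current


stabilizer = NoteStabilizer(hold=3)
frames_hz = [440.0, 445.0, 436.0, 466.16, 440.0, 440.0]
played = [stabilizer.update(f) for f in frames_hz]
print(played)
```

In this run the slightly sharp and flat frames around 440 Hz all round to the same note, and the single outlier frame at 466.16 Hz is ignored because it never persists for `hold` frames. A real performance tool would also need onset detection and latency handling, which this sketch omits.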

Recipe to deliver a “project based learning” STEM experience to High School students

Meenal Pant, Aarav Pant, Anay Pant
Sunday 10 a.m.–1 p.m. in Expo Hall

As a STEM educator for middle and high school, I am constantly looking to create curriculum and content that make learning to program a fun, hands-on activity for students. Students in this age group typically lose interest or struggle when following a traditional form of learning, e.g. following a book or screencast. In summer 2018, I got an opportunity via [https://www.fremontstem.org/][1] to run a research-oriented, 16-hour (2 hours per week) “learning with Python workshop” with high school students. This is when I set out to create a project that my team would work on, gaining programming skills while doing so. At PyCon 2018, Adafruit gave Gemma M0 microcontrollers as swag to each attendee. I found myself enjoying tinkering with it using Python in a few relatively easy steps. This tinkering propelled me into thinking up and creating a simple, easy-to-follow project that the students would enjoy and also be able to complete in the given timeframe. My idea, with some help from [Les Pounder's][2] blog post, was to create blingy and shiny Christmas ornaments with Gemma M0 and Neopixel rings from Adafruit, Python and 3D printing! Creating and executing this project took a “village and two tiny helpers” to make happen! This poster talks about the process, the sighs and frustrations, the aha moments, and the final feeling of achieving success as a team working on and delivering a fun-filled, collaborative, “hands on” project. Some topics covered in the poster session will be:

- Concept and Idea
- Electronics with Python
- Hands on with wire strippers and soldering iron
- Design and printing moulds
- Student showcase
- Student feedback

[1]: https://www.fremontstem.org/
[2]: https://bigl.es/
[3]: https://www.fremontstem.org/blog/categories/asdrp

Simulation model for 3D-printed scaffold development for personalised medicine using Python

Hassan Sami Adnan, Atia Amin, Haque Ishfaq, Hassan Saad Ifti, Samara Sharmeen
Sunday 10 a.m.–1 p.m. in Expo Hall

Complementary to oral drugs, personalised 3D-printed formulations can add value for patients because the release of the drug can be controlled and optimised to the patient’s health needs. Research is being done to establish drug delivery mechanisms that provide sustained- and controlled-release profiles of active pharmaceutical ingredients using 3D-printed scaffolds and similar additive manufacturing. Designing such scaffolds and prototypes can be time-consuming and costly given the novel approach and the emerging equipment and technologies that are necessary. Furthermore, failed prototypes cannot be changed once printed, which limits researchers' ability to try out different configurations or change other design factors. The required wall-topology of the scaffolds depends on the fluidic behaviour of the inner ingredients, in liquid form, as they exit through the scaffold wall. A practical method to drive this release process is to store the inner liquid at a higher pressure than the average ambient pressure in the stomach. The rate at which the liquid is released is determined by this pressure differential and the wall-topology, i.e. the passage diameter, length, surface-roughness, and structure. This fluidic behaviour can be modelled by a one-dimensional equation in which the pressure differential is proportional to the release rate of the liquid volume. The factor of proportionality is intrinsically dependent on the wall-topology. Hence, determining this factor for a given pressure differential and release rate of the drug can directly suggest the required wall-topology for the scaffold. This model will enable researchers to start testing their designs for effectiveness and efficiency before the prototype is printed, thus decreasing waste, financial burden, and time consumption. To our knowledge, this is the first time that this approach has been taken. At PyCon, we plan to present our model in the form of a poster. 
The model is simulated using Python and the Gauss-Seidel algorithm. This model further demonstrates a novel combination of the fields of medicine and fluid dynamics, where Python, as an open source language, acts as a viable bridge. We believe this will showcase the limitless possibilities of Python and enable us to connect with similarly motivated Python enthusiasts.
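The abstract does not give the authors' equations, so the following is only a sketch of how the two named ingredients might look in Python. It assumes the one-dimensional relation takes the form `delta_p = k * flow_rate`, and it assumes, purely for illustration, that Gauss-Seidel is applied to a steady pressure profile along a single passage discretised on a uniform grid. The function names, grid, and pressure values are all hypothetical.

```python
def proportionality_factor(delta_p, flow_rate):
    """Return k in delta_p = k * flow_rate (assumed form of the 1-D relation)."""
    return delta_p / flow_rate


def gauss_seidel_pressure(p_inner, p_ambient, n_interior, tol=1e-10, max_iter=10_000):
    """Solve d2p/dx2 = 0 along a passage with fixed end pressures.

    Gauss-Seidel sweeps the interior nodes in order and reuses values already
    updated in the current sweep, which is what distinguishes it from Jacobi.
    """
    p = [p_inner] + [0.0] * n_interior + [p_ambient]
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(1, n_interior + 1):
            new_value = 0.5 * (p[i - 1] + p[i + 1])  # p[i - 1] is already updated
            max_change = max(max_change, abs(new_value - p[i]))
            p[i] = new_value
        if max_change < tol:
            break
    return p


profile = gauss_seidel_pressure(p_inner=120.0, p_ambient=100.0, n_interior=3)
k = proportionality_factor(delta_p=120.0 - 100.0, flow_rate=4.0)
print([round(value, 6) for value in profile], k)
```

For this toy problem the iteration converges to a linear pressure drop between the two ends. In a design workflow of the kind the abstract describes, the computed factor `k` would then be matched against candidate wall topologies.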

The Adventures of a Python Script!

Dema Abu Adas
Sunday 10 a.m.–1 p.m. in Expo Hall

Have you ever wondered what happens between the time you run helloWorld.py and the terminal prints out “Hello world”? I will be sharing the wonderful and interesting process of how the Python interpreter works, from the Python source code to the compilation of bytecode. Steve Yegge, a programmer and blogger who has a plethora of experience in operating systems, once noted the importance of compilers by stating, “If you don’t know how compilers work, then you don’t know how computers work”. This talk will share an overview of how CPython works, from lexing to compiling, as well as how the abstract syntax tree (AST) works. At the end you’ll understand the general concept of the AST and how creating an interpreter can additionally benefit you in ways unrelated to the actual compilation, such as linting and debugging.
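The stages mentioned above can be observed from within Python itself using standard library modules. The sketch below is an illustration rather than material from the poster: it walks a one-line program through tokenizing, parsing to an AST, compiling to a code object, and disassembling the bytecode. The exact bytecode printed by `dis` varies between CPython versions, so no output is shown.

```python
import ast
import dis
import io
import tokenize

source = 'print("Hello world")\n'

# 1. Lexing: split the source text into tokens (whitespace-only tokens dropped).
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()
]
print(tokens)

# 2. Parsing: build the abstract syntax tree (AST).
tree = ast.parse(source)
print(ast.dump(tree))

# 3. Compiling: turn the AST into a code object that holds bytecode.
code = compile(tree, filename="helloWorld.py", mode="exec")

# 4. Inspecting: disassemble the bytecode (output is version-specific).
dis.dis(code)

# 5. Executing: the interpreter loop runs the bytecode.
exec(code)
```

Linters and similar tools typically stop after step 2 and inspect the tree, which is one reason understanding the AST pays off outside of compilation.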

The Forest Fire Alarm System Using Drones and TensorFlow Python

Gamal Bohouta
Sunday 10 a.m.–1 p.m. in Expo Hall

Drones, also known as Unmanned Aerial Vehicles (UAVs) or Unmanned Aerial Systems (UASs), have grown rapidly in use over the past few years and are expected to keep growing in the near future. The drone market is expanding with new models that target different segments of the consumer and commercial market, such as wildfire alarm systems, climate change, wildlife preservation, agricultural management, insurance adjustment, real estate surveying, automated deliveries, roof inspections, disease outbreak tracking, and disaster relief. When drones are combined with a powerful computer, high-definition cameras, smart sensors and devices, cloud AI resources, and other new technologies, they become remarkably more capable. This proposal presents the Forest Fire Alarm System, which connects a drone to the IBM Watson cloud to detect smoke, fire, or other emergencies and alert the Fire Department. The Forest Fire Alarm System has two main models. The first is a drone model that applies machine learning algorithms using Python, TensorFlow, and OpenCV, along with other machine learning libraries. This model runs on a drone equipped with an AeroBoard with Nvidia Jetson TX2, a high-definition camera, a forward-looking infrared (FLIR) camera, a smoke detector sensor, a professional microphone, and an LTE/4G modem. The second is a cloud model that runs on the IBM Watson cloud and uses IBM APIs such as visual recognition, Internet of Things, data analysis, and other tools. The main aim of this proposal is to build a drone that can fly by itself using autonomous fleet management. 
When the drone is in the air and begins its mission, the drone model gathers and analyzes video, audio, and sensor data from the forest in real time using Python, TensorFlow, and OpenCV, along with other machine learning libraries. It extracts frames from the video collected by the cameras, features from the audio collected by the microphone, and readings from the sensors. The drone model then sends these frames, features, and data to the cloud model on the IBM Watson cloud server, which analyzes them, makes the final decision, and sends the result to the Fire Department dashboard.

Thinking Inside the Box: How Python Helped Us Adapt to An Existing Data Ingestion Pipeline

Eddie Schuman
Sunday 10 a.m.–1 p.m. in Expo Hall

We will cover how we used Python to adapt to a large institutional processing setup. We used Python to create the definitions, configuration files, and supplementary metadata for each of the weather radars we worked with. We used a variety of custom tools to interface with existing systems and processes that would have been infeasible to work with otherwise. We took advantage of one of Python’s greatest strengths: its flexibility. We used it to perform the bulk of our data processing with NumPy, created custom utility functions to encourage code reuse, and created custom scripts for interfacing with the institutional data processing framework we worked within.

Unsupervised Clustering of Cancer Types from Gene Expressions using Variational Autoencoders

Shruthi Ravichandran
Sunday 10 a.m.–1 p.m. in Expo Hall

<p> While many other diseases are relatively predictable and treatable, cancer is very diverse and unpredictable, making diagnosis, treatment, and control extremely difficult. Traditional methods try to treat cancer based on the organ of origin in the body, such as breast or brain cancer, but this type of classification is often inadequate. If we are able to identify cancers based on their gene expressions, there is hope of finding better medicines and treatment methods. However, gene expression data is so vast that humans cannot detect such patterns. In this project, the approach is to apply unsupervised deep learning to automatically identify cancer subtypes. In addition, we seek to organize patients based on their gene expression similarities, in order to make the recognition of similar patients easier. </p> <p>While traditional clustering algorithms use nearest neighbor methods and linear mappings, we use a recently developed technique called Variational Autoencoding (VAE) that can automatically find clinically meaningful patterns and therefore find clusters that have medical significance. Keras, a Python-based deep learning framework, offers an elegant way of defining such a VAE model, training it, and applying it. In this work, the data of 11,000 patients across 32 different cancer types was retrieved from The Cancer Genome Atlas. A VAE was used to compress 5000 dimensions into 100 clinically meaningful dimensions. Then, the data was reduced to two dimensions for visualization using t-SNE (t-distributed stochastic neighbor embedding). Finally, an interactive JavaScript scatter plot was created. We noticed that the VAE representation correctly clustered existing types, identified new subtypes, and pointed to similarities across cancer types. This interactive plot of patient data also allows the study of nearest patients, and when a classification task was created to validate the accuracy of the representation, it achieved 98% accuracy. 
The hope is that this tool will allow doctors to quickly identify specific subtypes of cancer found using gene expression and allow for further study into treatments provided to other patients who had similar gene expressions. </p>
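For readers unfamiliar with VAEs, the two operations that distinguish them from a plain autoencoder can be sketched in a few lines of standard-library Python. This is a generic illustration, not the project's Keras model: the function names are hypothetical, and only the latent size of 100 is taken from the abstract. The encoder is assumed to output a mean and a log-variance per latent dimension.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var)
    )


latent_dim = 100  # latent size reported in the abstract
mu = [0.0] * latent_dim
log_var = [0.0] * latent_dim

z = sample_latent(mu, log_var, rng=random.Random(0))
print(len(z), kl_to_standard_normal(mu, log_var))
```

During training, this KL term is added to a reconstruction loss. It pushes the 100-dimensional codes toward a smooth, well-organised latent space, which is the kind of representation a later visualization step such as t-SNE can work with.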

Uplink: A Declarative API Client Builder for Python

Raj Kumar
Sunday 10 a.m.–1 p.m. in Expo Hall

### Summary

[Uplink](https://github.com/prkumar/uplink) is an open source framework for building useful, structured API clients using your HTTP library of choice.

### Yet Another HTTP Client?

In the world of Python web development, making *ad hoc* HTTP requests is simple. Libraries like [Requests](https://github.com/requests/requests) and [aiohttp](https://aiohttp.readthedocs.io/en/stable/) make doing this sort of work super easy. However, writing a *structured* client for an HTTP API served by your organization or an upstream service demands a little more thought. In the simplest form, there's the challenge of minimizing duplicate code for things like URL templating and setting default request headers. Adding more advanced functionality like deserialization and authentication into the mix makes the problem that much more challenging.

### A Framework for Building API Libraries

This is where Uplink comes in! Sitting on top of libraries like Requests and aiohttp, Uplink provides an abstraction that minimizes a lot of the overhead:

- Define an API client as a Python class.
- Each method is a separate request template, and the method's arguments can parametrize request attributes, such as headers, query parameters, and the request body.
- Use decorators and function annotations to declare request attributes, response expectations, error handling, et cetera.
- Deserialization can be as simple as adding a return annotation.
- URL templating and authentication are built-in.
- The same client definition can work for blocking and non-blocking I/O, depending on which HTTP library you use at runtime.
- And, there's so much more!

The goal of this poster is to introduce Uplink to the viewer and provide some concrete examples to illustrate the library's benefits.
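To illustrate the general idea of a declarative client, where a method signature acts as a request template, here is a small standard-library sketch. It is not Uplink's API or implementation: the `get` decorator, the `GitHubClient` class, and the dictionary it returns are hypothetical stand-ins, and no HTTP request is actually sent.

```python
import functools
import inspect
from urllib.parse import urlencode


def get(path_template):
    """Hypothetical decorator: turn a method signature into a GET request template."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def build_request(self, *args, **kwargs):
            bound = signature.bind(self, *args, **kwargs)
            bound.apply_defaults()
            params = dict(bound.arguments)
            params.pop("self")  # assumes the first parameter is named `self`

            # Arguments named in the template fill the URL path;
            # the remaining non-None arguments become query parameters.
            path_names = {n for n in params if "{" + n + "}" in path_template}
            path = path_template.format(**{n: params[n] for n in path_names})
            query = {
                k: v for k, v in params.items()
                if k not in path_names and v is not None
            }

            url = self.base_url.rstrip("/") + "/" + path
            if query:
                url += "?" + urlencode(query)
            return {"method": "GET", "url": url, "headers": dict(self.default_headers)}

        return build_request

    return decorator


class GitHubClient:
    """Each decorated method describes one request; the body stays empty."""

    base_url = "https://api.github.com/"
    default_headers = {"Accept": "application/json"}

    @get("users/{user}/repos")
    def list_repos(self, user, sort=None):
        """List a user's public repositories."""


request = GitHubClient().list_repos("octocat", sort="created")
print(request["url"])
```

The client class contains no request-building code of its own. URL templating and default headers live in one place, which is the duplication problem the abstract describes. A real framework would go on to send the request through an HTTP library and deserialize the response.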

Using Pylint to Write Clean and Maintainable Python Code

Shiva Bhusal
Sunday 10 a.m.–1 p.m. in Expo Hall

Python code is often stereotyped as becoming unmaintainable as it grows. To keep code clean and maintainable, it is a best practice to follow a standard style guide like PEP 8. However, it is difficult to maintain the coding conventions proposed by such style guides in a large code base without the use of code analysis tools. Pylint (developed by Sylvain Thénault et al.) is a static code analysis tool that helps enforce proper coding conventions in Python by detecting code smells and violations of proposed standards. In this poster, I will present how Pylint can be used to ensure clean and maintainable Python code.
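As a small illustration of the kind of feedback involved (this example is not taken from the poster), the sketch below shows a cleaned-up function next to a commented-out earlier version. The message names mentioned are ones Pylint uses in recent releases, and exact names and defaults can differ between versions and configurations. The file name `shapes.py` is hypothetical.

```python
"""Area helpers for simple shapes."""

# A first draft like the one below would typically draw Pylint messages such
# as unused-import (the `os` import), invalid-name (the non-snake_case
# function and argument names), and a missing-docstring warning:
#
#     import os
#     def Area(W, H): return W * H
#
# Running `pylint shapes.py` on the version that follows should report far
# fewer issues, because names follow PEP 8 and docstrings are present.


def rectangle_area(width, height):
    """Return the area of a rectangle with the given width and height."""
    return width * height


print(rectangle_area(3, 4))
```

Pylint analyses the source without running it, so checks like these can be applied to a whole code base in an editor or in continuous integration.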

Using Python and GitHub for Team Formation and Assessment

Gregory M. Kapfhammer
Sunday 10 a.m.–1 p.m. in Expo Hall

Since real-world software engineers collaborate through GitHub, students who are learning to create software should also work in teams. Yet, team formation is challenging for instructors who must surface student skills and interests when forming balanced teams. Instructors also face the challenge of quickly and accurately assessing both individual contributions and overall team effectiveness. Both the formation of unstable teams and an instructor's slow or incorrect assessment of students and their teams will compromise the learning objectives for an assignment. Since the manual creation and assessment of teams is time consuming and error-prone, instructors need automated tools to support this process. This poster will explain how to use the Python programming language and the GitHub platform to form and assess student teams. After introducing surveys that capture student interests and skills, the poster will show how to use the [GatorGrouper tool](https://github.com/GatorEducator/gatorgrouper) that forms teams according to the survey results, a student roster, and the desired number and size of the teams. In addition to introducing the behavior and trade-offs associated with GatorGrouper's techniques for group formation, the poster will highlight features like the one that accounts for student absences. Since GatorGrouper outputs teams in a format accepted by [GitHub Classroom](https://classroom.github.com/), this poster will show how to use this platform to create a version control repository for each team. After explaining how teams can collaborate with GitHub's issue tracker and flow model, this poster will highlight how to use Python to assess both individual and team effectiveness. 
Along with showing how to access the issue tracker, pull request, and branch history of a GitHub repository, the poster will show how the [GatorGrader tool](https://github.com/GatorEducator/gatorgrader) can use this information to ensure that, for instance, the team effectively shared the assignment's workload and each student made acceptable individual contributions. Software engineers and programming educators who visit this poster will learn how the Python language and the GitHub platform support simple ways to form and assess teams. Accessible to individuals from a wide variety of backgrounds and skill levels, this poster will explain how teachers can use tools, like GatorGrouper, GatorGrader, and GitHub, to ensure that students master the real-world skills that will enable them to be effective software engineers.
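As a rough picture of automated team formation, the sketch below shuffles a roster, drops absent students, and deals the remaining students into teams. It is not GatorGrouper's algorithm or interface: the `form_teams` function, its parameters, and the sample roster are hypothetical, and it ignores survey responses entirely.

```python
import random


def form_teams(roster, team_size, absent=(), seed=None):
    """Shuffle the present students and deal them into teams of about `team_size`."""
    absent_set = set(absent)
    present = [student for student in roster if student not in absent_set]

    rng = random.Random(seed)  # a seed makes the grouping reproducible
    rng.shuffle(present)

    n_teams = max(1, len(present) // team_size)
    teams = [[] for _ in range(n_teams)]
    for index, student in enumerate(present):
        # Round-robin dealing keeps team sizes within one of each other.
        teams[index % n_teams].append(student)
    return teams


roster = ["ana", "ben", "chen", "dev", "eli", "fay", "gus"]
teams = form_teams(roster, team_size=3, absent=["gus"], seed=42)
for number, team in enumerate(teams, start=1):
    print(number, team)
```

A survey-aware approach would replace the shuffle with an ordering or scoring step based on student skills and interests, which is where the trade-offs discussed in the poster come in.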

Using Python for Biomedical Image Processing

Dipam Paul
Sunday 10 a.m.–1 p.m. in Expo Hall

This poster presents a more efficient procedure for extracting features from a training dataset of biomedical images (of the heart, lungs, retina, and so on) and applies a classification algorithm to test and diagnose the patient in question. The process is not meant to replace a doctor's expertise; rather, it improves the overall efficiency of the diagnostic workflow currently followed around the world, all using Python!