Sunday 10 a.m.–1 p.m. in

Managing Machine Learning Experiments

Seb Arnold

Description

Managing experimental results in machine learning can be a daunting task. Researchers and practitioners often try a variety of algorithms, hyper-parameters, and pre-processing techniques, each resulting in different outcomes. Tracking and analyzing each of these outcomes is a burden, further amplified when dealing with multiple collaborators and several compute nodes. In this presentation, I will share my experience of managing more than 1,200 experimental results, run in parallel on 8 compute nodes with 5 collaborators over a span of 6 months. I will focus on the usage of the [randopt](https://seba-1511.github.io/randopt/) package for experiment management and visualization. Specifically, I will introduce the typical randopt workflow, which consists of experiment creation, hyper-parameter selection, and results visualization. Randopt is an [open-source](https://github.com/seba-1511/randopt) library for experiment management. It is written in pure Python, is dependency-free, and is available on [PyPI](https://pypi.python.org/pypi/randopt). Interactive, web-based experimental reports are generated via the built-in command line utility, and a programmatic API is also available. Randopt is compatible with all Python packages, including PyTorch, TensorFlow, scikit-learn, and numpy/scipy. While randopt was developed with machine learning in mind, it is agnostic to the nature of the experiments, which makes it suitable for general-purpose scientific experiment management.
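To give a rough feel for the three workflow steps named above (experiment creation, hyper-parameter selection, and results inspection), here is a minimal sketch that uses only the Python standard library. It is not randopt's API: every function name, the `trial_NNNN.json` file layout, and the toy `run_experiment` objective are assumptions made for illustration only.

```python
import json
import random
import tempfile
from pathlib import Path


def run_experiment(learning_rate, momentum):
    """Hypothetical stand-in for a training run; returns a loss to minimize."""
    return (learning_rate - 0.1) ** 2 + (momentum - 0.9) ** 2


def random_search(results_dir, n_trials, seed=0):
    """Sample hyper-parameters at random and write one JSON file per trial."""
    rng = random.Random(seed)
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    for trial in range(n_trials):
        # Hyper-parameter selection: draw each value uniformly from [0, 1].
        params = {
            "learning_rate": rng.uniform(0.0, 1.0),
            "momentum": rng.uniform(0.0, 1.0),
        }
        record = {
            "trial": trial,
            "params": params,
            "result": run_experiment(**params),
        }
        # One file per trial (an assumption of this sketch) keeps writes independent.
        path = results_dir / f"trial_{trial:04d}.json"
        path.write_text(json.dumps(record))


def load_results(results_dir):
    """Read every stored trial back into a list of dictionaries."""
    paths = sorted(Path(results_dir).glob("*.json"))
    return [json.loads(p.read_text()) for p in paths]


def best_result(records):
    """Return the record with the lowest result value."""
    return min(records, key=lambda r: r["result"])


def summarize(records, top=3):
    """Format the `top` best trials as a small plain-text report."""
    ranked = sorted(records, key=lambda r: r["result"])[:top]
    lines = []
    for r in ranked:
        p = r["params"]
        lines.append(
            f"trial {r['trial']:>4}  result={r['result']:.4f}  "
            f"lr={p['learning_rate']:.3f}  momentum={p['momentum']:.3f}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        random_search(tmp, n_trials=20, seed=42)
        print(summarize(load_results(tmp)))
```

Storing each trial as its own small file is one simple way to let several nodes or collaborators add results independently, provided each writer uses distinct file names; a real tool would also need to handle naming collisions and richer reporting.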