Workflow-oriented systems have many uses, including data processing and analysis, ETL, CI/CD, and more. But creating a programmatic interface to a workflow system is a delicate balancing act: we want the API to be flexible enough to support useful work, but also constrained enough that tasks run cooperatively within the larger system.
We faced this challenge when designing the task API for the Pants build system. We needed to allow custom task code to enjoy the benefits of complex features like caching, concurrency and remote execution, without every task author having to reason about them.
In this talk we'll show how we found the right balance through unconventional use of Python's type annotations, coroutines, and dataclasses. Combining these seemingly disparate features in the context of a workflow engine allows you to build elegant extensibility APIs with just the right amount of flexibility.