Creating Complex Data Pipelines in the Cloud: The App Engine Pipeline API
Notably, App Engine itself uses the Pipeline API to connect the pieces of our MapReduce system. The API's key use case is executing many MapReduces and offline processes in parallel to form a single data pipeline. This lets developers easily "join" data from disparate sources, which is one of the most difficult things to achieve in a distributed processing system. I'll show some specific examples of when you would want to "fan-in" and how to achieve that with the Pipeline API.
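To make the "fan-in" idea concrete, here is a minimal sketch using only the Python standard library. It is not the Pipeline API itself: the thread pool stands in for App Engine's parallel execution, and the function names (count_visits, load_plans, join_by_user, run_pipeline) and the sample data are hypothetical, chosen only for illustration. Two independent jobs run in parallel (fan-out), and a join step runs only once both results are available (fan-in).

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for two independent offline jobs
# (for example, two separate MapReduces over different data sources).
def count_visits():
    return {"alice": 3, "bob": 5}


def load_plans():
    return {"alice": "free", "bob": "paid", "carol": "paid"}


def join_by_user(visits, plans):
    """Fan-in step: combine both upstream results, keyed by user."""
    shared_users = visits.keys() & plans.keys()
    return {user: (visits[user], plans[user]) for user in shared_users}


def run_pipeline():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Fan-out: start both jobs without waiting on either.
        visits_future = pool.submit(count_visits)
        plans_future = pool.submit(load_plans)
        # Fan-in: result() blocks until each job finishes, so the join
        # only runs once both inputs exist.
        return join_by_user(visits_future.result(), plans_future.result())


if __name__ == "__main__":
    print(run_pipeline())
```

In this sketch the join keeps only users present in both sources, which mirrors an inner join. A real pipeline would also need durable state and retries, which a local thread pool does not provide.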