Apache Beam

Unified batch and streaming programming model for defining portable data processing pipelines and executing them on a range of different engines. Originating from the Google Dataflow model, it unifies both styles of processing by treating static data sets as streams (which happen to have a beginning and an end). It achieves data correctness and handles late-arriving data through a set of abstractions that give users control over the estimated quality of arrived data (completeness), how long to wait for results (latency) and how much speculative or redundant computation to do (cost). Allows business logic, data characteristics and trade-off strategies to be defined in different programming languages through pluggable language SDKs (with out-of-the-box support for Java and Python). Supports a range of pluggable runtime platforms through pipeline runners: a direct runner (for developing and testing pipelines in a non-distributed environment), Apache Apex, Flink, Spark and (under development) Gearpump runners, and a Google Cloud Dataflow runner. Also supports a growing set of connectors (IOs) that allow pipelines to read from and write to various data storage systems. An Apache project, open sourced by Google in January 2016, which graduated from the Apache Incubator in January 2017 and had its first stable release (2.0) in May 2017. Written in Java and Python and under active development, with a large number of contributors including Google, data Artisans, Talend and PayPal.
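The completeness, latency and cost trade-off described above can be illustrated with a minimal sketch. This is plain Python, not the Beam SDK: the names `process`, `WINDOW_SIZE` and `allowed_lateness` are hypothetical, and the watermark is assumed to be simply the highest event time seen so far, which is far cruder than the estimates a real runner uses. A bounded list is consumed in arrival order exactly as a stream would be.

```python
from collections import defaultdict

WINDOW_SIZE = 60  # seconds per fixed event-time window (illustrative value)


def process(events, allowed_lateness):
    """Sum values per fixed event-time window, consuming events in arrival order.

    events: iterable of (event_time_seconds, value) pairs.
    allowed_lateness: seconds a window keeps accepting data after the
        watermark has passed the end of the window.
    Returns (sums_per_window_start, dropped_events).
    """
    sums = defaultdict(int)
    dropped = []
    watermark = 0  # assumption: highest event time seen so far
    for event_time, value in events:
        watermark = max(watermark, event_time)
        start = event_time - (event_time % WINDOW_SIZE)
        window_end = start + WINDOW_SIZE
        if watermark >= window_end + allowed_lateness:
            # The window is already considered complete: data arrived too late.
            dropped.append((event_time, value))
        else:
            sums[start] += value
    return dict(sums), dropped


# A bounded data set treated as a stream; (50, 4) and (10, 6) arrive out of order.
events = [(5, 1), (20, 2), (70, 3), (50, 4), (130, 5), (10, 6)]

print(process(events, allowed_lateness=30))
# ({0: 7, 60: 3, 120: 5}, [(10, 6)])
print(process(events, allowed_lateness=0))
# ({0: 3, 60: 3, 120: 5}, [(50, 4), (10, 6)])
```

Raising `allowed_lateness` admits more late data (better completeness) at the price of keeping windows open longer (higher latency and more retained state); lowering it does the opposite.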

Technology Information

Other Names: Beam
Vendors: The Apache Software Foundation
Type: Commercial Open Source
Last Updated: August 2019 - 2.14

Release History

Version   Release date   Release links              Release comment
2.1       2017-08-23     release notes
2.2       2017-12-02     release notes
2.3       2018-02-19     blog post; release notes
2.4       2018-03-20     release notes
2.5       2018-06-26     blog post; release notes
2.6       2018-08-08     blog post; release notes
2.7       2018-10-02     blog post; release notes
2.8       2018-10-31     blog post; release notes
2.9       2018-12-19     blog post; release notes
2.10      2019-02-01     blog post; release notes
2.11      2019-03-05     blog post; release notes
2.12      2019-04-25     blog post; release notes
2.13      2019-05-22     blog post; release notes
2.14      2019-08-07     blog post


Blog Posts