beam-commits mailing list archives

From da...@apache.org
Subject [2/3] incubator-beam-site git commit: Add content files missed in PR2 due to an overly aggressive .gitignore filter
Date Thu, 17 Mar 2016 23:41:18 GMT
Add content files missed in PR2 due to an overly aggressive .gitignore filter

Fix baseurl used for local staging, and s/compatability/capability/ in blog post filename

Fix one more s/compatability/capability/ typo

Fix up capability rename for reals.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/04e1fbbb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/04e1fbbb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/04e1fbbb

Branch: refs/heads/asf-site
Commit: 04e1fbbb035d6bfd53268774daa85c7c98105d93
Parents: b04b002
Author: Tyler Akidau <takidau@apache.org>
Authored: Thu Mar 17 15:54:47 2016 -0700
Committer: Davor Bonaci <davor@google.com>
Committed: Thu Mar 17 16:34:05 2016 -0700

----------------------------------------------------------------------
 _posts/2016-03-17-capability-matrix.md          |  596 +++++++
 _posts/2016-03-17-compatability-matrix.md       |  596 -------
 .../2016/03/17/capability-matrix.html           |  896 ++++++++++
 content/blog/index.html                         |    2 +-
 content/capability-matrix/index.html            | 1652 ++++++++++++++++++
 content/feed.xml                                |   10 +-
 content/index.html                              |    2 +-
 7 files changed, 3151 insertions(+), 603 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/04e1fbbb/_posts/2016-03-17-capability-matrix.md
----------------------------------------------------------------------
diff --git a/_posts/2016-03-17-capability-matrix.md b/_posts/2016-03-17-capability-matrix.md
new file mode 100644
index 0000000..9ede7d9
--- /dev/null
+++ b/_posts/2016-03-17-capability-matrix.md
@@ -0,0 +1,596 @@
+---
+layout: post
+title:  "Clarifying & Formalizing Runner Capabilities"
+date:   2016-03-17 11:00:00 -0700
+excerpt_separator: <!--more-->
+categories: beam capability
+authors:
+  - fjp
+  - takidau
+
+capability-matrix-snapshot:
+  columns:
+    - class: model
+      name: Beam Model
+    - class: dataflow
+      name: Google Cloud Dataflow
+    - class: flink
+      name: Apache Flink
+    - class: spark
+      name: Apache Spark
+  categories:
+    - description: What is being computed?
+      anchor: what
+      color-b: 'ca1'
+      color-y: 'ec3'
+      color-p: 'fe5'
+      color-n: 'ddd'
+      rows:
+        - name: ParDo
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: element-wise processing
+              l3: Element-wise transformation parameterized by a chunk of user code. Elements are processed in bundles, with initialization and termination hooks. Bundle size is chosen by the runner and cannot be controlled by user code. ParDo processes a main input PCollection one element at a time, but provides side input access to additional PCollections.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Batch mode uses large bundle sizes. Streaming uses smaller bundle sizes.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: ParDo itself, as per-element transformation with UDFs, is fully supported by Flink for both batch and streaming.
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: ParDo applies per-element transformations as Spark FlatMapFunction.
+        - name: GroupByKey
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: key grouping
+              l3: Grouping of key-value pairs per key, window, and pane. (See also other tabs.)
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "Uses Flink's keyBy for key grouping. When grouping by window in streaming (creating the panes) the Flink runner uses the Beam code. This guarantees support for all windowing and triggering mechanisms."
+            - class: spark
+              l1: 'Partially'
+              l2: group by window in batch only
+              l3: "Uses Spark's groupByKey for grouping. Grouping by window is currently only supported in batch."
+        - name: Flatten
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: collection concatenation
+              l3: Concatenates multiple homogeneously typed collections together.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+              
+        - name: Combine
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: associative &amp; commutative aggregation
+              l3: 'Application of an associative, commutative operation over all values ("globally") or over all values associated with each key ("per key"). Can be implemented using ParDo, but often more efficient implementations exist.'
+            - class: dataflow
+              l1: 'Yes'
+              l2: 'efficient execution'
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: 'fully supported'
+              l3: Uses a combiner for pre-aggregation for batch and streaming.
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: Supports GroupedValues, Globally and PerKey.
+
+        - name: Composite Transforms
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined transformation subgraphs
+              l3: Allows easy extensibility for library writers.  In the near future, we expect there to be more information provided at this level -- customized metadata hooks for monitoring, additional runtime/environment hooks, etc.
+            - class: dataflow
+              l1: 'Partially'
+              l2: supported via inlining
+              l3: Currently composite transformations are inlined during execution. The structure is later recreated from the names, but other transform level information (if added to the model) will be lost.
+            - class: flink
+              l1: 'Partially'
+              l2: supported via inlining
+              l3: ''
+            - class: spark
+              l1: 'Partially'
+              l2: supported via inlining
+              l3: ''
+
+        - name: Side Inputs
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: additional elements available during DoFn execution
+              l3: Side inputs are additional <tt>PCollections</tt> whose contents are computed during pipeline execution and then made accessible to DoFn code. The exact shape of the side input depends both on the <tt>PCollectionView</tt> used to describe the access pattern (iterable, map, singleton) and the window of the element from the main input that is currently being processed.
+            - class: dataflow
+              l1: 'Yes'
+              l2: some size restrictions in streaming
+              l3: Batch mode supports a distributed implementation, but streaming mode may force some size restrictions. Neither mode is able to push lookups directly up into key-based sources.
+            - class: flink
+              jira: BEAM-102
+              l1: 'Partially'
+              l2: not supported in streaming
+              l3: Supported in batch. Side inputs for streaming are currently WiP.
+            - class: spark
+              l1: 'Partially'
+              l2: not supported in streaming
+              l3: "Side input is actually a broadcast variable in Spark so it can't be updated during the life of a job. Spark-runner implementation of side input is more of an immutable, static, side input."
+
+        - name: Source API
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined sources
+              l3: Allows users to provide additional input sources. Supports both bounded and unbounded data. Includes hooks necessary to provide efficient parallelization (size estimation, progress information, dynamic splitting, etc).
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: 
+            - class: flink
+              jira: BEAM-103
+              l1: 'Partially'
+              l2: parallelism 1 in streaming
+              l3: Fully supported in batch. In streaming, sources currently run with parallelism 1.
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: 
+              
+        - name: Aggregators
+          values:
+            - class: model
+              l1: 'Partially'
+              l2: user-provided metrics
+              l3: Allow transforms to aggregate simple metrics across bundles in a <tt>DoFn</tt>. Semantically equivalent to using a side output, but support partial results as the transform executes. Will likely want to augment <tt>Aggregators</tt> to be more useful for processing unbounded data by making them windowed.
+            - class: dataflow
+              l1: 'Partially'
+              l2: may miscount in streaming mode
+              l3: Current model is fully supported in batch mode. In streaming mode, <tt>Aggregators</tt> may under- or overcount when bundles are retried.
+            - class: flink
+              l1: 'Partially'
+              l2: may undercount in streaming
+              l3: Current model is fully supported in batch. In streaming mode, <tt>Aggregators</tt> may undercount.
+            - class: spark
+              l1: 'Partially'
+              l2: streaming requires more testing
+              l3: "Uses Spark's <tt>AccumulatorParam</tt> mechanism"
+
+        - name: Keyed State
+          values:
+            - class: model
+              jira: BEAM-25
+              l1: 'No'
+              l2: storage per key, per window
+              l3: Allows fine-grained access to per-key, per-window persistent state. Necessary for certain use cases (e.g. high-volume windows which store large amounts of data, but typically only access small portions of it; complex state machines; etc.) that are not easily or efficiently addressed via <tt>Combine</tt> or <tt>GroupByKey</tt>+<tt>ParDo</tt>. 
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: Dataflow already supports keyed state internally, so adding support for this should be easy once the Beam model exposes it.
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: Flink already supports keyed state, so adding support for this should be easy once the Beam model exposes it.
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: Spark supports keyed state with mapWithState() so support should be straightforward.
+              
+              
+    - description: Where in event time?
+      anchor: where
+      color-b: '37d'
+      color-y: '59f'
+      color-p: '8cf'
+      color-n: 'ddd'
+      rows:
+        - name: Global windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: all time
+              l3: The default window which covers all of time. (Basically how traditional batch cases fit in the model.)
+            - class: dataflow
+              l1: 'Yes'
+              l2: default
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: spark
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+              
+        - name: Fixed windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: periodic, non-overlapping
+              l3: Fixed-size, timestamp-based windows. (Hourly, Daily, etc)
+            - class: dataflow
+              l1: 'Yes'
+              l2: built-in
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: spark
+              l1: Partially
+              l2: currently only supported in batch
+              l3: ''
+              
+        - name: Sliding windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: periodic, overlapping
+              l3: Possibly overlapping fixed-size timestamp-based windows (Every minute, use the last ten minutes of data.)
+            - class: dataflow
+              l1: 'Yes'
+              l2: built-in
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+
+        - name: Session windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: activity-based
+              l3: Based on bursts of activity separated by a gap size. Different per key.
+            - class: dataflow
+              l1: 'Yes'
+              l2: built-in
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+        - name: Custom windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined windows
+              l3: All windows must implement <tt>BoundedWindow</tt>, which specifies a max timestamp. Each <tt>WindowFn</tt> assigns elements to an associated window.
+            - class: dataflow
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+        - name: Custom merging windows
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user-defined merging windows
+              l3: A custom <tt>WindowFn</tt> additionally specifies whether and how to merge windows.
+            - class: dataflow
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+        - name: Timestamp control
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: output timestamp for window panes
+              l3: For a grouping transform, such as GBK or Combine, an OutputTimeFn specifies (1) how to combine input timestamps within a window and (2) how to merge aggregated timestamps when windows merge.
+            - class: dataflow
+              l1: 'Yes'
+              l2: supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: pending Spark engine support
+              l3: ''
+
+
+              
+    - description: When in processing time?
+      anchor: when
+      color-b: '6a4'
+      color-y: '8c6'
+      color-p: 'ae8'
+      color-n: 'ddd'
+      rows:
+        
+        - name: Configurable triggering
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: user customizable
+              l3: Triggering may be specified by the user (instead of simply driven by hardcoded defaults).
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Fully supported in streaming mode. In batch mode, intermediate trigger firings are effectively meaningless.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+
+        - name: Event-time triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: relative to event time
+              l3: Triggers that fire in response to event-time completeness signals, such as watermarks progressing.
+            - class: dataflow
+              l1: 'Yes'
+              l2: yes in streaming, fixed granularity in batch
+              l3: Fully supported in streaming mode. In batch mode, currently watermark progress jumps from the beginning of time to the end of time once the input has been fully consumed, thus no additional triggering granularity is available.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: Processing-time triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: relative to processing time
+              l3: Triggers that fire in response to processing-time advancing.
+            - class: dataflow
+              l1: 'Yes'
+              l2: yes in streaming, fixed granularity in batch
+              l3: Fully supported in streaming mode. In batch mode, from the perspective of triggers, processing time currently jumps from the beginning of time to the end of time once the input has been fully consumed, thus no additional triggering granularity is available.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'Yes'
+              l2: "This is Spark streaming's native model"
+              l3: "Spark processes streams in micro-batches. The micro-batch size is actually a pre-set, fixed, time interval. Currently, the runner takes the first window size in the pipeline and sets its size as the batch interval. Any following window operations will be considered processing time windows and will affect triggering."
+              
+        - name: Count triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: every N elements
+              l3: Triggers that fire after seeing at least N elements.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Fully supported in streaming mode. In batch mode, elements are processed in the largest bundles possible, so count-based triggers are effectively meaningless.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+
+        - name: '[Meta]data driven triggers'
+          values:
+            - class: model
+              jira: BEAM-101
+              l1: 'No'
+              l2: in response to data
+              l3: Triggers that fire in response to attributes of the data being processed.
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: 
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: 
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: 
+
+        - name: Composite triggers
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: compositions of one or more sub-triggers
+              l3: Triggers which compose other triggers in more complex structures, such as logical AND, logical OR, early/on-time/late, etc.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: Allowed lateness
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: event-time bound on window lifetimes
+              l3: A way to bound the useful lifetime of a window (in event time), after which any unemitted results may be materialized, the window contents may be garbage collected, and any additional late data that arrive for the window may be discarded.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Fully supported in streaming mode. In batch mode no data is ever late.
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: Timers
+          values:
+            - class: model
+              jira: BEAM-27
+              l1: 'No'
+              l2: delayed processing callbacks
+              l3: A fine-grained mechanism for performing work at some point in the future, in either the event-time or processing-time domain. Useful for orchestrating delayed events, timeouts, etc. in complex per-key, per-window state machines.
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: Dataflow already supports timers internally, so adding support for this should be easy once the Beam model exposes it.
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: Flink already supports timers internally, so adding support for this should be easy once the Beam model exposes it.
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+              
+              
+    - description: How do refinements relate?
+      anchor: how
+      color-b: 'b55'
+      color-y: 'd77'
+      color-p: 'faa'
+      color-n: 'ddd'
+      rows:
+        
+        - name: Discarding
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: panes discard elements when fired
+              l3: Elements are discarded from accumulated state as their pane is fired.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: ''
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'Yes'
+              l2: fully supported
+              l3: 'Spark streaming natively discards elements after firing.'
+              
+        - name: Accumulating
+          values:
+            - class: model
+              l1: 'Yes'
+              l2: panes accumulate elements across firings
+              l3: Elements are accumulated in state across multiple pane firings for the same window.
+            - class: dataflow
+              l1: 'Yes'
+              l2: fully supported
+              l3: Requires that the accumulated pane fits in memory, after being passed through the combiner (if relevant)
+            - class: flink
+              l1: 'Yes'
+              l2: fully supported
+              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
+            - class: spark
+              l1: 'No'
+              l2: ''
+              l3: ''
+              
+        - name: 'Accumulating &amp; Retracting'
+          values:
+            - class: model
+              jira: BEAM-91
+              l1: 'No'
+              l2: accumulation plus retraction of old panes
+              l3: Elements are accumulated across multiple pane firings and old emitted values are retracted. Also known as "backsies" ;-D
+            - class: dataflow
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+            - class: flink
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+            - class: spark
+              l1: 'No'
+              l2: pending model support
+              l3: ''
+              
+
+---
+
+With initial code drops complete ([Dataflow SDK and Runner](https://github.com/apache/incubator-beam/pull/1), [Flink Runner](https://github.com/apache/incubator-beam/pull/12), [Spark Runner](https://github.com/apache/incubator-beam/pull/42)) and expressed interest in runner implementations for [Storm](https://issues.apache.org/jira/browse/BEAM-9), [Hadoop](https://issues.apache.org/jira/browse/BEAM-19), and [Gearpump](https://issues.apache.org/jira/browse/BEAM-79) (amongst others), we wanted to start addressing a big question in the Apache Beam (incubating) community: what capabilities will each runner be able to support?
+
+While we’d love to have a world where all runners support the full suite of semantics included in the Beam Model (formerly referred to as the [Dataflow Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), practically speaking, there will always be certain features that some runners can’t provide. For example, a Hadoop-based runner would be inherently batch-based and may be unable to (easily) implement support for unbounded collections. However, that doesn’t prevent it from being extremely useful for a large set of uses. In other cases, the implementations provided by one runner may have slightly different semantics than those provided by another (e.g. even though the current suite of runners all support exactly-once delivery guarantees, an [Apache Samza](http://samza.apache.org/) runner, which would be a welcome addition, would currently only support at-least-once).
+
+To help clarify things, we’ve been working on enumerating the key features of the Beam model in a [capability matrix]({{ site.baseurl }}/capability-matrix/) for all existing runners, categorized around the four key questions addressed by the model: <span class="wwwh-what-dark">What</span> / <span class="wwwh-where-dark">Where</span> / <span class="wwwh-when-dark">When</span> / <span class="wwwh-how-dark">How</span> (if you’re not familiar with those questions, you might want to read through [Streaming 102](http://oreilly.com/ideas/the-world-beyond-batch-streaming-102) for an overview). This table will be maintained over time as the model evolves, our understanding grows, and runners are created or features added.
+
+Included below is a summary snapshot of our current understanding of the capabilities of the existing runners (see the [live version]({{ site.baseurl }}/capability-matrix/) for full details, descriptions, and Jira links); since integration is still under way, the system as a whole isn’t yet in a completely stable, usable state. But that should be changing in the near future, and we’ll be updating loud and clear on this blog when the first supported Beam 1.0 release happens.
+
+In the meantime, these tables should help clarify where we expect to be in the very near term, and help guide expectations about what existing runners are capable of, and what features runner implementers will be tackling next.
+
+{% include capability-matrix-common.md %}
+{% assign cap-data=page.capability-matrix-snapshot %}
+
+<!-- Summary table -->
+{% assign cap-style='cap-summary' %}
+{% assign cap-view='blog' %}
+{% assign cap-other-view='full' %}
+{% assign cap-toggle-details=1 %}
+{% assign cap-display='block' %}
+
+{% include capability-matrix.md %}

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/04e1fbbb/_posts/2016-03-17-compatability-matrix.md
----------------------------------------------------------------------
diff --git a/_posts/2016-03-17-compatability-matrix.md b/_posts/2016-03-17-compatability-matrix.md
deleted file mode 100644
index df4df31..0000000
--- a/_posts/2016-03-17-compatability-matrix.md
+++ /dev/null
@@ -1,596 +0,0 @@
----
-layout: post
-title:  "Clarifying & Formalizing Runner Capabilities"
-date:   2016-03-17 11:00:00 -0700
-excerpt_separator: <!--more-->
-categories: beam compatibility
-authors:
-  - fjp
-  - takidau
-
-capability-matrix-snapshot:
-  columns:
-    - class: model
-      name: Beam Model
-    - class: dataflow
-      name: Google Cloud Dataflow
-    - class: flink
-      name: Apache Flink
-    - class: spark
-      name: Apache Spark
-  categories:
-    - description: What is being computed?
-      anchor: what
-      color-b: 'ca1'
-      color-y: 'ec3'
-      color-p: 'fe5'
-      color-n: 'ddd'
-      rows:
-        - name: ParDo
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: element-wise processing
-              l3: Element-wise transformation parameterized by a chunk of user code. Elements are processed in bundles, with initialization and termination hooks. Bundle size is chosen by the runner and cannot be controlled by user code. ParDo processes a main input PCollection one element at a time, but provides side input access to additional PCollections.
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: Batch mode uses large bundle sizes. Streaming uses smaller bundle sizes.
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: ParDo itself, as per-element transformation with UDFs, is fully supported by Flink for both batch and streaming.
-            - class: spark
-              l1: 'Yes'
-              l2: fully supported
-              l3: ParDo applies per-element transformations as Spark FlatMapFunction.
-        - name: GroupByKey
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: key grouping
-              l3: Grouping of key-value pairs per key, window, and pane. (See also other tabs.)
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "Uses Flink's keyBy for key grouping. When grouping by window in streaming (creating the panes) the Flink runner uses the Beam code. This guarantees support for all windowing and triggering mechanisms."
-            - class: spark
-              l1: 'Partially'
-              l2: group by window in batch only
-              l3: "Uses Spark's groupByKey for grouping. Grouping by window is currently only supported in batch."
-        - name: Flatten
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: collection concatenation
-              l3: Concatenates multiple homogeneously typed collections together.
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: ''
-            - class: spark
-              l1: 'Yes'
-              l2: fully supported
-              l3: ''
-              
-        - name: Combine
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: associative &amp; commutative aggregation
-              l3: 'Application of an associative, commutative operation over all values ("globally") or over all values associated with each key ("per key"). Can be implemented using ParDo, but often more efficient implementations exist.'
-            - class: dataflow
-              l1: 'Yes'
-              l2: 'efficient execution'
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: 'fully supported'
-              l3: Uses a combiner for pre-aggregation for batch and streaming.
-            - class: spark
-              l1: 'Yes'
-              l2: fully supported
-              l3: Supports GroupedValues, Globally and PerKey.
-
-        - name: Composite Transforms
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: user-defined transformation subgraphs
-              l3: Allows easy extensibility for library writers.  In the near future, we expect there to be more information provided at this level -- customized metadata hooks for monitoring, additional runtime/environment hooks, etc.
-            - class: dataflow
-              l1: 'Partially'
-              l2: supported via inlining
-              l3: Currently composite transformations are inlined during execution. The structure is later recreated from the names, but other transform level information (if added to the model) will be lost.
-            - class: flink
-              l1: 'Partially'
-              l2: supported via inlining
-              l3: ''
-            - class: spark
-              l1: 'Partially'
-              l2: supported via inlining
-              l3: ''
-
-        - name: Side Inputs
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: additional elements available during DoFn execution
-              l3: Side inputs are additional <tt>PCollections</tt> whose contents are computed during pipeline execution and then made accessible to DoFn code. The exact shape of the side input depends both on the <tt>PCollectionView</tt> used to describe the access pattern (iterable, map, singleton) and the window of the element from the main input that is currently being processed.
-            - class: dataflow
-              l1: 'Yes'
-              l2: some size restrictions in streaming
-              l3: Batch mode supports a distributed implementation, but streaming mode may impose some size restrictions. Neither mode is able to push lookups directly up into key-based sources.
-            - class: flink
-              jira: BEAM-102
-              l1: 'Partially'
-              l2: not supported in streaming
-              l3: Supported in batch. Side inputs for streaming are currently WiP.
-            - class: spark
-              l1: 'Partially'
-              l2: not supported in streaming
-              l3: "Side input is actually a broadcast variable in Spark so it can't be updated during the life of a job. Spark-runner implementation of side input is more of an immutable, static, side input."
-
-        - name: Source API
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: user-defined sources
-              l3: Allows users to provide additional input sources. Supports both bounded and unbounded data. Includes hooks necessary to provide efficient parallelization (size estimation, progress information, dynamic splitting, etc).
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: 
-            - class: flink
-              jira: BEAM-103
-              l1: 'Partially'
-              l2: parallelism 1 in streaming
-              l3: Fully supported in batch. In streaming, sources currently run with parallelism 1.
-            - class: spark
-              l1: 'Yes'
-              l2: fully supported
-              l3: 
-              
-        - name: Aggregators
-          values:
-            - class: model
-              l1: 'Partially'
-              l2: user-provided metrics
-              l3: Allow transforms to aggregate simple metrics across bundles in a <tt>DoFn</tt>. Semantically equivalent to using a side output, but support partial results as the transform executes. Will likely want to augment <tt>Aggregators</tt> to be more useful for processing unbounded data by making them windowed.
-            - class: dataflow
-              l1: 'Partially'
-              l2: may miscount in streaming mode
-              l3: Current model is fully supported in batch mode. In streaming mode, <tt>Aggregators</tt> may under- or overcount when bundles are retried.
-            - class: flink
-              l1: 'Partially'
-              l2: may undercount in streaming
-              l3: Current model is fully supported in batch. In streaming mode, <tt>Aggregators</tt> may undercount.
-            - class: spark
-              l1: 'Partially'
-              l2: streaming requires more testing
-              l3: "Uses Spark's <tt>AccumulatorParam</tt> mechanism"
-
-        - name: Keyed State
-          values:
-            - class: model
-              jira: BEAM-25
-              l1: 'No'
-              l2: storage per key, per window
-              l3: Allows fine-grained access to per-key, per-window persistent state. Necessary for certain use cases (e.g. high-volume windows which store large amounts of data, but typically only access small portions of it; complex state machines; etc.) that are not easily or efficiently addressed via <tt>Combine</tt> or <tt>GroupByKey</tt>+<tt>ParDo</tt>. 
-            - class: dataflow
-              l1: 'No'
-              l2: pending model support
-              l3: Dataflow already supports keyed state internally, so adding support for this should be easy once the Beam model exposes it.
-            - class: flink
-              l1: 'No'
-              l2: pending model support
-              l3: Flink already supports keyed state, so adding support for this should be easy once the Beam model exposes it.
-            - class: spark
-              l1: 'No'
-              l2: pending model support
-              l3: Spark supports keyed state with mapWithState() so support should be straightforward.
-              
-              
-    - description: Where in event time?
-      anchor: where
-      color-b: '37d'
-      color-y: '59f'
-      color-p: '8cf'
-      color-n: 'ddd'
-      rows:
-        - name: Global windows
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: all time
-              l3: The default window which covers all of time. (Basically how traditional batch cases fit in the model.)
-            - class: dataflow
-              l1: 'Yes'
-              l2: default
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: supported
-              l3: ''
-            - class: spark
-              l1: 'Yes'
-              l2: supported
-              l3: ''
-              
-        - name: Fixed windows
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: periodic, non-overlapping
-              l3: Fixed-size, timestamp-based windows. (Hourly, Daily, etc)
-            - class: dataflow
-              l1: 'Yes'
-              l2: built-in
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: supported
-              l3: ''
-            - class: spark
-              l1: Partially
-              l2: currently only supported in batch
-              l3: ''
-              
-        - name: Sliding windows
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: periodic, overlapping
-              l3: Possibly overlapping fixed-size timestamp-based windows (Every minute, use the last ten minutes of data.)
-            - class: dataflow
-              l1: 'Yes'
-              l2: built-in
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: supported
-              l3: ''
-            - class: spark
-              l1: 'No'
-              l2: ''
-              l3: ''
-
-        - name: Session windows
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: activity-based
-              l3: Based on bursts of activity separated by a gap size. Different per key.
-            - class: dataflow
-              l1: 'Yes'
-              l2: built-in
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: pending Spark engine support
-              l3: ''
-
-        - name: Custom windows
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: user-defined windows
-              l3: All windows must implement <tt>BoundedWindow</tt>, which specifies a max timestamp. Each <tt>WindowFn</tt> assigns elements to an associated window.
-            - class: dataflow
-              l1: 'Yes'
-              l2: supported
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: pending Spark engine support
-              l3: ''
-
-        - name: Custom merging windows
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: user-defined merging windows
-              l3: A custom <tt>WindowFn</tt> additionally specifies whether and how to merge windows.
-            - class: dataflow
-              l1: 'Yes'
-              l2: supported
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: pending Spark engine support
-              l3: ''
-
-        - name: Timestamp control
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: output timestamp for window panes
-              l3: For a grouping transform, such as GBK or Combine, an OutputTimeFn specifies (1) how to combine input timestamps within a window and (2) how to merge aggregated timestamps when windows merge.
-            - class: dataflow
-              l1: 'Yes'
-              l2: supported
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: pending Spark engine support
-              l3: ''
-
-
-              
-    - description: When in processing time?
-      anchor: when
-      color-b: '6a4'
-      color-y: '8c6'
-      color-p: 'ae8'
-      color-n: 'ddd'
-      rows:
-        
-        - name: Configurable triggering
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: user customizable
-              l3: Triggering may be specified by the user (instead of simply driven by hardcoded defaults).
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: Fully supported in streaming mode. In batch mode, intermediate trigger firings are effectively meaningless.
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: ''
-              l3: ''
-
-        - name: Event-time triggers
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: relative to event time
-              l3: Triggers that fire in response to event-time completeness signals, such as watermarks progressing.
-            - class: dataflow
-              l1: 'Yes'
-              l2: yes in streaming, fixed granularity in batch
-              l3: Fully supported in streaming mode. In batch mode, currently watermark progress jumps from the beginning of time to the end of time once the input has been fully consumed, thus no additional triggering granularity is available.
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: ''
-              l3: ''
-              
-        - name: Processing-time triggers
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: relative to processing time
-              l3: Triggers that fire in response to processing-time advancing.
-            - class: dataflow
-              l1: 'Yes'
-              l2: yes in streaming, fixed granularity in batch
-              l3: Fully supported in streaming mode. In batch mode, from the perspective of triggers, processing time currently jumps from the beginning of time to the end of time once the input has been fully consumed, thus no additional triggering granularity is available.
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'Yes'
-              l2: "This is Spark streaming's native model"
-              l3: "Spark processes streams in micro-batches. The micro-batch size is actually a pre-set, fixed, time interval. Currently, the runner takes the first window size in the pipeline and sets its size as the batch interval. Any following window operations will be considered processing time windows and will affect triggering."
-              
-        - name: Count triggers
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: every N elements
-              l3: Triggers that fire after seeing at least N elements.
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: Fully supported in streaming mode. In batch mode, elements are processed in the largest bundles possible, so count-based triggers are effectively meaningless.
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: ''
-              l3: ''
-
-        - name: '[Meta]data driven triggers'
-          values:
-            - class: model
-              jira: BEAM-101
-              l1: 'No'
-              l2: in response to data
-              l3: Triggers that fire in response to attributes of the data being processed.
-            - class: dataflow
-              l1: 'No'
-              l2: pending model support
-              l3: 
-            - class: flink
-              l1: 'No'
-              l2: pending model support
-              l3: 
-            - class: spark
-              l1: 'No'
-              l2: pending model support
-              l3: 
-
-        - name: Composite triggers
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: compositions of one or more sub-triggers
-              l3: Triggers which compose other triggers in more complex structures, such as logical AND, logical OR, early/on-time/late, etc.
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: ''
-              l3: ''
-              
-        - name: Allowed lateness
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: event-time bound on window lifetimes
-              l3: A way to bound the useful lifetime of a window (in event time), after which any unemitted results may be materialized, the window contents may be garbage collected, and any additional late data that arrive for the window may be discarded.
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: Fully supported in streaming mode. In batch mode no data is ever late.
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: ''
-              l3: ''
-              
-        - name: Timers
-          values:
-            - class: model
-              jira: BEAM-27
-              l1: 'No'
-              l2: delayed processing callbacks
-              l3: A fine-grained mechanism for performing work at some point in the future, in either the event-time or processing-time domain. Useful for orchestrating delayed events, timeouts, etc. in complex per-key, per-window state machines.
-            - class: dataflow
-              l1: 'No'
-              l2: pending model support
-              l3: Dataflow already supports timers internally, so adding support for this should be easy once the Beam model exposes it.
-            - class: flink
-              l1: 'No'
-              l2: pending model support
-              l3: Flink already supports timers internally, so adding support for this should be easy once the Beam model exposes it.
-            - class: spark
-              l1: 'No'
-              l2: pending model support
-              l3: ''
-              
-              
-    - description: How do refinements relate?
-      anchor: how
-      color-b: 'b55'
-      color-y: 'd77'
-      color-p: 'faa'
-      color-n: 'ddd'
-      rows:
-        
-        - name: Discarding
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: panes discard elements when fired
-              l3: Elements are discarded from accumulated state as their pane is fired.
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: ''
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'Yes'
-              l2: fully supported
-              l3: 'Spark streaming natively discards elements after firing.'
-              
-        - name: Accumulating
-          values:
-            - class: model
-              l1: 'Yes'
-              l2: panes accumulate elements across firings
-              l3: Elements are accumulated in state across multiple pane firings for the same window.
-            - class: dataflow
-              l1: 'Yes'
-              l2: fully supported
-              l3: Requires that the accumulated pane fits in memory, after being passed through the combiner (if relevant)
-            - class: flink
-              l1: 'Yes'
-              l2: fully supported
-              l3: "The Runner uses Beam's Windowing and Triggering logic and code."
-            - class: spark
-              l1: 'No'
-              l2: ''
-              l3: ''
-              
-        - name: 'Accumulating &amp; Retracting'
-          values:
-            - class: model
-              jira: BEAM-91
-              l1: 'No'
-              l2: accumulation plus retraction of old panes
-              l3: Elements are accumulated across multiple pane firings and old emitted values are retracted. Also known as "backsies" ;-D
-            - class: dataflow
-              l1: 'No'
-              l2: pending model support
-              l3: ''
-            - class: flink
-              l1: 'No'
-              l2: pending model support
-              l3: ''
-            - class: spark
-              l1: 'No'
-              l2: pending model support
-              l3: ''
-              
-
----
-
-With initial code drops complete ([Dataflow SDK and Runner](https://github.com/apache/incubator-beam/pull/1), [Flink Runner](https://github.com/apache/incubator-beam/pull/12), [Spark Runner](https://github.com/apache/incubator-beam/pull/42)) and expressed interest in runner implementations for [Storm](https://issues.apache.org/jira/browse/BEAM-9), [Hadoop](https://issues.apache.org/jira/browse/BEAM-19), and [Gearpump](https://issues.apache.org/jira/browse/BEAM-79) (amongst others), we wanted to start addressing a big question in the Apache Beam (incubating) community: what capabilities will each runner be able to support?
-
-While we’d love to have a world where all runners support the full suite of semantics included in the Beam Model (formerly referred to as the [Dataflow Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), practically speaking, there will always be certain features that some runners can’t provide. For example, a Hadoop-based runner would be inherently batch-based and may be unable to (easily) implement support for unbounded collections. However, that doesn’t prevent it from being extremely useful for a large set of uses. In other cases, the implementations provided by one runner may have slightly different semantics than those provided by another (e.g. even though the current suite of runners all support exactly-once delivery guarantees, an [Apache Samza](http://samza.apache.org/) runner, which would be a welcome addition, would currently only support at-least-once).
-
-To help clarify things, we’ve been working on enumerating the key features of the Beam model in a [capability matrix]({{ site.baseurl }}/capability-matrix/) for all existing runners, categorized around the four key questions addressed by the model: <span class="wwwh-what-dark">What</span> / <span class="wwwh-where-dark">Where</span> / <span class="wwwh-when-dark">When</span> / <span class="wwwh-how-dark">How</span> (if you’re not familiar with those questions, you might want to read through [Streaming 102](http://oreilly.com/ideas/the-world-beyond-batch-streaming-102) for an overview). This table will be maintained over time as the model evolves, our understanding grows, and runners are created or features added.
-
-Included below is a summary snapshot of our current understanding of the capabilities of the existing runners (see the [live version]({{ site.baseurl }}/capability-matrix/) for full details, descriptions, and Jira links); since integration is still under way, the system as a whole isn’t yet in a completely stable, usable state. But that should be changing in the near future, and we’ll be updating loud and clear on this blog when the first supported Beam 1.0 release happens.
-
-In the meantime, these tables should help clarify where we expect to be in the very near term, and help guide expectations about what existing runners are capable of, and what features runner implementers will be tackling next.
-
-{% include capability-matrix-common.md %}
-{% assign cap-data=page.capability-matrix-snapshot %}
-
-<!-- Summary table -->
-{% assign cap-style='cap-summary' %}
-{% assign cap-view='blog' %}
-{% assign cap-other-view='full' %}
-{% assign cap-toggle-details=1 %}
-{% assign cap-display='block' %}
-
-{% include capability-matrix.md %}

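The front matter removed above stores the snapshot as categories, each holding `rows` whose `values` carry one entry per runner `class` with an `l1` support level. As a rough illustration of that shape only, the sketch below tallies `l1` levels per runner for one category. The `tallySupport` helper and the abbreviated `whatCategory` sample object are hypothetical and are not part of this commit or of the site's Liquid templates.

```javascript
// Abbreviated sample data mirroring the YAML fields used in the snapshot
// (description, rows, name, values, class, l1). Only two rows are included.
const whatCategory = {
  description: "What is being computed?",
  rows: [
    {
      name: "ParDo",
      values: [
        { class: "model", l1: "Yes" },
        { class: "dataflow", l1: "Yes" },
        { class: "flink", l1: "Yes" },
        { class: "spark", l1: "Yes" },
      ],
    },
    {
      name: "GroupByKey",
      values: [
        { class: "model", l1: "Yes" },
        { class: "dataflow", l1: "Yes" },
        { class: "flink", l1: "Yes" },
        { class: "spark", l1: "Partially" },
      ],
    },
  ],
};

// Hypothetical helper: count how many rows report each l1 level,
// grouped by runner class.
function tallySupport(category) {
  const tally = {};
  for (const row of category.rows) {
    for (const cell of row.values) {
      const counts = tally[cell.class] || (tally[cell.class] = {});
      counts[cell.l1] = (counts[cell.l1] || 0) + 1;
    }
  }
  return tally;
}

console.log(tallySupport(whatCategory));
```

A tally like this is only a convenience for reading the data. The rendered tables in this commit are produced by the Liquid includes, not by script.
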
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/04e1fbbb/content/beam/capability/2016/03/17/capability-matrix.html
----------------------------------------------------------------------
diff --git a/content/beam/capability/2016/03/17/capability-matrix.html b/content/beam/capability/2016/03/17/capability-matrix.html
new file mode 100644
index 0000000..fdefa53
--- /dev/null
+++ b/content/beam/capability/2016/03/17/capability-matrix.html
@@ -0,0 +1,896 @@
+<!DOCTYPE html>
+<html lang="en">
+
+  <head>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1">
+
+  <title>Clarifying &amp; Formalizing Runner Capabilities</title>
+  <meta name="description" content="With initial code drops complete (Dataflow SDK and Runner, Flink Runner, Spark Runner) and expressed interest in runner implementations for Storm, Hadoop, an...">
+
+  <link rel="stylesheet" href="/styles/site.css">
+  <link rel="stylesheet" href="/css/theme.css">
+  <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js"></script>
+  <script src="/js/bootstrap.min.js"></script>
+  <link rel="canonical" href="http://beam.incubator.apache.org/beam/capability/2016/03/17/capability-matrix.html">
+  <link rel="alternate" type="application/rss+xml" title="Apache Beam (incubating)" href="http://beam.incubator.apache.org/feed.xml">
+  <script>
+    (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+    (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+    })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+    ga('create', 'UA-73650088-1', 'auto');
+    ga('send', 'pageview');
+
+  </script>
+  <link rel="shortcut icon" type="image/x-icon" href="/images/favicon.ico">
+</head>
+
+
+  <body role="document">
+
+    <nav class="navbar navbar-default navbar-fixed-top">
+  <div class="container">
+    <div class="navbar-header">
+      <a href="/" class="navbar-brand" >
+        <img alt="Brand" src="/images/beam_logo_navbar.png">
+      </a>
+    </div>
+    <div id="navbar" class="navbar-collapse collapse">
+      <ul class="nav navbar-nav">
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Documentation <span class="caret"></span></a>
+          <ul class="dropdown-menu">
+            <li><a href="/getting_started/">Getting Started</a></li>
+	    <li><a href="/capability-matrix/">Capability Matrix</a></li>
+            <li><a href="https://goo.gl/ps8twC">Technical Docs</a></li>
+            <li><a href="https://goo.gl/nk5OM0">Technical Vision</a></li>
+          </ul>
+        </li>
+        <li class="dropdown">
+          <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-haspopup="true" aria-expanded="false">Community <span class="caret"></span></a>
+          <ul class="dropdown-menu">
+            <li class="dropdown-header">Community</li>
+            <li><a href="/mailing_lists/">Mailing Lists</a></li>
+            <li><a href="https://goo.gl/ps8twC">Technical Docs</a></li>
+            <li><a href="https://goo.gl/nk5OM0">Technical Vision</a></li>
+            <li><a href="/team/">Apache Beam Team</a></li>
+            <li role="separator" class="divider"></li>
+            <li class="dropdown-header">Contribute</li>
+            <li><a href="/source_repository/">Source Repository</a></li>
+            <li><a href="/issue_tracking/">Issue Tracking</a></li>
+          </ul>
+        </li>
+        <li><a href="/blog">Blog</a></li>
+      </ul>
+    </div><!--/.nav-collapse -->
+  </div>
+</nav>
+
+
+<link rel="stylesheet" href="">
+
+
+    <div class="container" role="main">
+
+      <div class="container">
+        
+
+<article class="post" itemscope itemtype="http://schema.org/BlogPosting">
+
+  <header class="post-header">
+    <h1 class="post-title" itemprop="name headline">Clarifying &amp; Formalizing Runner Capabilities</h1>
+    <p class="post-meta"><time datetime="2016-03-17T11:00:00-07:00" itemprop="datePublished">Mar 17, 2016</time> •  Frances Perry [<a href="https://twitter.com/francesjperry">@francesjperry</a>] &amp; Tyler Akidau [<a href="https://twitter.com/takidau">@takidau</a>]
+</p>
+  </header>
+
+  <div class="post-content" itemprop="articleBody">
+    <p>With initial code drops complete (<a href="https://github.com/apache/incubator-beam/pull/1">Dataflow SDK and Runner</a>, <a href="https://github.com/apache/incubator-beam/pull/12">Flink Runner</a>, <a href="https://github.com/apache/incubator-beam/pull/42">Spark Runner</a>) and expressed interest in runner implementations for <a href="https://issues.apache.org/jira/browse/BEAM-9">Storm</a>, <a href="https://issues.apache.org/jira/browse/BEAM-19">Hadoop</a>, and <a href="https://issues.apache.org/jira/browse/BEAM-79">Gearpump</a> (amongst others), we wanted to start addressing a big question in the Apache Beam (incubating) community: what capabilities will each runner be able to support?</p>
+
+<p>While we’d love to have a world where all runners support the full suite of semantics included in the Beam Model (formerly referred to as the <a href="http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf">Dataflow Model</a>), practically speaking, there will always be certain features that some runners can’t provide. For example, a Hadoop-based runner would be inherently batch-based and may be unable to (easily) implement support for unbounded collections. However, that doesn’t prevent it from being extremely useful for a large set of uses. In other cases, the implementations provided by one runner may have slightly different semantics than those provided by another (e.g. even though the current suite of runners all support exactly-once delivery guarantees, an <a href="http://samza.apache.org/">Apache Samza</a> runner, which would be a welcome addition, would currently only support at-least-once).</p>
+
+<p>To help clarify things, we’ve been working on enumerating the key features of the Beam model in a <a href="/capability-matrix/">capability matrix</a> for all existing runners, categorized around the four key questions addressed by the model: <span class="wwwh-what-dark">What</span> / <span class="wwwh-where-dark">Where</span> / <span class="wwwh-when-dark">When</span> / <span class="wwwh-how-dark">How</span> (if you’re not familiar with those questions, you might want to read through <a href="http://oreilly.com/ideas/the-world-beyond-batch-streaming-102">Streaming 102</a> for an overview). This table will be maintained over time as the model evolves, our understanding grows, and runners are created or features added.</p>
+
+<p>Included below is a summary snapshot of our current understanding of the capabilities of the existing runners (see the <a href="/capability-matrix/">live version</a> for full details, descriptions, and Jira links); since integration is still under way, the system as a whole isn’t yet in a completely stable, usable state. But that should be changing in the near future, and we’ll be updating loud and clear on this blog when the first supported Beam 1.0 release happens.</p>
+
+<p>In the meantime, these tables should help clarify where we expect to be in the very near term, and help guide expectations about what existing runners are capable of, and what features runner implementers will be tackling next.</p>
+
+<script type="text/javascript">
+  function ToggleTables(showDetails, anchor) {
+    document.getElementById("cap-summary").style.display = showDetails ? "none" : "block";
+    document.getElementById("cap-full").style.display = showDetails ? "block" : "none";
+    location.hash = anchor;
+  }
+</script>
+
+<!-- Summary table -->
+
+<div id="cap-blog" style="display:block">
+<table class="cap-summary">
+  
+  <tr class="cap-summary" id="cap-blog-what">
+    <th class="cap-summary color-metadata format-category" colspan="5" style="color:#ca1">What is being computed?</th>
+  </tr>
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability"></th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#ec3">Beam Model</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#ec3">Google Cloud Dataflow</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#ec3">Apache Flink</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#ec3">Apache Spark</th>
+  
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">ParDo</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">GroupByKey</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">Flatten</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">Combine</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">Composite Transforms</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">Side Inputs</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">Source API</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ec3;border-color:#ca1"><b><center>&#x2713;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">Aggregators</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#fe5;border-color:#ca1"><b><center>~</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#ec3">Keyed State</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#ca1"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#ca1"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#ca1"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#ca1"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <td class="cap-summary color-blank cap-blank" colspan="5"></td>
+  </tr>
+  
+  <tr class="cap-summary" id="cap-blog-where">
+    <th class="cap-summary color-metadata format-category" colspan="5" style="color:#37d">Where in event time?</th>
+  </tr>
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability"></th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#59f">Beam Model</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#59f">Google Cloud Dataflow</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#59f">Apache Flink</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#59f">Apache Spark</th>
+  
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#59f">Global windows</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#59f">Fixed windows</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8cf;border-color:#37d"><b><center>~</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#59f">Sliding windows</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#37d"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#59f">Session windows</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#37d"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#59f">Custom windows</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#37d"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#59f">Custom merging windows</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#37d"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#59f">Timestamp control</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#59f;border-color:#37d"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#37d"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <td class="cap-summary color-blank cap-blank" colspan="5"></td>
+  </tr>
+  
+  <tr class="cap-summary" id="cap-blog-when">
+    <th class="cap-summary color-metadata format-category" colspan="5" style="color:#6a4">When in processing time?</th>
+  </tr>
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability"></th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#8c6">Beam Model</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#8c6">Google Cloud Dataflow</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#8c6">Apache Flink</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#8c6">Apache Spark</th>
+  
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">Configurable triggering</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">Event-time triggers</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">Processing-time triggers</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">Count triggers</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">[Meta]data driven triggers</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">Composite triggers</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">Allowed lateness</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#8c6;border-color:#6a4"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#8c6">Timers</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#6a4"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <td class="cap-summary color-blank cap-blank" colspan="5"></td>
+  </tr>
+  
+  <tr class="cap-summary" id="cap-blog-how">
+    <th class="cap-summary color-metadata format-category" colspan="5" style="color:#b55">How do refinements relate?</th>
+  </tr>
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability"></th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#d77">Beam Model</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#d77">Google Cloud Dataflow</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#d77">Apache Flink</th>
+  
+    <th class="cap-summary color-platform format-platform" style="color:#d77">Apache Spark</th>
+  
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#d77">Discarding</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#d77;border-color:#b55"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#d77;border-color:#b55"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#d77;border-color:#b55"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#d77;border-color:#b55"><b><center>&#x2713;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#d77">Accumulating</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#d77;border-color:#b55"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#d77;border-color:#b55"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#d77;border-color:#b55"><b><center>&#x2713;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#b55"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <th class="cap-summary color-capability format-capability" style="color:#d77">Accumulating &amp; Retracting</th>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#b55"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#b55"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#b55"><b><center>&#x2715;</center></b>
+</td>
+    
+    
+
+    <td width="25%" class="cap-summary" style="background-color:#ddd;border-color:#b55"><b><center>&#x2715;</center></b>
+</td>
+    
+  </tr>
+  
+  <tr class="cap-summary">
+    <td class="cap-summary color-blank cap-blank" colspan="5"></td>
+  </tr>
+  
+</table>
+</div>
+
+
+  </div>
+
+</article>
+
+      </div>
+
+
+    <hr>
+  <div class="row">
+      <div class="col-xs-12">
+          <footer>
+              <p class="text-center">&copy; Copyright 2016
+                <a href="http://www.apache.org">The Apache Software Foundation.</a> All Rights Reserved.</p>
+                <p class="text-center"><a href="/privacy_policy">Privacy Policy</a> |
+                <a href="/feed.xml">RSS Feed</a></p>
+          </footer>
+      </div>
+  </div>
+  <!-- container div end -->
+</div>
+
+
+  </body>
+
+</html>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/04e1fbbb/content/blog/index.html
----------------------------------------------------------------------
diff --git a/content/blog/index.html b/content/blog/index.html
index 66fa6fe..bab6e9b 100644
--- a/content/blog/index.html
+++ b/content/blog/index.html
@@ -87,7 +87,7 @@
     <p>This is the blog for the Apache Beam project. This blog contains news and updates
 for the project.</p>
 
-<h3 id="a-classpost-link-hrefbeamcompatibility20160317compatability-matrixhtmlclarifying--formalizing-runner-capabilitiesa"><a class="post-link" href="/beam/compatibility/2016/03/17/compatability-matrix.html">Clarifying &amp; Formalizing Runner Capabilities</a></h3>
+<h3 id="a-classpost-link-hrefbeamcapability20160317capability-matrixhtmlclarifying--formalizing-runner-capabilitiesa"><a class="post-link" href="/beam/capability/2016/03/17/capability-matrix.html">Clarifying &amp; Formalizing Runner Capabilities</a></h3>
 <p><i>Mar 17, 2016 •  Frances Perry [<a href="https://twitter.com/francesjperry">@francesjperry</a>] &amp; Tyler Akidau [<a href="https://twitter.com/takidau">@takidau</a>]
 </i></p>
 


