beam-commits mailing list archives

From da...@apache.org
Subject [1/3] incubator-beam-site git commit: BEAM-845 Update navigation and runner capability matrix to include Apex.
Date Mon, 07 Nov 2016 18:21:04 GMT
Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site 2473d849a -> a94ad4021


BEAM-845 Update navigation and runner capability matrix to include Apex.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/c54a9dfb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/c54a9dfb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/c54a9dfb

Branch: refs/heads/asf-site
Commit: c54a9dfb6abcf864528f174b73e5c574b130b5f0
Parents: 2473d84
Author: Thomas Weise <thw@apache.org>
Authored: Sun Nov 6 11:48:47 2016 -0800
Committer: Davor Bonaci <davor@google.com>
Committed: Mon Nov 7 10:19:41 2016 -0800

----------------------------------------------------------------------
 src/_data/capability-matrix.yml                | 116 +++++++++++++++++++-
 src/_includes/header.html                      |   1 +
 src/documentation/index.md                     |   2 +-
 src/documentation/runners/apex.md              |   9 ++
 src/documentation/runners/capability-matrix.md |   2 +-
 5 files changed, 122 insertions(+), 8 deletions(-)
----------------------------------------------------------------------
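The bulk of this change extends `src/_data/capability-matrix.yml` with a new `apex` column entry under every capability row. As an illustration only (the dictionaries below are a hypothetical Python mirror of the YAML structure, not part of the commit), a short sketch of the kind of consistency check this pattern invites, i.e. that every capability row carries a value for each declared runner column:

```python
# Hypothetical mirror of the src/_data/capability-matrix.yml structure.
columns = [
    {"class": "model", "name": "Beam Model"},
    {"class": "dataflow", "name": "Google Cloud Dataflow"},
    {"class": "flink", "name": "Apache Flink"},
    {"class": "spark", "name": "Apache Spark"},
    {"class": "apex", "name": "Apache Apex (on feature branch)"},
]

categories = [
    {
        "description": "What is being computed?",
        "rows": [
            {"name": "ParDo",
             "values": [{"class": c["class"], "l1": "Yes"} for c in columns]},
            {"name": "GroupByKey",
             "values": [{"class": c["class"], "l1": "Yes"} for c in columns]},
        ],
    },
]

def missing_entries(columns, categories):
    """Return (capability, runner) pairs that lack a value entry."""
    declared = {c["class"] for c in columns}
    missing = []
    for cat in categories:
        for row in cat["rows"]:
            present = {v["class"] for v in row["values"]}
            for runner in sorted(declared - present):
                missing.append((row["name"], runner))
    return missing

print(missing_entries(columns, categories))  # prints []
```

A check like this would flag a runner column (such as the newly added `apex`) that is declared under `columns` but missing from any capability's `values` list.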


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/_data/capability-matrix.yml
----------------------------------------------------------------------
diff --git a/src/_data/capability-matrix.yml b/src/_data/capability-matrix.yml
index c61b68b..375fdf4 100644
--- a/src/_data/capability-matrix.yml
+++ b/src/_data/capability-matrix.yml
@@ -7,6 +7,8 @@ columns:
     name: Apache Flink
   - class: spark
     name: Apache Spark
+  - class: apex
+    name: Apache Apex (on feature branch)
 
 categories:
   - description: What is being computed?
@@ -34,6 +36,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ParDo applies per-element transformations as Spark FlatMapFunction.
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: Supported through Apex operator that wraps the function and processes data as single element bundles.
       - name: GroupByKey
         values:
           - class: model
@@ -52,6 +58,10 @@ categories:
             l1: 'Partially'
             l2: fully supported in batch mode
             l3: "Using Spark's <tt>groupByKey</tt>. GroupByKey with multiple trigger firings in streaming mode is a work in progress."
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: "Apex runner uses the Beam code for grouping by window and thereby has support for all windowing and triggering mechanisms. Runner does not implement partitioning yet (BEAM-838)"
       - name: Flatten
         values:
           - class: model
@@ -70,7 +80,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: ''
-
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
       - name: Combine
         values:
           - class: model
@@ -89,7 +102,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: "Using Spark's <tt>combineByKey</tt> and <tt>aggregate</tt> functions."
-
+          - class: apex
+            l1: 'Yes'
+            l2: 'fully supported'
+            l3: "Default Beam translation. Currently no efficient pre-aggregation (BEAM-935)."
       - name: Composite Transforms
         values:
           - class: model
@@ -108,7 +124,10 @@ categories:
             l1: 'Partially'
             l2: supported via inlining
             l3: ''
-
+          - class: apex
+            l1: 'Partially'
+            l2: supported via inlining
+            l3: ''
       - name: Side Inputs
         values:
           - class: model
@@ -127,7 +146,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: "Using Spark's broadcast variables. In streaming mode, side inputs may update but only between micro-batches."
-
+          - class: apex
+            l1: 'Yes'
+            l2: size restrictions
+            l3: No distributed implementation and therefore size restrictions. 
       - name: Source API
         values:
           - class: model
@@ -146,6 +168,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: 
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: 
 
       - name: Aggregators
         values:
@@ -165,6 +191,10 @@ categories:
             l1: 'Partially'
             l2: may overcount when tasks are retried in transformations.
             l3: 'supported via <tt>AccumulatorParam</tt> mechanism. If a task retries, and the accumulator is not within a Spark "Action", an overcount is possible.'
+          - class: apex
+            l1: 'No'
+            l2: Not implemented in runner.
+            l3: 
 
       - name: Keyed State
         values:
@@ -185,7 +215,10 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: Spark supports keyed state with mapWithState() so support shuold be straight forward.
-
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3: Apex supports keyed state, so adding support for this should be easy once the Beam model exposes it.
 
   - description: Where in event time?
     anchor: where
@@ -212,6 +245,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Fixed windows
         values:
@@ -231,6 +268,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Sliding windows
         values:
@@ -250,6 +291,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Session windows
         values:
@@ -269,6 +314,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Custom windows
         values:
@@ -288,6 +337,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Custom merging windows
         values:
@@ -307,6 +360,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
       - name: Timestamp control
         values:
@@ -326,6 +383,10 @@ categories:
             l1: 'Yes'
             l2: supported
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: supported
+            l3: ''
 
 
 
@@ -355,6 +416,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Event-time triggers
         values:
@@ -374,6 +439,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Processing-time triggers
         values:
@@ -393,6 +462,10 @@ categories:
             l1: 'Yes'
             l2: "This is Spark streaming's native model"
             l3: "Spark processes streams in micro-batches. The micro-batch size is actually a pre-set, fixed, time interval. Currently, the runner takes the first window size in the pipeline and sets it's size as the batch interval. Any following window operations will be considered processing time windows and will affect triggering."
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Count triggers
         values:
@@ -412,6 +485,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: '[Meta]data driven triggers'
         values:
@@ -432,6 +509,10 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: 
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3: 
 
       - name: Composite triggers
         values:
@@ -451,6 +532,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Allowed lateness
         values:
@@ -470,6 +555,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Timers
         values:
@@ -490,7 +579,10 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: ''
-
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3: ''
 
   - description: How do refinements relate?
     anchor: how
@@ -518,6 +610,10 @@ categories:
             l1: 'Yes'
             l2: fully supported
             l3: 'Spark streaming natively discards elements after firing.'
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: ''
 
       - name: Accumulating
         values:
@@ -537,6 +633,10 @@ categories:
             l1: 'No'
             l2: ''
             l3: ''
+          - class: apex
+            l1: 'Yes'
+            l2: fully supported
+            l3: 'Size restriction, see combine support.'
 
       - name: 'Accumulating &amp; Retracting'
         values:
@@ -557,3 +657,7 @@ categories:
             l1: 'No'
             l2: pending model support
             l3: ''
+          - class: apex
+            l1: 'No'
+            l2: pending model support
+            l3: ''
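For context, each `values` list in the YAML above becomes one cell per runner in the rendered matrix, combining the `l1` summary and `l2` detail. A rough, hypothetical Python sketch of that flattening (the real site renders these entries with Jekyll/Liquid templates, not this code):

```python
def render_cell(value):
    """Flatten one runner entry (class, l1 summary, l2 detail) into display text."""
    text = "{}: {}".format(value["class"], value["l1"])
    # l2 is optional detail; omit the parenthetical when it is empty.
    if value.get("l2"):
        text += " ({})".format(value["l2"])
    return text

# Hypothetical entry mirroring the apex Combine row added in this diff.
print(render_cell({"class": "apex", "l1": "Yes", "l2": "fully supported"}))
# prints apex: Yes (fully supported)
```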

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/_includes/header.html
----------------------------------------------------------------------
diff --git a/src/_includes/header.html b/src/_includes/header.html
index f70bcee..e39e9d1 100644
--- a/src/_includes/header.html
+++ b/src/_includes/header.html
@@ -50,6 +50,7 @@
 			  <li class="dropdown-header">Runners</li>
 			  <li><a href="{{ site.baseurl }}/documentation/runners/capability-matrix/">Capability Matrix</a></li>
 			  <li><a href="{{ site.baseurl }}/documentation/runners/direct/">Direct Runner</a></li>
+			  <li><a href="{{ site.baseurl }}/documentation/runners/apex/">Apache Apex Runner</a></li>
 			  <li><a href="{{ site.baseurl }}/documentation/runners/flink/">Apache Flink Runner</a></li>
 			  <li><a href="{{ site.baseurl }}/documentation/runners/spark/">Apache Spark Runner</a></li>
 			  <li><a href="{{ site.baseurl }}/documentation/runners/dataflow/">Cloud Dataflow Runner</a></li>

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/documentation/index.md
----------------------------------------------------------------------
diff --git a/src/documentation/index.md b/src/documentation/index.md
index 2ed18f3..c4bbd83 100644
--- a/src/documentation/index.md
+++ b/src/documentation/index.md
@@ -32,10 +32,10 @@ A Beam Runner runs a Beam pipeline on a specific (often distributed) data proces
 ### Available Runners
 
 * [DirectRunner]({{ site.baseurl }}/documentation/runners/direct/): Runs locally on your machine -- great for developing, testing, and debugging.
+* [ApexRunner]({{ site.baseurl }}/documentation/runners/apex/): Runs on [Apache Apex](http://apex.apache.org).
 * [FlinkRunner]({{ site.baseurl }}/documentation/runners/flink/): Runs on [Apache Flink](http://flink.apache.org).
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on [Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed service within [Google Cloud Platform](https://cloud.google.com/).
-* _[Under Development]_ [ApexRunner]({{ site.baseurl }}/contribute/work-in-progress/#feature-branches): Runs on [Apache Apex](http://apex.apache.org).
 * _[Under Development]_ [GearpumpRunner]({{ site.baseurl }}/contribute/work-in-progress/#feature-branches): Runs on [Apache Gearpump (incubating)](http://gearpump.apache.org).
 
 ### Choosing a Runner

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/documentation/runners/apex.md
----------------------------------------------------------------------
diff --git a/src/documentation/runners/apex.md b/src/documentation/runners/apex.md
new file mode 100644
index 0000000..408e6de
--- /dev/null
+++ b/src/documentation/runners/apex.md
@@ -0,0 +1,9 @@
+---
+layout: default
+title: "Apache Apex Runner"
+permalink: /documentation/runners/apex/
+---
+# Using the Apache Apex Runner
+
+This page is under construction ([BEAM-825](https://issues.apache.org/jira/browse/BEAM-825)). The runner is on a feature branch.
+

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/c54a9dfb/src/documentation/runners/capability-matrix.md
----------------------------------------------------------------------
diff --git a/src/documentation/runners/capability-matrix.md b/src/documentation/runners/capability-matrix.md
index 22c602c..bfb8cc1 100644
--- a/src/documentation/runners/capability-matrix.md
+++ b/src/documentation/runners/capability-matrix.md
@@ -8,7 +8,7 @@ redirect_from:
 ---
 
 # Beam Capability Matrix
-Apache Beam (incubating) provides a portable API layer for building sophisticated data-parallel processing engines that may be executed across a diversity of exeuction engines, or <i>runners</i>. The core concepts of this layer are based upon the Beam Model (formerly referred to as the [Dataflow Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), and implemented to varying degrees in each Beam runner. To help clarify the capabilities of individual runners, we've created the capability matrix below.
+Apache Beam (incubating) provides a portable API layer for building sophisticated data-parallel processing pipelines that may be executed across a diversity of execution engines, or <i>runners</i>. The core concepts of this layer are based upon the Beam Model (formerly referred to as the [Dataflow Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), and implemented to varying degrees in each Beam runner. To help clarify the capabilities of individual runners, we've created the capability matrix below.
 
 Individual capabilities have been grouped by their corresponding <span class="wwwh-what-dark">What</span> / <span class="wwwh-where-dark">Where</span> / <span class="wwwh-when-dark">When</span> / <span class="wwwh-how-dark">How</span> question:
 

