beam-commits mailing list archives

Subject [1/3] incubator-beam-site git commit: [BEAM-505] Fill in the documentation/runners/direct portion of the website
Date Tue, 15 Nov 2016 00:35:29 GMT
Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site 6ab73c79a -> a82a0f3bb

[BEAM-505] Fill in the documentation/runners/direct portion of the website


Branch: refs/heads/asf-site
Commit: fe87fb807a310fe8e68298d4fe6fde86d7c65522
Parents: 6ab73c7
Author: melissa <>
Authored: Fri Nov 11 10:44:13 2016 -0800
Committer: Davor Bonaci <>
Committed: Mon Nov 14 16:34:51 2016 -0800

 src/documentation/runners/ | 40 ++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)
diff --git a/src/documentation/runners/ b/src/documentation/runners/
index 094d44e..1d7470d 100644
--- a/src/documentation/runners/
+++ b/src/documentation/runners/
@@ -1,9 +1,45 @@
 layout: default
-title: "Apache Direct Runner"
+title: "Direct Runner"
 permalink: /documentation/runners/direct/
 redirect_from: /learn/runners/direct/
 # Using the Direct Runner
-This page is under construction ([BEAM-505](
+The Direct Runner executes pipelines on your machine and is designed to validate that pipelines
adhere to the Apache Beam model as closely as possible. Instead of focusing on efficient pipeline
execution, the Direct Runner performs additional checks to ensure that users do not rely on
semantics that are not guaranteed by the model. Some of these checks include:
+* enforcing immutability of elements
+* enforcing encodability of elements
+* processing elements in an arbitrary order at all points
+* serialization of user functions (`DoFn`, `CombineFn`, etc.)
+Using the Direct Runner for testing and development helps ensure that pipelines are robust
across different Beam runners. In addition, debugging failed runs can be a non-trivial task
when a pipeline executes on a remote cluster. Instead, it is often faster and simpler to perform
local unit testing on your pipeline code. Unit testing your pipeline locally also allows you
to use your preferred local debugging tools.
+Here are some resources with information about how to test your pipelines.
+* [Testing Unbounded Pipelines in Apache Beam]({{ site.baseurl }}/blog/2016/10/20/test-stream.html)
talks about the use of Java classes [`PAssert`]({{ site.baseurl }}/documentation/sdks/javadoc/{{
site.release_latest }}/index.html?org/apache/beam/sdk/testing/PAssert.html) and [`TestStream`]({{
site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/testing/TestStream.html)
to test your pipelines.
+* The [Apache Beam WordCount Example]({{ site.baseurl }}/get-started/wordcount-example/)
contains an example of logging and testing a pipeline with [`PAssert`]({{ site.baseurl }}/documentation/sdks/javadoc/{{
site.release_latest }}/index.html?org/apache/beam/sdk/testing/PAssert.html).
+## Direct Runner prerequisites and setup
+You must specify your dependency on the Direct Runner.
+<dependency>
+   <groupId>org.apache.beam</groupId>
+   <artifactId>beam-runners-direct-java</artifactId>
+   <version>0.3.0-incubating</version>
+   <scope>runtime</scope>
+</dependency>
+## Pipeline options for the Direct Runner
+When executing your pipeline from the command line, set `runner` to `direct`. The default
values for the other pipeline options are generally sufficient.
+See the reference documentation for the <span class="language-java">[`DirectOptions`]({{
site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/runners/direct/DirectOptions.html)</span>
interface (and its subinterfaces) for defaults and the complete list of pipeline configuration options.
+## Additional information and caveats
+Local execution is limited by the memory available in your local environment. It is highly
recommended that you run your pipeline with data sets small enough to fit in local memory.
You can create a small in-memory data set using a <span class="language-java">[`Create`]({{
site.baseurl }}/documentation/sdks/javadoc/{{ site.release_latest }}/index.html?org/apache/beam/sdk/transforms/Create.html)</span>
transform, or you can use a <span class="language-java">[`Read`]({{ site.baseurl }}/documentation/sdks/javadoc/{{
site.release_latest }}/index.html?org/apache/beam/sdk/io/Read.html)</span><span class="language-python">[`Read`](</span>
transform to work with small local or remote files.
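
The page added in this diff says the Direct Runner enforces immutability and encodability of elements. The sketch below is one way to picture such a check using only the Java standard library: encode an element, run the user function, encode again, and compare the bytes. It is not the Direct Runner's implementation. The class `MutationCheckSketch` and its methods are hypothetical names, and Java serialization stands in for a Beam coder as an assumption of this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.function.Consumer;

/** Hypothetical illustration only; not part of the Beam SDK. */
class MutationCheckSketch {

  /** Encodes a value with Java serialization (an assumed stand-in for a Beam coder). */
  static byte[] encode(Serializable value) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    } catch (IOException e) {
      // A value that cannot be encoded fails here, loosely mirroring an encodability check.
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Returns true if running userFn changes the encoded form of the element. */
  static <T extends Serializable> boolean mutatesInput(T element, Consumer<T> userFn) {
    byte[] before = encode(element);
    userFn.accept(element);
    byte[] after = encode(element);
    return !Arrays.equals(before, after);
  }

  public static void main(String[] args) {
    ArrayList<String> words = new ArrayList<>(Arrays.asList("hi", "there"));
    // A function that only reads its input is not flagged.
    System.out.println(mutatesInput(words, w -> w.size()));
    // A function that modifies its input in place is flagged.
    System.out.println(mutatesInput(words, w -> w.add("oops")));
  }
}
```

Run as written, `main` prints `false` and then `true`. Comparing encoded bytes is a deliberate simplification: it only detects changes that are visible in the encoded form of the element.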
