From: davor@apache.org
To: commits@beam.incubator.apache.org
Reply-To: dev@beam.incubator.apache.org
Date: Wed, 16 Nov 2016 17:36:11 -0000
Message-Id: <41a29c5fa55a4202894940a771420cb8@git.apache.org>
Subject: [1/3] incubator-beam-site git commit: [BEAM-899] Add Flink Instructions to quickstart.md

Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site 3b93d1b56 -> 4e6947c63

[BEAM-899] Add Flink Instructions to quickstart.md

Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/a53e4d88
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/a53e4d88
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/a53e4d88

Branch: refs/heads/asf-site
Commit: a53e4d88cebdcaab45d5d7090f40fd4d986c1151
Parents: 3b93d1b
Author: Aljoscha Krettek
Authored: Sat Nov 12 11:36:49 2016 +0100
Committer: Aljoscha Krettek
Committed: Wed Nov 16 18:08:41 2016 +0100

----------------------------------------------------------------------
 src/get-started/quickstart.md | 73 +++++++++++++++++++++++++++++---------
 1 file changed, 56 insertions(+), 17 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/a53e4d88/src/get-started/quickstart.md
----------------------------------------------------------------------
diff --git a/src/get-started/quickstart.md b/src/get-started/quickstart.md
index 0a956d2..07ace51 100644
--- a/src/get-started/quickstart.md
+++ b/src/get-started/quickstart.md
@@ -7,7 +7,7 @@ redirect_from:
   - /getting-started/
 ---

-# Apache Beam Java SDK Quickstart
+# Apache Beam Java SDK Quickstart

 This Quickstart will walk you through executing your first Beam pipeline to run [WordCount]({{ site.baseurl }}/get-started/wordcount-example), written using Beam's [Java SDK]({{ site.baseurl }}/documentation/sdks/java), on a [runner]({{ site.baseurl }}/documentation#runners) of your choice.

@@ -16,7 +16,7 @@ This Quickstart will walk you through executing your first Beam pipeline to run


 ## Set up your Development Environment
-
+
 1. Download and install the [Java Development Kit (JDK)](http://www.oracle.com/technetwork/java/javase/downloads/index.html) version 1.7 or later. Verify that the [JAVA_HOME](https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/envvars001.html) environment variable is set and points to your JDK installation.
 1. Download and install [Apache Maven](http://maven.apache.org/download.cgi) by following Maven's [installation guide](http://maven.apache.org/install.html) for your specific operating system.

@@ -24,7 +24,7 @@ This Quickstart will walk you through executing your first Beam pipeline to run

 ## Get the WordCount Code

-The easiest way to get a copy of the WordCount pipeline is to use the following command to generate a simple Maven project that contains Beam's WordCount examples and builds against the most recent Beam release:
+The easiest way to get a copy of the WordCount pipeline is to use the following command to generate a simple Maven project that contains Beam's WordCount examples and builds against the most recent Beam release:

 ```
 $ mvn archetype:generate \
@@ -38,7 +38,7 @@ $ mvn archetype:generate \
       -Dpackage=org.apache.beam.examples
 ```

-This will create a directory `word-count-beam` that contains a simple `pom.xml` and a series of example pipelines that count words in text files.
+This will create a directory `word-count-beam` that contains a simple `pom.xml` and a series of example pipelines that count words in text files.

 ```
 $ cd beam-word-count/
@@ -63,7 +63,7 @@ After you've chosen which runner you'd like to use:
 1. Ensure you've done any runner-specific setup.
 1. Build your commandline by:
     1. Specifying a specific runner with `--runner=` (defaults to the [DirectRunner]({{ site.baseurl }}/documentation/runners/direct))
-    1. Adding any runner-specific required options
+    1. Adding any runner-specific required options
     1. Choosing input files and an output location are accessible on the chosen runner. (For example, you can't access a local file if you are running the pipeline on an external cluster.)
 1. Run your first WordCount pipeline.

@@ -74,14 +74,27 @@ $ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
 ```

 {:.runner-apex}
-```
+```
 $ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
      -Dexec.args="--inputFile=pom.xml --output=counts --runner=ApexRunner" -Papex-runner
 ```

-{:.runner-flink}
-```
-TODO BEAM-899
+{:.runner-flink-local}
+```
+$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
+     -Dexec.args="--runner=FlinkRunner --inputFile=pom.xml --output=counts" -Pflink-runner
+```
+
+{:.runner-flink-cluster}
+```
+$ mvn package -Pflink-runner
+$ cp target/word-count-beam-bundled-0.1.jar /path/to/flink/lib/
+$ bin/flink run -c org.apache.beam.examples.WordCount lib/word-count-beam-0.1.jar \
+    --inputFile=/path/to/quickstart/pom.xml \
+    --output=/tmp/counts \
+    --runner=org.apache.beam.runners.flink.FlinkRunner
+
+You can monitor the running job by visiting the Flink dashboard at http://:8081
 ```

 {:.runner-spark}
@@ -111,9 +124,14 @@ $ ls counts*
 $ ls counts*
 ```

-{:.runner-flink}
-```
-TODO BEAM-899
+{:.runner-flink-local}
+```
+$ ls counts*
+```
+
+{:.runner-flink-cluster}
+```
+$ ls /tmp/counts*
 ```

 {:.runner-spark}
@@ -126,7 +144,7 @@ TODO BEAM-900
 ```
 $ gsutil ls gs:///counts*
 ```
-
+
 When you look into the contents of the file, you'll see that they contain unique words and the number of occurrences of each word. The order of elements within the file may differ because the Beam model does not generally guarantee ordering, again to allow runners to optimize for efficiency.

 {:.runner-direct}
@@ -153,9 +171,30 @@ PAssert: 1
 ...
 ```

-{:.runner-flink}
-```
-TODO BEAM-899
+{:.runner-flink-local}
+```
+$ more counts*
+The: 1
+api: 9
+old: 4
+Apache: 2
+limitations: 1
+bundled: 1
+Foundation: 1
+...
+```
+
+{:.runner-flink-cluster}
+```
+$ more /tmp/counts*
+The: 1
+api: 9
+old: 4
+Apache: 2
+limitations: 1
+bundled: 1
+Foundation: 1
+...
 ```

 {:.runner-spark}
@@ -184,4 +223,4 @@ barrenly: 1
 * Join the Beam [users@]({{ site.baseurl }}/get-started/support#mailing-lists) mailing list.

 Please don't hesitate to [reach out]({{ site.baseurl }}/get-started/support) if you encounter any issues!
-
+
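
Because the quickstart text in this diff notes that element order within the output is not guaranteed, comparing a local Flink run against a cluster run is easier once the output is sorted. A minimal sketch, assuming output shards that match `counts*` and hold `word: N` lines as in the listings above; the shard file names and contents created here are made up for illustration and are not real WordCount output:

```shell
# Hypothetical sample shards standing in for real WordCount output files.
workdir=$(mktemp -d)
printf 'The: 1\napi: 9\n' > "$workdir/counts-00000-of-00002"
printf 'old: 4\nApache: 2\n' > "$workdir/counts-00001-of-00002"

# Merge every shard, then sort by count (field 2, numeric, descending)
# with the word itself as a tie-breaker, so runs can be compared
# regardless of the order in which the runner wrote its results.
sorted=$(cat "$workdir"/counts* | sort -t: -k2,2nr -k1,1)
printf '%s\n' "$sorted"
```

For the cluster variant, the same pipeline would read from `/tmp/counts*` instead; words that themselves contain a colon would need a different field separator.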