Return-Path: X-Original-To: apmail-flink-commits-archive@minotaur.apache.org Delivered-To: apmail-flink-commits-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D0A6A198D7 for ; Tue, 26 Apr 2016 14:59:42 +0000 (UTC) Received: (qmail 13148 invoked by uid 500); 26 Apr 2016 14:59:37 -0000 Delivered-To: apmail-flink-commits-archive@flink.apache.org Received: (qmail 13109 invoked by uid 500); 26 Apr 2016 14:59:37 -0000 Mailing-List: contact commits-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@flink.apache.org Delivered-To: mailing list commits@flink.apache.org Received: (qmail 13100 invoked by uid 99); 26 Apr 2016 14:59:37 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 26 Apr 2016 14:59:37 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id B4AECDFE5F; Tue, 26 Apr 2016 14:59:36 +0000 (UTC) Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: rmetzger@apache.org To: commits@flink.apache.org Message-Id: X-Mailer: ASF-Git Admin Mailer Subject: flink git commit: [FLINK-3804][docs] Fix YARN and Kafka docs Date: Tue, 26 Apr 2016 14:59:36 +0000 (UTC) Repository: flink Updated Branches: refs/heads/master ac2137cfa -> e293e68c9 [FLINK-3804][docs] Fix YARN and Kafka docs Project: http://git-wip-us.apache.org/repos/asf/flink/repo Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/e293e68c Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/e293e68c Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/e293e68c Branch: refs/heads/master Commit: e293e68c97647e2ad970585438d307dc9b004101 Parents: ac2137c Author: Robert Metzger Authored: Mon Apr 25 11:13:16 2016 +0200 Committer: Robert Metzger Committed: Tue Apr 26 16:59:15 2016 +0200 ---------------------------------------------------------------------- docs/apis/streaming/connectors/kafka.md | 7 ++++++- docs/setup/yarn_setup.md | 8 ++++++-- 2 files changed, 12 insertions(+), 3 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/flink/blob/e293e68c/docs/apis/streaming/connectors/kafka.md ---------------------------------------------------------------------- diff --git a/docs/apis/streaming/connectors/kafka.md b/docs/apis/streaming/connectors/kafka.md index bc8c727..da3f86c 100644 --- a/docs/apis/streaming/connectors/kafka.md +++ b/docs/apis/streaming/connectors/kafka.md @@ -95,7 +95,7 @@ Note that the streaming connectors are currently not part of the binary distribu #### Kafka Consumer -Flink's Kafka consumer is called `FlinkKafkaConsumer08` (or `09`). It provides access to one or more Kafka topics. +Flink's Kafka consumer is called `FlinkKafkaConsumer08` (or `09` for Kafka 0.9.0.x versions). It provides access to one or more Kafka topics. The constructor accepts the following arguments: @@ -136,6 +136,11 @@ stream = env +The current FlinkKafkaConsumer implementation will establish a connection from the client (when calling the constructor) +for querying the list of topics and partitions. + +For this to work, the consumer needs to be able to access the consumers from the machine submitting the job to the Flink cluster. +If you experience any issues with the Kafka consumer on the client side, the client log might contain information about failed requests, etc. ##### The `DeserializationSchema` http://git-wip-us.apache.org/repos/asf/flink/blob/e293e68c/docs/setup/yarn_setup.md ---------------------------------------------------------------------- diff --git a/docs/setup/yarn_setup.md b/docs/setup/yarn_setup.md index c46a6f6..f24f451 100644 --- a/docs/setup/yarn_setup.md +++ b/docs/setup/yarn_setup.md @@ -123,12 +123,16 @@ Flink on YARN will overwrite the following configuration parameters `jobmanager. If you don't want to change the configuration file to set configuration parameters, there is the option to pass dynamic properties via the `-D` flag. So you can pass parameters this way: `-Dfs.overwrite-files=true -Dtaskmanager.network.numberOfBuffers=16368`. -The example invocation starts 11 containers, since there is one additional container for the ApplicationMaster and Job Manager. +The example invocation starts 11 containers (even though only 10 containers were requested), since there is one additional container for the ApplicationMaster and Job Manager. Once Flink is deployed in your YARN cluster, it will show you the connection details of the Job Manager. Stop the YARN session by stopping the unix process (using CTRL+C) or by entering 'stop' into the client. +Flink on YARN will only start all requested containers if enough resources are available on the cluster. Most YARN schedulers account for the requested memory of the containers, +some account also for the number of vcores. By default, the number of vcores is equal to the processing slots (`-s`) argument. The `yarn.containers.vcores` allows overwriting the +number of vcores with a custom value. + #### Detached YARN Session If you do not want to keep the Flink YARN client running all the time, it's also possible to start a *detached* YARN session. @@ -288,6 +292,6 @@ When starting a new Flink YARN session, the client first checks if the requested The next step of the client is to request (step 2) a YARN container to start the *ApplicationMaster* (step 3). Since the client registered the configuration and jar-file as a resource for the container, the NodeManager of YARN running on that particular machine will take care of preparing the container (e.g. downloading the files). Once that has finished, the *ApplicationMaster* (AM) is started. -The *JobManager* and AM are running in the same container. Once they successfully started, the AM knows the address of the JobManager (its own host). It is generating a new Flink configuration file for the TaskManagers (so that they can connect to the JobManager). The file is also uploaded to HDFS. Additionally, the *AM* container is also serving Flink's web interface. The ports Flink is using for its services are the standard ports configured by the user + the application id as an offset. This allows users to execute multiple Flink YARN sessions in parallel. +The *JobManager* and AM are running in the same container. Once they successfully started, the AM knows the address of the JobManager (its own host). It is generating a new Flink configuration file for the TaskManagers (so that they can connect to the JobManager). The file is also uploaded to HDFS. Additionally, the *AM* container is also serving Flink's web interface. All ports the YARN code is allocating are *ephemeral ports*. This allows users to execute multiple Flink YARN sessions in parallel. After that, the AM starts allocating the containers for Flink's TaskManagers, which will download the jar file and the modified configuration from the HDFS. Once these steps are completed, Flink is set up and ready to accept Jobs.