From: "Torok, David"
To: "user@flink.apache.org"
Subject: Reference configs for HA / RocksDB / YARN / Zookeeper / HDFS
Date: Fri, 10 Mar 2017 14:38:13 +0000

Hi,

 

Forgive me if parts of this question have been answered before, but I'd like help resolving some confusion from the documentation; I also haven't been able to find a good example anywhere of an enterprise-style setup. If anyone has a sample HA / YARN / ZK / RocksDB configuration, could you share it?

 

We are currently using Flink 1.2.0 and Hortonworks (an older version, HDP 2.2.9, based on Hadoop 2.6.0). We're trying out a small sample cluster with 9 YARN client nodes.

 

1.  We have large state and large time windows, and therefore want to use RocksDB as our state backend. Is it typical or best practice for RocksDB to write its working state to local disk for speed, with checkpoints going to HDFS for recovery / HA? Or does everything live in HDFS? From my understanding of the docs, "The RocksDBStateBackend holds in-flight data in a RocksDB database that is (per default) stored in the TaskManager data directories"... (is this set automatically via YARN?). The checkpoint directory is then set via "state.backend.fs.checkpointdir: hdfs://namenode:40010/flink/checkpoints" or dynamically, e.g. new RocksDBStateBackend(statepath). A sketch of what I have in mind follows this question.

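To make question 1 concrete, here is roughly what I was planning to try. This is only a sketch of my current understanding, not a tested setup: the HDFS URI and the local RocksDB path are placeholders for our cluster.

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class StateBackendSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Checkpoint data (the recovery / HA copy) goes to HDFS.
            RocksDBStateBackend backend =
                    new RocksDBStateBackend("hdfs://namenode:40010/flink/checkpoints");

            // Working state stays on fast local disk. My understanding is that
            // if this is left unset, RocksDB falls back to the TaskManager data
            // directories, which YARN points at its local dirs.
            backend.setDbStoragePath("/data/flink/rocksdb");

            env.setStateBackend(backend);
            env.enableCheckpointing(60000); // checkpoint every 60 seconds

            // ... job definition and env.execute(...) would go here ...
        }
    }

If that split (local working state, HDFS checkpoints) is the intended usage, then my question reduces to whether the local path needs to be set at all under YARN.
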
2.  It's unclear to me whether YARN automatically provides Flink with the ZooKeeper information, or whether I also need to set the ZooKeeper details in flink-conf.yaml... the examples seem to imply that the ZK settings are only used if you start your own ZooKeeper rather than point at an existing one. Do I need to set this up explicitly for HA on YARN? My current guess at the config is sketched below.

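For question 2, my guess from reading the 1.2 HA documentation is that flink-conf.yaml would need something like the following (hosts and paths are made up, and I'm not sure yarn.application-attempts is strictly required):

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
    high-availability.zookeeper.storageDir: hdfs:///flink/recovery
    high-availability.zookeeper.path.root: /flink
    yarn.application-attempts: 10

What I can't tell is whether any of this gets filled in automatically when deploying on YARN.
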
3.  I've seen some conflicting information about including HADOOP_CLASSPATH: some say it causes many conflicts with Flink's own libraries, whereas others say it's important for resolving various deserialization errors at runtime. The approach I've seen suggested is shown below.

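For question 3, the pattern I've seen suggested (and would try first) is to let Hadoop report its own classpath rather than hand-listing jars:

    export HADOOP_CLASSPATH=`hadoop classpath`

but I don't know whether that avoids or aggravates the library-conflict problem people describe.
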
4.  Someone suggested that we build Flink from source ourselves against the Hortonworks distribution; I'm really hoping that's not necessary (a guess at what that would involve is below).

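If building from source does turn out to be necessary, my reading of the build documentation is that it would look roughly like this, with our actual HDP Hadoop version string substituted in (the version below is a placeholder):

    mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.6.0.<hdp-build-version>

I'd still much prefer to avoid this if the stock binaries work.
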
 

Appreciate any info as we learn how to productionize our Flink clusters!

 

Best Regards

Dave
