Date: Fri, 19 Feb 2016 15:34:49 +0700
Subject: Optimal Configuration for Cluster
From: Welly Tambunan
To: user@flink.apache.org

Hi All,

We are trying to run our job on a cluster with the following setup:

1. # of machines: 16
2. memory: 128 GB
3. # of cores: 48

However, when we try to run the job we get an exception:

"insufficient number of network buffers. 48 required but only 10 available. the total number of network buffers is currently set to 2048"

After looking at the documentation, we set the configuration based on the docs:

taskmanager.network.numberOfBuffers: # core ^ 2 * # machine * 4

However, we then hit another error from the JVM:

java.io.IOException: Cannot allocate network buffer pool: Could not allocate enough memory segments for NetworkBufferPool (required (Mb): 2304, allocated (Mb): 698, missing (Mb): 1606). Cause: Java heap space

We then set taskmanager.heap.mb: 4096

Finally the cluster is running.

However, I'm still not sure about this configuration, or whether fiddling with the task manager heap is really the right way to fine-tune it. So my questions are:

1. Am I doing it right for numberOfBuffers?
2. How much should we allocate for taskmanager.heap.mb given the information above?
3. Any suggestions on which configuration options we need to set to make it optimal for the cluster?
4. Is there any chance that this will be resolved automatically by the memory/network buffer manager?

Thanks a lot for the help.

Cheers

--
Welly Tambunan
Triplelands
http://weltam.wordpress.com
http://www.triplelands.com
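For reference, the buffer formula quoted in the message can be worked through numerically. This is only a rough sketch under stated assumptions: it treats "# core" as the number of slots per TaskManager and "# machine" as the number of TaskManagers, and it assumes a network buffer (memory segment) size of 32 KiB, which the message does not state. The function names are hypothetical helpers, not part of Flink.

```python
# Rough sizing sketch for the formula quoted in the message:
#   numberOfBuffers = (# core)^2 * (# machine) * 4
# Assumptions (not from the message): one TaskManager per machine,
# one slot per core, and 32 KiB per network buffer.

ASSUMED_SEGMENT_SIZE_BYTES = 32 * 1024  # assumption, check your own config


def network_buffers(slots_per_taskmanager: int, taskmanagers: int, factor: int = 4) -> int:
    """Number of buffers according to the quoted formula."""
    return slots_per_taskmanager ** 2 * taskmanagers * factor


def buffer_pool_mib(num_buffers: int, segment_size_bytes: int = ASSUMED_SEGMENT_SIZE_BYTES) -> float:
    """Memory needed to hold num_buffers buffers, in MiB."""
    return num_buffers * segment_size_bytes / (1024 * 1024)


buffers = network_buffers(slots_per_taskmanager=48, taskmanagers=16)
print(buffers)                   # 147456
print(buffer_pool_mib(buffers))  # 4608.0
```

Under the same 32 KiB assumption, the 2304 MB reported in the error message would correspond to 73728 buffers. Either way, the buffer pool has to fit inside the TaskManager heap alongside everything else, so the heap setting and the buffer count need to be chosen together.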