Date: Wed, 3 May 2017 10:19:04 +0000 (UTC)
From: "Chesnay Schepler (JIRA)"
To: issues@flink.apache.org
Subject: [jira] [Assigned] (FLINK-4545) Flink automatically manages TM network buffer

     [ https://issues.apache.org/jira/browse/FLINK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chesnay Schepler reassigned FLINK-4545:
---------------------------------------

    Assignee: Nico Kruber

> Flink automatically manages TM network buffer
> ---------------------------------------------
>
>                 Key: FLINK-4545
>                 URL: https://issues.apache.org/jira/browse/FLINK-4545
>             Project: Flink
>          Issue Type: Wish
>          Components: Network
>            Reporter: Zhenzhong Xu
>            Assignee: Nico Kruber
>            Priority: Blocker
>             Fix For: 1.3.0
>
>
> Currently, the number of network buffers per task manager is preconfigured, and the memory is pre-allocated through the taskmanager.network.numberOfBuffers config. In a job DAG with a shuffle phase, this number can grow very large depending on the TM cluster size. The formula for calculating the buffer count is documented here (https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers).
> #slots-per-TM^2 * #TMs * 4
> In a standalone deployment, we may need to control the task manager cluster size dynamically and then leverage the upcoming Flink feature that supports rescaling job parallelism at runtime.
> If the buffer count config is static at runtime and cannot be changed without restarting the task manager process, this may add latency and complexity to the scaling process. I am wondering whether there has already been any discussion about having Flink manage the network buffers automatically, or at least expose some API that allows them to be reconfigured. Let me know if there is an existing JIRA that I should follow.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
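
To make the concern in the issue concrete, here is a minimal sketch that evaluates the quoted formula (#slots-per-TM^2 * #TMs * 4) for a few cluster sizes. The function name, the slot count, and the cluster sizes are hypothetical and chosen only for illustration; only the formula and the taskmanager.network.numberOfBuffers key come from the issue text.

```python
def required_network_buffers(slots_per_tm: int, num_tms: int, factor: int = 4) -> int:
    """Evaluate the formula quoted in the issue: slots-per-TM^2 * TMs * 4."""
    if slots_per_tm < 1 or num_tms < 1:
        raise ValueError("slots_per_tm and num_tms must be positive")
    return slots_per_tm ** 2 * num_tms * factor


if __name__ == "__main__":
    slots = 8  # hypothetical number of slots per task manager
    for tms in (10, 20, 40):  # hypothetical cluster sizes
        buffers = required_network_buffers(slots, tms)
        # Print the value in the form of a config entry for this cluster size.
        print(f"taskmanager.network.numberOfBuffers: {buffers}  # {tms} TMs")
```

Under this formula the required count grows linearly with the number of task managers, so a value fixed for 10 TMs would be too small by a factor of four once the cluster grows to 40, which is why a static setting fits poorly with rescaling at runtime.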