Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 7C188200AF5 for ; Thu, 2 Jun 2016 15:26:08 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 7A6C5160A53; Thu, 2 Jun 2016 13:26:08 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id C107916098A for ; Thu, 2 Jun 2016 15:26:07 +0200 (CEST) Received: (qmail 80737 invoked by uid 500); 2 Jun 2016 13:26:06 -0000 Mailing-List: contact user-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@flink.apache.org Delivered-To: mailing list user@flink.apache.org Received: (qmail 80727 invoked by uid 99); 2 Jun 2016 13:26:06 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 02 Jun 2016 13:26:06 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 5FF6CC06CD for ; Thu, 2 Jun 2016 13:26:06 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.299 X-Spam-Level: * X-Spam-Status: No, score=1.299 tagged_above=-999 required=6.31 tests=[HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001] autolearn=disabled Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id buDJsG7veeM1 for ; Thu, 2 Jun 2016 13:26:04 +0000 (UTC) Received: from w1.tutanota.de (w1.tutanota.de [81.3.6.162]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with ESMTPS id C655A5F368 for ; Thu, 2 Jun 2016 13:26:03 +0000 (UTC) Received: from localhost (unknown [127.0.0.1]) by w1.tutanota.de (Postfix) with ESMTP id 7D100FA8576 for ; Thu, 2 Jun 2016 13:26:03 +0000 (UTC) Received: from w1.tutanota.de ([127.0.0.1]) by localhost (w1.tutanota.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id s9jC4TnBEZJR for ; Thu, 2 Jun 2016 13:26:00 +0000 (UTC) Received: from w1.tutanota.de (unknown [127.0.0.1]) by w1.tutanota.de (Postfix) with ESMTP id 5C82AFA852F for ; Thu, 2 Jun 2016 13:26:00 +0000 (UTC) Date: Thu, 2 Jun 2016 14:26:00 +0100 (BST) From: To: Ufuk Celebi Cc: Message-ID: In-Reply-To: References: <> Subject: Re: Internal buffers MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_52842_1401601466.1464873960376" archived-at: Thu, 02 Jun 2016 13:26:08 -0000 ------=_Part_52842_1401601466.1464873960376 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Dear Ufuk, the wiki entry is exactly what i was looking for. I found it quite complicated to understand on a first attempt but i will dedicate some more time for it in the future. Thanks. Regards Leon 1. Jun 2016 13:06 by uce@apache.org: > There is this in the Wiki: > https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks > > Buffers for data exchange come from the network buffer pool (by > default 2048 * 32KB buffers). They are distributed to the running > tasks and each logical channel between tasks needs at least one > buffer. > > Tasks produce buffers, which are either consumed by the > a) NettyConnectionManager who has a Thread pool for network > communication shared by all tasks exchanging remote data (no Thread > per buffer), or > b) the consuming task thread (local exchange). > > Chained operators run in a single task and exchange records without > serialization. > > On Wed, Jun 1, 2016 at 11:54 AM, <> leon_mclare@tutanota.com> > wrote: >> I have a question regarding how tuples are buffered between (possibly >> chained) subtasks. >> >> Is it correct that there is a buffer for each vertex in the DAG of >> subtasks? >> Regardless of task slot sharing? If yes, then the primary optimization in >> this regard is operator chaining. >> >> Furthermore, how do these buffers translate into overhead? Is there a send >> thread and a receive thread per buffer, similar to Apache Storm? >> >> I could not find details concerning such buffers in the relevant >> subsection >> under Concepts. >> >> Thanks in advance. ------=_Part_52842_1401601466.1464873960376 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Dear Ufuk,

the wiki entry is exactly what i was looking for. I f= ound it quite complicated to understand on a first attempt but i will dedic= ate some more time for it in the future.

Thanks.

Rega= rds
Leon

1. Jun 2016 13:06 by uce@apache.org:

There is this in the Wiki:
https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+= between+tasks

Buffers for data exchange come from the networ= k buffer pool (by
default 2048 * 32KB buffers). They are distributed t= o the running
tasks and each logical channel between tasks needs at le= ast one
buffer.

Tasks produce buffers, which are either con= sumed by the
a) NettyConnectionManager who has a Thread pool for netwo= rk
communication shared by all tasks exchanging remote data (no Thread=
per buffer), or
b) the consuming task thread (local exchange).
Chained operators run in a single task and exchange records witho= ut
serialization.

On Wed, Jun 1, 2016 at 11:54 AM, <leon_mclare@tuta= nota.com> wrote:
I have a question regarding how tuples a= re buffered between (possibly
chained) subtasks.

Is it corr= ect that there is a buffer for each vertex in the DAG of subtasks?
Reg= ardless of task slot sharing? If yes, then the primary optimization in
this regard is operator chaining.

Furthermore, how do these buf= fers translate into overhead? Is there a send
thread and a receive thr= ead per buffer, similar to Apache Storm?

I could not find detail= s concerning such buffers in the relevant subsection
under Concepts.
Thanks in advance.
------=_Part_52842_1401601466.1464873960376--