Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 8655616663D for ; Tue, 22 Aug 2017 09:39:55 +0200 (CEST) Received: (qmail 13731 invoked by uid 500); 22 Aug 2017 07:39:54 -0000 Mailing-List: contact user-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list user@flink.apache.org Received: (qmail 13722 invoked by uid 99); 22 Aug 2017 07:39:54 -0000 Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Aug 2017 07:39:54 +0000 Received: from mail-qk0-f176.google.com (mail-qk0-f176.google.com [209.85.220.176]) by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id F29261A070D for ; Tue, 22 Aug 2017 07:39:53 +0000 (UTC) Received: by mail-qk0-f176.google.com with SMTP id 130so58829178qkg.2 for ; Tue, 22 Aug 2017 00:39:53 -0700 (PDT) X-Gm-Message-State: AHYfb5gispYdKgfsiFpi7IoHp6/ZTWwX0puaj2yarrMgpvkzV138JmNY UJS8Zk8W0qdMYR420p9Q7qJzuF9qxw== X-Received: by 10.55.86.199 with SMTP id k190mr26047542qkb.125.1503387591299; Tue, 22 Aug 2017 00:39:51 -0700 (PDT) MIME-Version: 1.0 Received: by 10.12.168.214 with HTTP; Tue, 22 Aug 2017 00:39:10 -0700 (PDT) In-Reply-To: References: From: Till Rohrmann Date: Tue, 22 Aug 2017 09:39:10 +0200 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: how is data partitoned and distributed for connected stream To: xie wei Cc: user Content-Type: multipart/alternative; boundary="001a114e6f3056d9da055752b3d6" --001a114e6f3056d9da055752b3d6 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi, if all operators have the same parallelism, then there will be a pointwise connection. This means all elements arriving at s1_x and s2_x will be forwarded to s3_x with _x denoting the parallel subtask. Thus, to answer your second question, the single s1 element will only be present at one subtask of the CoMap operator, depending from which s1 parallel subtask it comes. Cheers, Till On Tue, Aug 22, 2017 at 8:31 AM, xie wei wrote: > Hello Flink=EF=BC=8C > > assume there are two finite streams, stream1(s1)has only one event, > stream2(s2)have 100 events, the parallelism is 2. > Then doing stream1.connect(stream2).map(). > How is the data partitioned and distributed to the CoMap instances? Is th= e > event from s1 only available in one of the CoMap instance? > Thank you! > > Best regards > Wei > > > --001a114e6f3056d9da055752b3d6 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi,

if all operators have the same para= llelism, then there will be a pointwise connection. This means all elements= arriving at s1_x and s2_x will be forwarded to s3_x with _x denoting the p= arallel subtask. Thus, to answer your second question, the single s1 elemen= t will only be present at one subtask of the CoMap operator, depending from= which s1 parallel subtask it comes.

Cheers,
Till

On Tue, Aug 22, 2017 at 8:31 AM, xie wei <jixian01@googlemai= l.com> wrote:
Hello Flink=EF=BC=8C

assume there= are two finite streams, stream1(s1)has only one event, stream2(s2)have 100= events, the parallelism is 2.
Then doing stream1.connect(str= eam2).map().
How is the data partitioned and distributed= to the CoMap instances? Is the event from s1 only available in one of the = CoMap instance?
Thank you!

Best regards
Wei



--001a114e6f3056d9da055752b3d6--