From: Fabian Hueske
To: user@flink.apache.org
Date: Mon, 29 Jun 2015 17:58:07 +0200
Subject: Re: cogroup

If you just want to do the pairwise comparison, try join().
Join is an inner join and will give you all pairs of elements with matching keys.
For CoGroup, there is no other way than collecting one side in memory.

Best, Fabian

2015-06-29 17:42 GMT+02:00 Matthias J. Sax <mjsax@informatik.hu-berlin.de>:

> Why do you not use a join? CoGroup seems not to be the right operator.
>
> -Matthias
>
> On 06/29/2015 05:40 PM, Michele Bertoni wrote:
> > Hi, I have a question on cogroup.
> >
> > When I cogroup two datasets, is there a way to compare each element on the
> > left with each element on the right (inside a group) without collecting one
> > side?
> >
> > Right now I am doing
> >
> > left.cogroup(right).where(0,1,2).equalTo(0,1,2){
> >     (leftIterator, rightIterator, out) => {
> >         val lSet = leftIterator.toSet   // <---- toSet
> >         for(r <- rightIterator)
> >             for(l <- lSet)
> >                 //do something
> >     }
> > }
> >
> > I would like to avoid the toSet.
> >
> > Thanks for the help.
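
For reference, the join() suggestion could look roughly like the sketch below. It assumes the Flink Scala DataSet API is on the classpath and a Flink version where print() triggers execution (0.9 or later). The object name, the pairwise helper, the record layout (three key fields plus a String payload) and the sample data are all made up for illustration; only the where(0,1,2).equalTo(0,1,2) keys come from the original question.

```scala
import org.apache.flink.api.scala._

// Hypothetical example: names, record layout and sample data are assumptions.
object JoinInsteadOfCoGroup {

  // Inner join on the same three key fields used in the cogroup snippet.
  // The function is called once per (left, right) pair with matching keys,
  // so user code never has to buffer one side of a group itself.
  def pairwise(
      left: DataSet[(Int, Int, Int, String)],
      right: DataSet[(Int, Int, Int, String)]): DataSet[(String, String)] =
    left.join(right).where(0, 1, 2).equalTo(0, 1, 2) {
      (l, r) => (l._4, r._4) // placeholder for "do something"
    }

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val left = env.fromElements((1, 1, 1, "a"), (1, 1, 1, "b"), (2, 2, 2, "c"))
    val right = env.fromElements((1, 1, 1, "x"), (2, 2, 2, "y"), (3, 3, 3, "z"))

    // Emits one pair per matching combination; output order is not guaranteed.
    pairwise(left, right).print()
  }
}
```

Note that, because join is an inner join, elements whose key appears on only one side (the key (3,3,3) in the sample data) produce no output. If those unmatched elements matter, CoGroup is still needed, and then one side has to be collected as described above.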