Mailing-List: contact user-help@flink.apache.org; run by ezmlm
Reply-To: user@flink.apache.org
MIME-Version: 1.0
Date: Tue, 19 Apr 2016 11:55:32 +0200
Subject: Re: Does Flink support joining multiple streams based on event time window now?
From: Till Rohrmann
To: user@flink.apache.org
Content-Type: multipart/alternative; boundary=089e0102e974590d9e0530d37ae7

--089e0102e974590d9e0530d37ae7
Content-Type: text/plain; charset=UTF-8

Hi Yifei,

if you don't want to implement your own join operator, you could also chain
two join operations. I created a small example to demonstrate that:
https://gist.github.com/tillrohrmann/c074b4eedb9deaf9c8ca2a5e124800f3

However, bear in mind that this approach constructs two windows, which
might be a bit more costly than Aljoscha's approach.

Cheers,
Till

On Tue, Apr 19, 2016 at 11:32 AM, Aljoscha Krettek wrote:

> Hi,
> right now, there is no built-in support for n-ary joins. I am working on
> this, however.
>
> For now you can simulate n-ary joins by using a tagged union and doing the
> join yourself in a WindowFunction. I created a small example that
> demonstrates this:
> https://gist.github.com/aljoscha/a2a213d90c7c1bc67e71fabaa82fba4a
>
> I hope this helps, and please let us know if you want to know more.
>
> Cheers,
> Aljoscha
>
> On Tue, 19 Apr 2016 at 02:11 Yifei Li wrote:
>
>> Hi,
>>
>> I am new to Flink. I've read some of the documentation and think Flink
>> may fit my scenario.
>>
>> Here is my scenario:
>>
>> 1. Assume I have 3 streams: S1(id, name, email, action, date), S2(id,
>> name, email, level, date), S3(id, name, position, date).
>>
>> *2. S2 is always delayed (by hours to days, not determined).*
>>
>> 3. Based on the event time, I want to join S1, S2 and S3 every 5 minutes.
>> The join is like a SQL join:
>>     select S1.name, S3.position from S1, S2, S3 where S1.id = S2.id and
>>     S1.id = S3.id and S1.action = 'download' and S2.level = 5
>>
>> Can I use Flink for my scenario? If yes, can anyone point me to some
>> working examples (I found some examples, but they are outdated) or tell
>> me about a workaround to solve this problem? If no, can anyone tell me
>> the reasons?
>>
>> Thanks,
>>
>> Yifei

--089e0102e974590d9e0530d37ae7
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--089e0102e974590d9e0530d37ae7--
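
The two workarounds discussed in this thread (a tagged union joined by hand inside one window, and two chained binary window joins) can be compared with a small plain-Python sketch. This is not Flink code and does not reproduce either gist: the record layout (dicts with a `ts` field in milliseconds), the helper names `window_start`, `tagged_union_join`, `window_join` and `chained_join`, and the sample data are all assumptions made for illustration. It also ignores watermarks and late data, so it does not address the delayed S2 stream from the question.

```python
from collections import defaultdict
from itertools import product

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling windows, as in the question


def window_start(ts_ms):
    """Start of the tumbling event-time window that contains ts_ms."""
    return ts_ms - ts_ms % WINDOW_MS


def tagged_union_join(s1, s2, s3):
    """One window per (window, id): tag records by source, then join by hand."""
    groups = defaultdict(lambda: ([], [], []))
    for tag, stream in enumerate((s1, s2, s3)):
        for rec in stream:
            groups[(window_start(rec["ts"]), rec["id"])][tag].append(rec)
    out = []
    for _key, (a_recs, b_recs, c_recs) in sorted(groups.items(), key=lambda kv: kv[0]):
        # Emit one result per matching (S1, S2, S3) combination in this window.
        for a, b, c in product(a_recs, b_recs, c_recs):
            if a["action"] == "download" and b["level"] == 5:
                out.append((a["name"], c["position"]))
    return out


def window_join(left, right):
    """Binary join on equal id within the same window.

    Each output record is stamped with the window start (an assumption of
    this sketch) so that a second join assigns it to the same window.
    """
    groups = defaultdict(lambda: ([], []))
    for side, stream in enumerate((left, right)):
        for rec in stream:
            groups[(window_start(rec["ts"]), rec["id"])][side].append(rec)
    out = []
    for (win, key), (l_recs, r_recs) in sorted(groups.items(), key=lambda kv: kv[0]):
        for l, r in product(l_recs, r_recs):
            out.append({"ts": win, "id": key, "left": l, "right": r})
    return out


def chained_join(s1, s2, s3):
    """Two windows: (S1 join S2) first, then that result joined with S3."""
    downloads = [r for r in s1 if r["action"] == "download"]
    level_five = [r for r in s2 if r["level"] == 5]
    first = window_join(downloads, level_five)
    second = window_join(first, s3)
    return [(r["left"]["left"]["name"], r["right"]["position"]) for r in second]


MIN = 60 * 1000
s1 = [{"ts": 1 * MIN, "id": 1, "name": "ann", "action": "download"},
      {"ts": 2 * MIN, "id": 2, "name": "bob", "action": "upload"},
      {"ts": 6 * MIN, "id": 3, "name": "cat", "action": "download"}]
s2 = [{"ts": 3 * MIN, "id": 1, "level": 5},
      {"ts": 4 * MIN, "id": 2, "level": 5},
      {"ts": 4 * MIN, "id": 3, "level": 5}]  # earlier window than S1's id 3
s3 = [{"ts": 4 * MIN, "id": 1, "position": "engineer"},
      {"ts": 1 * MIN, "id": 2, "position": "analyst"},
      {"ts": 7 * MIN, "id": 3, "position": "manager"}]

result_union = tagged_union_join(s1, s2, s3)
result_chained = chained_join(s1, s2, s3)
print(result_union)    # [('ann', 'engineer')]
print(result_chained)  # [('ann', 'engineer')]
```

Both functions return the same rows on this data. The chained version groups records twice, which is the extra cost Till mentions, while the tagged-union version groups once and does the three-way matching itself.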