kafka-dev mailing list archives

From Xavier Léauté <xav...@confluent.io>
Subject Re: [Vote] KIP-150 - Kafka-Streams Cogroup
Date Wed, 24 May 2017 05:44:45 GMT
I don't think we should wait for entries from each stream, since that might
limit the usefulness of the cogroup operator. There are instances where it
can be useful to compute something based on data from one or more streams,
without having to wait for all the streams to produce something for the
group. In the example I gave in the discussion, it is possible to compute
impression/auction statistics without having to wait for click data, which
can typically arrive several minutes late.
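
As a rough illustration of that behavior, here is a plain-Python simulation (not the Kafka Streams API). The stream names, the key "ad-1", and the function cogroup_updates are hypothetical, chosen only to mirror the impression/auction/click example. Every incoming record updates the per-key aggregate and immediately produces an update, so impression and auction counts are visible before any click arrives.

```python
from collections import defaultdict

# Hypothetical stream names, used only for this illustration.
STREAMS = ("impressions", "auctions", "clicks")


def cogroup_updates(records):
    """Yield (key, aggregate snapshot) after every input record.

    `records` is an iterable of (stream_name, key) pairs. No stream is
    required: an update is emitted as soon as any stream contributes.
    """
    state = defaultdict(lambda: {name: 0 for name in STREAMS})
    for stream, key in records:
        state[key][stream] += 1
        # Copy the aggregate so later updates do not mutate earlier snapshots.
        yield key, dict(state[key])


records = [
    ("impressions", "ad-1"),
    ("auctions", "ad-1"),
    ("impressions", "ad-1"),
    ("clicks", "ad-1"),  # click data arrives last
]
updates = list(cogroup_updates(records))

for key, aggregate in updates:
    print(key, aggregate)
```

The first three updates already carry usable impression and auction counts, with the click count still at zero.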

We could have a separate discussion around adding inner / outer modifiers
to each of the streams to decide which fields are optional / required
before sending updates if we think that might be useful.
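
To make the idea concrete, here is a minimal sketch in plain Python (again not the Kafka Streams API). The function cogroup_with_required and its `required` parameter are hypothetical and are not part of KIP-150. Streams named in `required` act like "inner" inputs: updates for a key are held back until each of them has contributed at least once. All other streams are optional.

```python
def cogroup_with_required(records, required):
    """Yield (key, aggregate snapshot) once all required streams were seen.

    `records` is an iterable of (stream_name, key, value) triples.
    `required` is a set of stream names that must each have produced at
    least one record for a key before updates for that key are emitted.
    """
    state = {}  # key -> {stream_name: [values]}
    seen = {}   # key -> set of stream names seen so far
    for stream, key, value in records:
        state.setdefault(key, {}).setdefault(stream, []).append(value)
        seen.setdefault(key, set()).add(stream)
        if set(required) <= seen[key]:
            # Copy the lists so later records do not mutate this snapshot.
            yield key, {name: list(vals) for name, vals in state[key].items()}


records = [
    ("cart", 1, "01"),
    ("purchases", 1, "07"),
    ("purchases", 1, "08"),
    ("cart", 1, "03"),
]

# No required streams: one update per record.
eager = list(cogroup_with_required(records, required=set()))

# cart and purchases required, wishList optional: the first record is held back.
gated = list(cogroup_with_required(records, required={"cart", "purchases"}))

# Every stream required: nothing is emitted because wishList never arrives.
strict = list(cogroup_with_required(records, required={"cart", "purchases", "wishList"}))

print(len(eager), len(gated), len(strict))
```

Passing an empty `required` set gives the emit-on-every-record behavior, and requiring every stream gives the wait-for-all behavior raised in the quoted message below.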



On Tue, May 23, 2017 at 6:28 PM Guozhang Wang <wangguoz@gmail.com> wrote:

> The proposal LGTM, +1
>
> One question I have is about when to send the record to the resulting KTable
> changelog. For example in your code snippet in the wiki page, before you
> see the end result of
>
> 1L, Customer[
>
>                       cart:{Item[no:01], Item[no:03], Item[no:04]},
>                       purchases:{Item[no:07], Item[no:08]},
>                       wishList:{Item[no:11]}
>       ]
>
>
> You will first see
>
> 1L, Customer[
>
>                       cart:{Item[no:01]},
>                       purchases:{},
>                       wishList:{}
>       ]
>
> 1L, Customer[
>
>                       cart:{Item[no:01]},
>                       purchases:{Item[no:07],Item[no:08]},
>
>                       wishList:{}
>       ]
>
> 1L, Customer[
>
>                       cart:{Item[no:01]},
>                       purchases:{Item[no:07],Item[no:08]},
>
>                       wishList:{}
>       ]
>
> ...
>
>
> I'm wondering if it makes more sense to only start sending the update if
> the corresponding agg-key has seen at least one input from each of the
> input streams? Maybe it is out of the scope of this KIP and we can make it a
> more general discussion in a separate one.
>
>
> Guozhang
>
>
> On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <xavier@confluent.io>
> wrote:
>
> > Hi Kyle, I left a few more comments in the discussion thread, if you
> > wouldn't mind taking a look
> >
> > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <winkelman.kyle@gmail.com>
> > wrote:
> >
> > > Hello all,
> > >
> > > I would like to start the vote on KIP-150.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+Kafka-Streams+Cogroup
> > >
> > > Thanks,
> > > Kyle
> > >
> >
>
>
>
> --
> -- Guozhang
>
