kafka-dev mailing list archives

From Jeyhun Karimov <je.kari...@gmail.com>
Subject Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams
Date Sun, 28 May 2017 17:26:23 GMT
After your response on KIP-149 regarding ValueTransformerSupplier,
everything you mentioned now makes complete sense. Thanks for the
clarification.

Just a note: on top of KIP-149, we will have additional overloaded methods:
for each of the withoutKey and withKey variants (ValueMapper and
ValueMapperWithKey), we will have an overload that takes a RecordContext
argument. Other than this issue, I don't see any limitation.
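
To make the extra overloads concrete, here is a rough sketch in plain Java.
It is only an illustration, not the KIP's API: RecordContext, the Rich*
interface names and ToyStream are local stand-ins made up for this example,
and how the context actually reaches the mapper is still open in this thread.

```java
// Local stand-in for record metadata; NOT the real Kafka Streams class.
final class RecordContext {
    private final String topic;
    private final long offset;
    private final long timestamp;

    RecordContext(String topic, long offset, long timestamp) {
        this.topic = topic;
        this.offset = offset;
        this.timestamp = timestamp;
    }

    String topic() { return topic; }
    long offset() { return offset; }
    long timestamp() { return timestamp; }
}

// The withoutKey and withKey shapes discussed in KIP-149 (simplified here).
interface ValueMapper<V, VR> { VR apply(V value); }
interface ValueMapperWithKey<K, V, VR> { VR apply(K readOnlyKey, V value); }

// Hypothetical RecordContext variants; names and signatures are assumptions.
interface RichValueMapper<V, VR> { VR apply(V value, RecordContext context); }
interface RichValueMapperWithKey<K, V, VR> {
    VR apply(K readOnlyKey, V value, RecordContext context);
}

// Toy one-record "stream" that exists only to show the four overloads.
final class ToyStream<K, V> {
    private final K key;
    private final V value;
    private final RecordContext context;

    ToyStream(K key, V value, RecordContext context) {
        this.key = key;
        this.value = value;
        this.context = context;
    }

    <VR> VR mapValues(ValueMapper<V, VR> mapper) { return mapper.apply(value); }
    <VR> VR mapValues(ValueMapperWithKey<K, V, VR> mapper) { return mapper.apply(key, value); }
    <VR> VR mapValues(RichValueMapper<V, VR> mapper) { return mapper.apply(value, context); }
    <VR> VR mapValues(RichValueMapperWithKey<K, V, VR> mapper) {
        return mapper.apply(key, value, context);
    }
}

final class OverloadSketch {
    public static void main(String[] args) {
        ToyStream<String, Integer> stream =
            new ToyStream<>("user-1", 21, new RecordContext("clicks", 42L, 1495992383000L));

        // Typed variables keep overload selection unambiguous in this toy.
        ValueMapper<Integer, Integer> doubled = v -> v * 2;
        RichValueMapperWithKey<String, Integer, String> tagged =
            (k, v, ctx) -> k + "=" + v + " @" + ctx.topic() + "/" + ctx.offset();

        Integer plainResult = stream.mapValues(doubled);
        String richResult = stream.mapValues(tagged);
        System.out.println(plainResult); // 42
        System.out.println(richResult);  // user-1=21 @clicks/42
    }
}
```

The point of the sketch is only the count: two existing shapes become four
once a RecordContext variant is added for each.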

Cheers,
Jeyhun


On Sun, May 28, 2017 at 6:34 PM Matthias J. Sax <matthias@confluent.io>
wrote:

> Thanks for your comments, Jeyhun,
>
> I agree about the disadvantages. Only the punctuation part is something
> I don't buy. IMHO, RichFunctions should not allow registering or using
> punctuations. If you need punctuation, you should use #transform() or
> similar. Note that we plan to provide `RecordContext` and not
> `ProcessorContext`, and thus it's not even possible to register
> punctuations.
>
> One more thought: if you go with `init()` and `close()` we basically
> allow users to have an in-memory state for a function. Thus, we cannot
> share a single instance of RichValueMapper (etc) over multiple tasks and
> we would need a supplier pattern similar to #transform(). And this would
> "break the flow" of the API, as (Rich)ValueMapperSupplier would not
> inherit from ValueMapper and thus we would need many new overloads for
> KStream/KTable classes.
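
A rough sketch of the supplier argument in the paragraph above, in plain
Java. LifecycleValueMapper, CountingMapper and ToyTask are hypothetical names
made up for this illustration and are not Kafka Streams types; the sketch
only assumes that each task must get its own instance once a function holds
in-memory state.

```java
import java.util.function.Supplier;

// Hypothetical rich function with a life-cycle; not a real Kafka Streams interface.
interface LifecycleValueMapper<V, VR> {
    void init();
    VR apply(V value);
    void close();
}

// Stateful implementation: it counts the records it has seen since init().
final class CountingMapper implements LifecycleValueMapper<String, String> {
    private int seen;

    @Override public void init() { seen = 0; }

    @Override public String apply(String value) {
        seen++;
        return value + "#" + seen;
    }

    @Override public void close() { }
}

// Stand-in for a stream task: it asks the supplier for its own mapper instance.
final class ToyTask {
    private final LifecycleValueMapper<String, String> mapper;

    ToyTask(Supplier<LifecycleValueMapper<String, String>> supplier) {
        this.mapper = supplier.get();
        this.mapper.init();
    }

    String process(String value) { return mapper.apply(value); }

    void close() { mapper.close(); }
}

final class SupplierSketch {
    public static void main(String[] args) {
        Supplier<LifecycleValueMapper<String, String>> supplier = CountingMapper::new;
        ToyTask task0 = new ToyTask(supplier);
        ToyTask task1 = new ToyTask(supplier);

        System.out.println(task0.process("a")); // a#1
        System.out.println(task0.process("b")); // b#2
        System.out.println(task1.process("c")); // c#1, task1 has its own counter

        task0.close();
        task1.close();
    }
}
```

If both tasks shared one CountingMapper instead, their counters would mix,
which is why a stateful function pushes the API toward a supplier.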
>
> The overall goal of RichFunction (from my understanding) was to provide
> record metadata information (like offset, timestamp, etc) to the user.
> And we still have #transform() that provides the init and close
> functionality. So if we introduce those with RichFunction we are quite
> close to what #transform provides, and thus it feels as if we duplicate
> functionality.
>
> For this reason, it seems to be better to go with the
> `#valueMapper(ValueMapper mapper, RecordContext context)` approach.
>
> WDYT?
>
>
>
> -Matthias
>
> On 5/27/17 11:00 AM, Jeyhun Karimov wrote:
> > Hi,
> >
> > Thanks for your comments. I will refer to the overall approach as rich
> > functions until we find a better name.
> >
> > I think there are some pros and cons of the approach you described.
> >
> > The pros are that it is simple, has clear boundaries, and avoids
> > misunderstanding of the term "function".
> > So you propose something like:
> > KStream.valueMapper (ValueMapper vm, RecordContext rc)
> > or
> > having rich functions with only a single init(RecordContext rc) method.
> >
> > The cons are:
> >  - This will bring another set of overloads (if we use RecordContext as a
> > separate parameter). We should consider that the rich functions will be
> > for all main interfaces.
> >  - I don't think that we need lambdas in rich functions. It is by
> > definition "rich", so there is no single-method interface and, as a
> > result, no lambdas.
> >  - I disagree that rich functions should only contain an init() method.
> > This depends on each interface. For example, for specific interfaces we
> > can add methods (like punctuate()) to their rich functions.
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> >
> > On Thu, May 25, 2017 at 1:02 AM Matthias J. Sax <matthias@confluent.io>
> > wrote:
> >
> >> I confess, the term is borrowed from Flink :)
> >>
> >> Personally, I never thought about it, but I tend to agree with Michal. I
> >> also want to clarify, that the main purpose is the ability to access
> >> record metadata. Thus, it might even be sufficient to only have "init".
> >>
> >> An alternative would of course be to pass in the RecordContext as a
> >> method parameter. This would allow us to drop "init()". This might even
> >> allow the use of lambdas, and we could keep the name RichFunction as we
> >> preserve the nature of being a function.
> >>
> >>
> >> -Matthias
> >>
> >> On 5/24/17 12:13 PM, Jeyhun Karimov wrote:
> >>> Hi Michal,
> >>>
> >>> Thanks for your comments. I see your point and I agree with it.
> >>> However, I don't have a better idea for naming. I checked the MR source
> >>> code. It uses JobConfigurable and Closeable, two different interfaces.
> >>> Maybe we can rename RichFunction as Configurable?
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>> On Tue, May 23, 2017 at 2:58 PM Michal Borowiecki
> >>> <michal.borowiecki@openbet.com> wrote:
> >>>
> >>>     Hi Jeyhun,
> >>>
> >>>     I understand your argument about "Rich" in RichFunctions. Perhaps
> >>>     I'm just being too puritan here, but let me ask this anyway:
> >>>
> >>>     What is it that makes something a function? To me a function is
> >>>     something that takes zero or more arguments and possibly returns a
> >>>     value and while it may have side-effects (as opposed to "pure
> >>>     functions" which can't), it doesn't have any life-cycle of its own.
> >>>     This is what, in my mind, distinguishes the concept of a "function"
> >>>     from that of more vaguely defined concepts.
> >>>
> >>>     So if we add a life-cycle to a function, in that understanding, it
> >>>     doesn't become a rich function but instead stops being a function
> >>>     altogether.
> >>>
> >>>     You could say it's "just semantics" but to me precise use of
> >>>     language in the given context is an important foundation for good
> >>>     engineering. And in the context of programming "function" has a
> >>>     precise meaning. Of course we can say that in the context of Kafka
> >>>     Streams "function" has a different, looser meaning but I'd argue
> >>>     that won't do anyone any good.
> >>>
> >>>     On the other hand other frameworks such as Flink use this
> >>>     terminology, so it could be that consistency is the reason. I'm
> >>>     guessing that's why the name was proposed in the first place. My
> >>>     point is simply that it's a poor choice of wording and Kafka
> >>>     Streams doesn't have to follow that to the letter.
> >>>
> >>>     Cheers,
> >>>
> >>>     Michal
> >>>
> >>>
> >>>     On 23/05/17 13:26, Jeyhun Karimov wrote:
> >>>>     Hi Michal,
> >>>>
> >>>>     Thanks for your comments.
> >>>>
> >>>>
> >>>>         To me at least it feels strange that something is called a
> >>>>         function yet doesn't follow the functional interface
> >>>>         definition of having just one abstract method. I suppose init
> >>>>         and close could be made default methods with empty bodies once
> >>>>         Java 7 support is dropped to mitigate that concern. Still, I
> >>>>         feel some resistance to consider something that requires
> >>>>         initialisation and closing (which implies holding state) as
> >>>>         being a function. Sounds more like the Processor/Transformer
> >>>>         kind of thing semantically, rather than a function.
> >>>>
> >>>>
> >>>>      -  If we called the interface just Function, your assumptions
> >>>>     would hold. However, the keyword Rich by definition implies that
> >>>>     we have a function (as you described, with one abstract method
> >>>>     etc.) but it is rich. So, there are multiple methods in it.
> >>>>     Ideally it should be:
> >>>>
> >>>>     // Function here is the Function that you described
> >>>>     public interface RichFunction extends Function {
> >>>>       void close();
> >>>>       void init(Some params);
> >>>>        ...
> >>>>     }
> >>>>
> >>>>
> >>>>         The KIP says there are multiple use-cases for this but doesn't
> >>>>         enumerate any - I think some examples would be useful,
> >>>>         otherwise that section sounds a little bit vague.
> >>>>
> >>>>
> >>>>     I thought it was obvious by definition, but I will update it.
> >>>>     Thanks.
> >>>>
> >>>>
> >>>>         IMHO, it's the access to the RecordContext where the added
> >>>>         value lies but maybe I'm just lacking in imagination, so I'm
> >>>>         asking all this to better understand the rationale for init()
> >>>>         and close().
> >>>>
> >>>>
> >>>>     Maybe I should add some examples. Thanks.
> >>>>
> >>>>
> >>>>     Cheers,
> >>>>     Jeyhun
> >>>>
> >>>>     On Mon, May 22, 2017 at 11:02 AM, Michal Borowiecki
> >>>>     <michal.borowiecki@openbet.com> wrote:
> >>>>
> >>>>         Hi Jeyhun,
> >>>>
> >>>>         I'd like to understand better the premise of RichFunctions and
> >>>>         why `init(Some params)` and `close()` are said to be needed.
> >>>>
> >>>>         To me at least it feels strange that something is called a
> >>>>         function yet doesn't follow the functional interface
> >>>>         definition of having just one abstract method. I suppose init
> >>>>         and close could be made default methods with empty bodies once
> >>>>         Java 7 support is dropped to mitigate that concern. Still, I
> >>>>         feel some resistance to consider something that requires
> >>>>         initialisation and closing (which implies holding state) as
> >>>>         being a function. Sounds more like the Processor/Transformer
> >>>>         kind of thing semantically, rather than a function.
> >>>>
> >>>>         The KIP says there are multiple use-cases for this but doesn't
> >>>>         enumerate any - I think some examples would be useful,
> >>>>         otherwise that section sounds a little bit vague.
> >>>>
> >>>>         IMHO, it's the access to the RecordContext where the added
> >>>>         value lies but maybe I'm just lacking in imagination, so I'm
> >>>>         asking all this to better understand the rationale for init()
> >>>>         and close().
> >>>>
> >>>>         Thanks,
> >>>>         Michał
> >>>>
> >>>>         On 20/05/17 17:05, Jeyhun Karimov wrote:
> >>>>>         Dear community,
> >>>>>
> >>>>>         As we discussed in the KIP-149 [DISCUSS] thread [1], I would
> >>>>>         like to initiate a KIP for rich functions (interfaces) [2].
> >>>>>         I would like to get your comments.
> >>>>>
> >>>>>
> >>>>>         [1] http://search-hadoop.com/m/Kafka/uyzND1PMjdk2CslH12?subj=Re+DISCUSS+KIP+149+Enabling+key+access+in+ValueTransformer+ValueMapper+and+ValueJoiner
> >>>>>         [2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-159%3A+Introducing+Rich+functions+to+Streams
> >>>>>
> >>>>>
> >>>>>         Cheers,
> >>>>>         Jeyhun
> >>>>         --
> >>>>         Michal Borowiecki
> >>>>         Senior Software Engineer L4
> >>>>         T: +44 208 742 1600
> >>>>            +44 203 249 8448
> >>>>         E: michal.borowiecki@openbet.com
> >>>>         W: www.openbet.com
> >>>>
> >>>>         OpenBet Ltd
> >>>>         Chiswick Park Building 9
> >>>>         566 Chiswick High Rd
> >>>>         London
> >>>>         W4 5XT
> >>>>         UK
> >>>>
> >>>>         This message is confidential and intended only for the
> >>>>         addressee. If you have received this message in error, please
> >>>>         immediately notify the postmaster@openbet.com and delete it
> >>>>         from your system as well as any copies. The content of e-mails
> >>>>         as well as
> >>>>         traffic data may be monitored by OpenBet for employment and
> >>>>         security purposes. To protect the environment please do not
> >>>>         print this e-mail unless necessary. OpenBet Ltd. Registered
> >>>>         Office: Chiswick Park Building 9, 566 Chiswick High Road,
> >>>>         London, W4 5XT, United Kingdom. A company registered in
> >>>>         England and Wales. Registered no. 3134634. VAT no. GB927523612
> >>>>
> >>>
> >>> --
> >>> -Cheers
> >>>
> >>> Jeyhun
