flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fabian Hueske <fhue...@gmail.com>
Subject Re: [DISCUSS] Remove Combinable Annotation from DataSet API
Date Fri, 15 Jan 2016 10:11:51 GMT
OK, I think we got a clear picture here.

I'll update the corresponding JIRA issue FLINK-1045.

Thanks, Fabian

2016-01-14 18:53 GMT+01:00 Henry Saputra <henry.saputra@gmail.com>:

> +1 for approach #1
>
>
>
> - Henry
>
> On Thu, Jan 14, 2016 at 5:35 AM, Chiwan Park <chiwanpark@apache.org>
> wrote:
>
> > I’m also for approach #1. Now is nice time to apply some API-breaking
> > changes.
> >
> > > On Jan 14, 2016, at 1:19 AM, Aljoscha Krettek <aljoscha@apache.org>
> > wrote:
> > >
> > > I’m also for Approach #1. I like simplifying things.
> > >> On 13 Jan 2016, at 14:25, Vasiliki Kalavri <vasilikikalavri@gmail.com
> >
> > wrote:
> > >>
> > >> Hi,
> > >>
> > >> ​+1 for removing the Combinable annotation​. Approach 1 sounds like
> the
> > >> best option to me.
> > >>
> > >>
> > >> On 13 January 2016 at 14:11, Till Rohrmann <trohrmann@apache.org>
> > wrote:
> > >>
> > >>> Hi Fabian,
> > >>>
> > >>> thanks for bringing this issue up. I agree with you that it would be
> > nice
> > >>> to remove the Combinable annotation if it is not really needed. Now
> if
> > >>> people are not aware of the Combinable interface then they might
> think
> > that
> > >>> they are actually using combiners by simply implementing
> > CombineFunction.
> > >>>
> > >>>
> > >> ​has happened to me :S​
> > >>
> > >>
> > >>
> > >>> I would also be in favour of approach 1 combined with a migration
> guide
> > >>> where we highlight possible problems with a faulty combine
> > implementation.
> > >>>
> > >>>
> > >> Migration guide is a ​good idea​!
> > >>
> > >>
> > >>
> > >>> Cheers,
> > >>> Till
> > >>> ​
> > >>>
> > >>>
> > >> ​-Vasia.​
> > >>
> > >>
> > >>
> > >>> On Wed, Jan 13, 2016 at 1:53 PM, Fabian Hueske <fhueske@gmail.com>
> > wrote:
> > >>>
> > >>>> Hi everybody,
> > >>>>
> > >>>>
> > >>>>
> > >>>> As part of the upcoming 1.0 release we are stabilizing Flink's
APIs.
> > >>>>
> > >>>> I would like to start a discussion about removing the Combinable
> > >>> annotation
> > >>>> from the DataSet API.
> > >>>>
> > >>>>
> > >>>>
> > >>>> # The Current State:
> > >>>>
> > >>>> The DataSet API offers two interface for combine functions,
> > >>> CombineFunction
> > >>>> and GroupCombineFunction, which can be added to a
> GroupReduceFunctions
> > >>>> (GRF).
> > >>>>
> > >>>>
> > >>>> However, implementing a combine interface does not make a GRF
> > combinable.
> > >>>> In addition, the GRF class needs to be annotated with the Combinable
> > >>>> annotation. The RichGroupReduceFunction has a default implementation
> > of
> > >>>> combine, which forwards the combine parameters to the reduce method.
> > This
> > >>>> default implementation is not used, unless the class that extends
> RGRF
> > >>> has
> > >>>> the Combinable annotation.
> > >>>>
> > >>>>
> > >>>>
> > >>>> In addition to the Combinable annotation, the DataSet API
> > >>>> GroupReduceOperator features the setCombinable(boolean) method
to
> > >>> override
> > >>>> a missing or existing Combinable annotation.
> > >>>>
> > >>>>
> > >>>>
> > >>>> # My Proposal:
> > >>>>
> > >>>> I propose to remove the Combinable annotation because it is not
> > required
> > >>>> and complicates the definition of combiners. Instead, the combine
> > method
> > >>> of
> > >>>> a GroupReduceFunction should be executed if it implements one of
the
> > >>>> combine function interfaces. This would require to remove the
> default
> > >>>> combine implementation of the RichGroupReduceFunction as well.
> > >>>>
> > >>>> This would be an API breaking change and has a few implications.
> > >>>>
> > >>>>
> > >>>>
> > >>>> # Possible Implementation:
> > >>>>
> > >>>> There are a few ways to implement this change.
> > >>>>
> > >>>> - Remove Combinable annotation or mark it deprecated (and keep
> effect)
> > >>>>
> > >>>> - Remove default combine method from RichGroupReduceFunction or
> > deprecate
> > >>>> it
> > >>>>
> > >>>>
> > >>>>
> > >>>> Approach 1:
> > >>>> - Remove Combinable annotation
> > >>>> - Remove default combine() method from RichGroupReduceFunction
> > >>>> - Effect:
> > >>>>   - All RichGroupReduceFunctions that do either have the Combinable
> > >>>> annotation or implement the combine method do not compile anymore
> > >>>>   - GroupReduceFunctions that have the Combinable annotation do
not
> > >>>> compile anymore
> > >>>>   - GroupReduceFunctions that implement a combine interface without
> > >>>> having the annotation (and not being set to combinable during
> program
> > >>>> construction) will execute the previously not executed combine
> method.
> > >>> This
> > >>>> might change the behavior of the program. In worst case, the program
> > >>> might
> > >>>> silently produce wrong results (or crash) if the combiner
> > implementation
> > >>>> was faulty. In best case, the program executes faster.
> > >>>>
> > >>>>
> > >>>>
> > >>>> Approach 2:
> > >>>> - As Approach 1
> > >>>> - In addition extend both combine interfaces with a deprecated
> marker
> > >>>> method. This will ensure that all functions that implement a
> > combinable
> > >>>> interface do not compile anymore and need to be fixed. This could
> > prevent
> > >>>> the silent failure as in Approach 1, but would also cause an
> > additional
> > >>> API
> > >>>> breaking change once the deprecated marker method is removed again.
> > >>>>
> > >>>>
> > >>>>
> > >>>> Approach 3:
> > >>>> - Mark Combinable annotation deprecated
> > >>>> - Mark combine() method in RichGroupReduceFunction as deprecated
> > >>>> - Effect:
> > >>>>   - There'll be a couple of deprecation warnings.
> > >>>>   - We face the same problem with silent failures as in Approach
1.
> > >>>>   - We have to check if RichGroupReduceFunction's override combine
> or
> > >>> not
> > >>>> (can be done with reflection). If the method is not overridden
we do
> > not
> > >>>> execute it (unless there is a Combinable annotation) and we are
> fine.
> > If
> > >>> it
> > >>>> is overridden and no Combinable annotation has been defined, we
have
> > the
> > >>>> same problem with silent failures as before.
> > >>>>   - After we remove the deprecated annotation and method, we have
> the
> > >>>> same effect as with Approach 1.
> > >>>>
> > >>>>
> > >>>>
> > >>>> There are more alternatives, but these are the most viable, IMO.
> > >>>>
> > >>>>
> > >>>>
> > >>>> I think, if we want to remove the combinable annotation, we should
> do
> > it
> > >>>> now.
> > >>>>
> > >>>> Given the three options, would go for Approach 1. Yes, breaks a
lot
> of
> > >>> code
> > >>>> and yes there is the possibility of computing incorrect results.
> > >>> Approach 2
> > >>>> is safer but would mean another API breaking change in the future.
> > >>> Approach
> > >>>> 3 comes with fewer breaking changes but has the same problem of
> silent
> > >>>> failures.
> > >>>>
> > >>>> IMO, the breaking API changes of Approach 1 are even desirable
> because
> > >>> they
> > >>>> will make users aware that this feature changed.
> > >>>>
> > >>>>
> > >>>>
> > >>>> What do you think?
> > >>>>
> > >>>>
> > >>>>
> > >>>> Cheers, Fabian
> > >>>>
> >
> > Regards,
> > Chiwan Park
> >
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message