apex-users mailing list archives

From David Yan <da...@datatorrent.com>
Subject Re: A proposal for Malhar
Date Tue, 12 Jul 2016 18:53:33 GMT
Hi all,

I would like to renew the discussion of retiring operators in Malhar.

As stated before, the reason we would like to retire operators in Malhar is
that some of them were written a long time ago, before Apache incubation.
They do not pertain to real use cases, are not up to par in code quality,
have no potential for improvement, and are probably not used by anybody.

We do not want contributors to use them as a model for their contributions,
or users to use them thinking they are of quality, and then hit a wall.
Neither scenario is good for the reputation of Apex.

The initial 3 packages that we would like to target are *lib/algo*,
*lib/math*, and *lib/streamquery*.

I'm adding this thread to the users list. Please speak up if you are using
any operator in these 3 packages. We would like to hear from you.

These are the options I can think of for retiring those operators:

1) Completely remove them from the malhar repository.
2) Move them from malhar-library into a separate artifact called malhar-misc
3) Mark them deprecated and add to their javadoc that they are no longer
supported

Note that 2 and 3 are not mutually exclusive. Any thoughts?
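
For readers weighing option 3, here is a minimal sketch of what deprecating
an operator could look like in plain Java. The class name LegacySumOperator
and the helper isRetired are hypothetical and are not taken from Malhar; only
the standard @Deprecated annotation and the @deprecated javadoc tag are real.

```java
public class DeprecationSketch {

    /**
     * Hypothetical stand-in for an old operator; not an actual Malhar class.
     *
     * @deprecated This operator is no longer supported and may be removed
     *             in a future release.
     */
    @Deprecated
    public static class LegacySumOperator {
        private long sum;

        public void process(long value) {
            sum += value;
        }

        public long getSum() {
            return sum;
        }
    }

    /** Returns true if the class carries the standard @Deprecated annotation. */
    public static boolean isRetired(Class<?> type) {
        // @Deprecated has runtime retention, so reflection can see it.
        return type.isAnnotationPresent(Deprecated.class);
    }

    public static void main(String[] args) {
        System.out.println(isRetired(LegacySumOperator.class));
        System.out.println(isRetired(String.class));
    }
}
```

Code outside this file that references a class marked this way gets a
compiler deprecation warning, and the javadoc note tells readers why.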

David

On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pramod@datatorrent.com>
wrote:

> I wanted to close the loop on this discussion. In general everyone seemed
> to be favorable to this idea with no serious objections. Folks had good
> suggestions like documenting capabilities of operators, coming up with
> well-defined criteria for graduation of operators (and what those criteria
> may be), and what to do with existing operators that may not yet be mature
> or may be unused.
>
> I am going to summarize the key points that resulted from the discussion
> and would like to proceed with them.
>
>    - Operators that do not yet provide the key platform capabilities that
>    make an operator useful across different applications, such as
>    reusability, partitioning (static or dynamic), idempotency, and
>    exactly-once, will still be accepted as long as they are functionally
>    correct and have unit tests, and they will go into a separate module.
>    - Contrib module was suggested as a place where new contributions go in
>    that don't yet have all the platform capabilities and are not yet
>    mature. If there are no other suggestions we will go with this one.
>    - It was suggested that each operator's documentation list the platform
>    capabilities it currently provides from the list above. I will document
>    a structure for this in the contribution guidelines.
>    - Folks wanted to know what would be the criteria to graduate an
>    operator to the big leagues :). I will kick off a separate thread for
>    it as I think it requires its own discussion and hopefully we can come
>    up with a set of guidelines for it.
>    - David brought up the state of some of the existing operators and their
>    retirement, and the layout of operators in Malhar in general and how it
>    causes problems with development. I will ask him to lead the discussion
>    on that.
>
> Thanks
>
> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com> wrote:
>
> > The two ideas are not conflicting, but rather complementing.
> >
> > On the contrary, putting a new process for people trying to contribute
> > while NOT addressing the old unused subpar operators in the repository is
> > what is conflicting.
> >
> > Keep in mind that when people try to contribute, they always look at the
> > existing operators already in the repository as examples and likely a
> > model for their new operators.
> >
> > David
> >
> >
> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com>
> > wrote:
> >
> > > Yes there are two conflicting threads now. The original thread was to
> > > open up a way for contributors to submit code in a dir (contrib?) as
> > > long as the license part is taken care of.
> > >
> > > On the thread of removing non-used operators -> How do we know what is
> > > being used?
> > >
> > > Thks,
> > > Amol
> > >
> > >
> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
> > > wrote:
> > >
> > > > +1 for removing the not-used operators.
> > > >
> > > > So we are creating a process for operator writers who don't want to
> > > > understand the platform, yet want to contribute? How big is that set?
> > > > If we tell the app-user, here is the code which has not passed the
> > > > full checklist, will they be ready to use that in production?
> > > >
> > > > This thread has 2 conflicting forces: reduce the operators and make it
> > > > easy to add more operators.
> > > >
> > > >
> > > >
> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
> > > > wrote:
> > > >
> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Pramod,
> > > > > >
> > > > > > By that logic I would say let's put all partitionable operators into
> > > > > > one folder, non-partitionable operators in another and so on...
> > > > > >
> > > > >
> > > > > Remember the original goal of making it easier for new members to
> > > > > contribute and managing those contributions to maturity. It is not a
> > > > > functional level separation.
> > > > >
> > > > >
> > > > > > When I look at hadoop code I see these annotations being used at
> > > > > > class level and not at package/folder level.
> > > > >
> > > > >
> > > > > I had a typo in my email, I meant to say "think of this like a
> > > > > folder..." as an analogy and not literally.
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Can the same goal not be achieved by using the
> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > > > > > annotation?
> > > > > > > >
> > > > > > >
> > > > > > > I think it is important to localize the additions in one place so
> > > > > > > that it becomes clearer to users about the maturity level of these,
> > > > > > > easier for developers to track them towards the path to maturity and
> > > > > > > also provides a clearer directive for committers and contributors on
> > > > > > > acceptance of new submissions. Relying on the annotations alone makes
> > > > > > > them spread all over the place and adds an additional layer of
> > > > > > > difficulty in identification not just for users but also for
> > > > > > > developers who want to find such operators and improve them. This of
> > > > > > > this like a folder level annotation where everything under this
> > > > > > > folder is unstable or evolving.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Malhar in its current state, has way too many operators that fall
> > > > > > > > > > > > in the "non-production quality" category. We should make it obvious
> > > > > > > > > > > > to users that which operators are up to par, and which operators are
> > > > > > > > > > > > not, and maybe even remove those that are likely not ever used in a
> > > > > > > > > > > > real use case.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this
> > > > > > > > > > > exercise as this can cause unnecessary tensions. My original intent
> > > > > > > > > > > is for contributions going forward.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > IMO it is important to address this as well. Operators outside the
> > > > > > > > > > play area should be of well known quality.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > I think this is important, and I don't anticipate much tension if we
> > > > > > > > > establish clear criteria.
> > > > > > > > > It's not helpful if we let the old subpar operators stay and put up
> > > > > > > > > the bars for new operators.
> > > > > > > > >
> > > > > > > > > David
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
