airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Driesprong, Fokko" <fo...@driesprong.frl>
Subject Re: [2.0 spring cleaning] Deprecate contrib folder?
Date Sun, 14 Apr 2019 19:20:46 GMT
I'm in favor of removing the Contrib folder. It doesn't really add value in
my opinion, and moving the hooks/operators will break the import. While
DAG'ing I always have to look up if the operator is in contrib or not.

Also, I think we should keep the operators and hooks part of the Airflow
package. Having this separately will make the testing of the
hooks/operators much more complicated. That being said, I do think we need
to have more people on the project that "own" certain operators. Maybe keep
a list of the authors as well (or reintroduce the Mention-bot
https://github.com/facebookarchive/mention-bot ? Loved that bot).

Cheers, Fokko


Op za 13 apr. 2019 om 12:36 schreef Jarek Potiuk <Jarek.Potiuk@polidea.com>:

> I think there are quite a few contrib parts that are at least on-par with
> regards to code quality,  testing and especially documentation.
>
> And yes, among those are GCP operators we developed which are not only unit
> but also system-tested and we put quite an effort into making documentation
> really useful and well structured ;).
>
> I'd rather move those "graduated" operators/hooks to core and maybe rename
> the "contrib" folder to "incubator" or something like that to indicate that
> those operators are not yet "core-quality" but aspire to become one. That
> would make a nice "intro" task for new contributors - to improve one of the
> incubating operators to become "core-ready".
>
> I am quite sceptical myself about AIP-8 and separating out the hooks and
> operators. There were already few discussions about that, but splitting the
> operators out might be quite difficult and it will only be possible if
> there is some way to quickly test compatibility of those split operators
> with various versions of Airflow and set of dependent requirements..
> Otherwise it will very quickly become a mess - nobody will know which
> version of Airflow is needed to run which operators and there will be
> problems if someone will try to run different operators with different
> requirements in the same DAG (and different versions of airflow).
>
> Until we manage to isolate operators within the same DAG to potentially use
> different dependencies, this is straight road to dependency-hell.
>
> One solution to that that I have in mind for some time (but this is very
> long term) might be to make Airflow Docker-native and run every operator
> within it's own separate Docker instance with it's own dependencies. That
> would be quite possible to do (we would need to split operators into very
> light "proxy" (basically current  __init__() - the part that is executed
> within DAG scanning) and heavy "execute" parts (where the operator's
> execute-related methods would be run in separate Docker on workers).
>
> J.
>
> On Sat, Apr 13, 2019 at 11:59 AM Felix Uellendall <felix.uellendall@gmx.de
> >
> wrote:
>
> > +1 on deprecating the contrib folder.
> >
> > Bolk de Bruin the reason the core hooks and operators are properly
> > tested because, for example I added some more tests to it and I am
> > "only" a contributor.
> >
> > So do you really want to split up contributors work and core committers
> > work? I personally think this is not the right way to go.
> >
> > It is true that the contrib hooks and operators have not the same level
> > of code quality but we can do something about it. I am trying to improve
> > our test coverage overall and add missing tests.
> >
> > I don't think an extra step is needed here where we first move properly
> > tested ones into the core package and then moving new ones from time to
> > time into it. Wouldn't that mean that we think the code quality of
> > "contrib" (contributor) in general is worse than the code quality of
> > committers? Every new contributor who comes along this project would
> > think that, wouldn't he?
> >
> > -feluelle
> >
> > Am 13/04/2019 um 07:51 schrieb Beau Barker:
> > > A separate airflow-contrib repo, on a separate release cadence would be
> > my preference.
> > >
> > >
> > >> On 12 Apr 2019, at 11:17 pm, Julian De Ruiter <
> > julianderuiter@godatadriven.com> wrote:
> > >>
> > >> Isn’t this in contradiction with AIP-8, which is aimed at removing
> > operators/hooks from the core Airflow package?
> > >>
> > >> Personally I would rather remove hooks/operators from Airflow than add
> > even more to the Airflow core. This counts double for the contrib stuff,
> > which is often poorly designed and/or tested.
> > >>
> > >> Best,
> > >> Julian
> > >>
> > >>> On 12 Apr 2019, at 10:23, Bolke de Bruin <bdbruin@gmail.com>
wrote:
> > >>>
> > >>> That’s perfectly fine to me.
> > >>>
> > >>> Verstuurd vanaf mijn iPad
> > >>>
> > >>>> Op 12 apr. 2019 om 10:20 heeft Kaxil Naik <kaxilnaik@gmail.com>
het
> > volgende geschreven:
> > >>>>
> > >>>> Ok. How about moving the properly tested and maintained hooks/ops
> from
> > >>>> contrib to core?
> > >>>>
> > >>>>> On Fri, Apr 12, 2019, 09:13 Bolke de Bruin <bdbruin@gmail.com>
> > wrote:
> > >>>>>
> > >>>>> I disagree. Core signals “properly tested” and maintained.
Ie. A
> > kind of
> > >>>>> quality.  I don’t think contrib has that.
> > >>>>>
> > >>>>> Verstuurd vanaf mijn iPad
> > >>>>>
> > >>>>>> Op 12 apr. 2019 om 10:03 heeft Kaxil Naik <kaxilnaik@gmail.com>
> het
> > >>>>> volgende geschreven:
> > >>>>>> Contrib folder was used when it was used at Airbnb. Currently,
it
> > doesn't
> > >>>>>> make any sense and we have equal responsibility to maintain
all
> the
> > >>>>> hooks,
> > >>>>>> operators, sensors in contrib folder as we do for core.
> > >>>>>>
> > >>>>>> I would suggest to remove contrib folder and move all hooks,
ops,
> > and
> > >>>>>> sensors to the core folder.
> > >>>>>>
> > >>>>>> Or reorganize the folder structure similar to what was
discussed
> in
> > a
> > >>>>>> mailing thread few months ago.
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> Kaxil
> >
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> E: jarek.potiuk@polidea.com
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message