airflow-dev mailing list archives

From Arthur Wiedmer <arthur.wied...@gmail.com>
Subject Re: Cloud Provider grouping into Plugins
Date Mon, 17 Oct 2016 03:30:16 GMT
Sounds reasonable to me. +1

We have been meaning to refactor the AWS operators anyway, we could take
advantage of this to reorganize the repo a little bit.

That makes me think that we have not yet decided on a flow to enable work
toward, say, 2.0 vs. 1.8. We might want to start thinking about this.

Best,
Arthur

On Fri, Oct 14, 2016 at 9:58 AM, Chris Riccomini <criccomini@apache.org>
wrote:

> > So I vote for this, but it will have to be done gently to avoid breaking
> > the existing GCP ones.
>
> Same.
>
> On Fri, Oct 14, 2016 at 8:51 AM, Jeremiah Lowin <jlowin@apache.org> wrote:
> > One reason I do like the idea is that especially in contrib, Operators are
> > essentially self-documenting and the first clue is just the file name
> > ('my_gcp_operators.py'). Since we no longer greedily import anything, you
> > have to know exactly what file to import to get the functionality you want.
> > Grouping them provides a gentler way to figure out what file does what
> > ('GCP/storage_operators.py' vs 'GCP/bigquery_operators.py' vs
> > 'docker_operators.py'). Sure, you could do this by enforcing a common name
> > standard ('GCP_storage_operators.py'), but submodules mean you can
> > additionally take advantage of the common infrastructure that Alex
> > referenced. I think if we knew how many contrib modules we would have
> > today, we would have done this at the outset (even though it would have
> > looked like major overkill). Also, the previous import mechanism made
> > importing from submodules really hard; we don't have that issue anymore.
> >
> > So I vote for this, but it will have to be done gently to avoid breaking
> > the existing GCP ones.
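
[Editor's note: the "done gently" path discussed above — keeping old import
locations alive while operators move into grouped modules — could look roughly
like the sketch below. The class and helper names are hypothetical
illustrations, not actual Airflow code.]

```python
import warnings


class GcsStorageOperator:
    """Stand-in for an operator relocated to a grouped GCP module."""

    def __init__(self, bucket):
        self.bucket = bucket


def deprecated_alias(new_cls, old_name):
    """Build a subclass that warns when code still uses the old name."""

    class _Deprecated(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{old_name} is deprecated; import {new_cls.__name__} "
                "from its grouped module instead",
                DeprecationWarning,
                stacklevel=2,
            )
            super().__init__(*args, **kwargs)

    _Deprecated.__name__ = old_name
    return _Deprecated


# The old flat contrib module would keep exporting the old name for a
# release or two, so existing DAG imports keep working:
GCPStorageOperator = deprecated_alias(GcsStorageOperator, "GCPStorageOperator")
```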
> >
> > On Fri, Oct 14, 2016 at 11:29 AM Alex Van Boxel <alex@vanboxel.be> wrote:
> >>
> >> Talking about AWS, it would only make sense if other people would step up
> >> to do it for AWS, and even Azure (or don't we have Azure operators?).
> >>
> >> On Fri, Oct 14, 2016 at 5:25 PM Chris Riccomini <criccomini@apache.org>
> >> wrote:
> >>
> >> > What do others think? I know Sid is a big AWS user.
> >> >
> >> > On Fri, Oct 14, 2016 at 8:24 AM, Chris Riccomini <criccomini@apache.org>
> >> > wrote:
> >> > > Ya, if we go the deprecation route, and let them float around for a
> >> > > release or two, I'm OK with that (or until we bump major to 2.0).
> >> > >
> >> > > Other than that, it sounds like a good opportunity to clean things up.
> >> > > :) I do notice a lot of AWS/GCP code (e.g. the S3 Redshift operator).
> >> > >
> >> > > On Fri, Oct 14, 2016 at 8:16 AM, Alex Van Boxel <alex@vanboxel.be>
> >> > > wrote:
> >> > >> Well, I wouldn't touch the ones that exist (maybe we could mark them
> >> > >> deprecated, but that's all). But I would move (copy) them together and
> >> > >> make them consistent (for example, let them all use the same default
> >> > >> connection_id, ...). For a new user it's quite confusing, I think, due
> >> > >> to different reasons (style, etc.). You know we have an old ticket:
> >> > >> making GCP consistent (I just don't want to start on this one, for
> >> > >> fear of breaking something).
> >> > >>
> >> > >> On Fri, Oct 14, 2016 at 4:59 PM Chris Riccomini <criccomini@apache.org>
> >> > >> wrote:
> >> > >>
> >> > >> Hmm. What advantages would this provide? I'm a little nervous about
> >> > >> breaking compatibility. We have a bunch of DAGs which import all
> >> > >> kinds of GCP hooks and operators. Wouldn't want those to move.
> >> > >>
> >> > >> On Fri, Oct 14, 2016 at 7:54 AM, Alex Van Boxel <alex@vanboxel.be>
> >> > >> wrote:
> >> > >>> Hi all,
> >> > >>>
> >> > >>> I'm starting to write some very exotic Operators that feel a bit
> >> > >>> strange to add to contrib. Examples of this are:
> >> > >>>
> >> > >>> + See if a Compute snapshot of a disc is created
> >> > >>> + See if a string appears on the serial port of Compute instance
> >> > >>>
> >> > >>> but they would be a nice addition if we had a Google Compute plugin
> >> > >>> (or any other cloud provider: AWS, Azure, ...). I'm not talking
> >> > >>> about getting cloud support out of the main source tree. No, I'm
> >> > >>> talking about grouping them together in a consistent part. We can
> >> > >>> even start adding macros, etc. This would be a good opportunity to
> >> > >>> move all the GCP operators together, making them consistent without
> >> > >>> breaking the existing operators that exist in *contrib*.
> >> > >>>
> >> > >>> Here are a few requirements that I can think of:
> >> > >>>
> >> > >>>    - separate folder (example: <airflow>/integration/googlecloud,
> >> > >>>      <airflow>/integration/aws, <airflow>/integration/azure)
> >> > >>>    - enable in config (don't want to load integrations I don't use)
> >> > >>>    - based on Plugin (same interface)
> >> > >>>
> >> > >>> Thoughts?
> >> >
>
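
[Editor's note: the "enable in config" requirement from Alex's list could be
sketched as below. The `load_integrations` helper, the integration names, and
the module paths are all illustrative, not real Airflow settings; stdlib
modules stand in for the hypothetical integration packages.]

```python
import importlib


def load_integrations(enabled, available):
    """Import only the integrations the user enabled in config.

    `available` maps an integration name to its module path, e.g.
    "googlecloud" -> "airflow.integration.googlecloud" (hypothetical).
    Integrations not in `enabled` are never imported, so unused cloud
    providers add no startup cost.
    """
    loaded = {}
    for name, module_path in available.items():
        if name in enabled:
            loaded[name] = importlib.import_module(module_path)
    return loaded


# Stdlib modules stand in for the hypothetical integration packages:
AVAILABLE = {
    "googlecloud": "json",
    "aws": "csv",
    "azure": "xml",
}

# The enabled set would come from airflow.cfg; only GCP loads here.
mods = load_integrations({"googlecloud"}, AVAILABLE)
```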
