airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jarek Potiuk <Jarek.Pot...@polidea.com>
Subject Re: [2.0 spring cleaning] Deprecate adding Operators and Hooks via plugins?
Date Sun, 14 Apr 2019 03:13:52 GMT
+1 for using more entrypoints as discovery mechanism. Maybe we should also
start to use the 'console_scripts' entrypoint group for "airflow" binary
instead of copying the binary airflow script (
https://python-packaging.readthedocs.io/en/latest/command-line-scripts.html#the-console-scripts-entry-point
) as well as for plugginable extensions.

J.

On Sat, Apr 13, 2019 at 11:15 PM Maxime Beauchemin <
maximebeauchemin@gmail.com> wrote:

> I'd say it could be great to deprecate the whole plugin system and use
> Python's "entry points" instead. I just didn't know that was an option and
> the standard way to do this when I originally wrote it... The current
> plugin system is a minefield for circular dependencies...
>
> On Sat, Apr 13, 2019 at 10:50 AM Jarek Potiuk <Jarek.Potiuk@polidea.com>
> wrote:
>
> > One comment here - The current connections.py file screams to get dynamic
> > discovery similar to what sqlalchemy does for dialects (for example).
> >
> > But on the other hand, I am not sure if this "registration" is really
> > needed. Maybe I am totally wrong, but somehow I don't find it
> particularly
> > useful to get hook by connection_type.
> >
> > I believe connection_type is useful for the UI to be able to select the
> > connection when you create it and have different UI fields defined by
> > connection. But nearly universally in other places, when operator is
> > instantiated, it also instantiates the hook it expects to use, rather
> than
> > rely on get_hook() or parse_from_uri() of the connection. Each operator
> > knows what connection type it needs.
> >
> > I checked it quickly and it seems the "get by connection type" or
> > parse_from_uri functionality is almost exclusively used by tests.
> >
> > Again - maybe I am completely wrong but maybe we should rethink the whole
> > "connection" class and get rid of the need of "registering" new
> connection
> > types at all (or move the relevant stuff to be UI only).
> >
> >
> >
> > On Fri, Apr 12, 2019 at 6:07 PM Daniel Imberman <
> daniel.imberman@gmail.com
> > >
> > wrote:
> >
> > > I think that's the important distinction. This wouldn't be removing
> > custom
> > > hooks and operators, this is just removing the python magic that has
> you
> > > placing them in a plugins folder. I can see the value in having the
> DAGs
> > > being real-time parsed as users might want to make small changes to
> > those,
> > > but operators/hooks should be fairly static once created.
> > >
> > > On Fri, Apr 12, 2019 at 8:53 AM Ash Berlin-Taylor <ash@apache.org>
> > wrote:
> > >
> > > > Can you give an example?
> > > >
> > > > We can still have custom hooks/operators per install, I'm just saying
> > > they
> > > > don't need to be implemented/"registered" with airflow.plugin -- all
> > that
> > > > provides is another way of having that be imported.
> > > >
> > > > -ash
> > > >
> > > > > On 12 Apr 2019, at 16:21, Chen Tong <cixuuz@gmail.com> wrote:
> > > > >
> > > > > IMHO, it's worth to have such dynamic hooks adding ability. I met
> > > issues
> > > > > before that the current hook cannot satisfy the needs. I have to
> > write
> > > my
> > > > > own hook and hacked Connections. If hook is not coupled tightly
> with
> > > > > Connections, it will be easier by switching to another python
> > package.
> > > > >
> > > > > On Fri, Apr 12, 2019 at 11:14 AM Daniel Imberman <
> > > > daniel.imberman@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> I am all for this. The only complexity is ensuring the
> operator/hook
> > > > >> exists in the workers too, but overall highly for.
> > > > >> On Apr 12, 2019, 7:37 AM -0700, James Meickle <
> > > jmeickle@quantopian.com
> > > > .invalid>,
> > > > >> wrote:
> > > > >>> YES - I strongly agree with this! I first did it this way
> because I
> > > > >> wanted
> > > > >>> to follow the instructions, assuming there was some Airflow
> magic,
> > > and
> > > > >>> later found it really frustrating to maintain. We should
be clear
> > > that
> > > > >>> standard Python packaging is the way to go.
> > > > >>>
> > > > >>> That being said, what if Airflow had an official Cookiecutter
> > > template
> > > > >> for
> > > > >>> Airflow operator packages, and suggested that as a way to
manage
> > your
> > > > >>> operators in the docs? That would probably help people who
aren't
> > as
> > > > >>> familiar with Python, and over time might increase quality
of
> third
> > > > party
> > > > >>> operators (if it ships by default with linting/etc.)
> > > > >>>
> > > > >>> On Fri, Apr 12, 2019 at 3:54 AM Ash Berlin-Taylor <
> ash@apache.org>
> > > > >> wrote:
> > > > >>>
> > > > >>>> This is something I've said a number of times (but possibly
> never
> > on
> > > > >> the
> > > > >>>> list):
> > > > >>>>
> > > > >>>> I think we should deprecate adding Operators and Hooks
via the
> > > Airflow
> > > > >>>> plugin mechanism.
> > > > >>>>
> > > > >>>> I think plugins should be reserved for any mechanism
that a
> > plain-ol
> > > > >>>> python module import won't work for (which is basically
anything
> > > that
> > > > >> needs
> > > > >>>> to tie deeply in to the Webserver or Scheduler process).
> > > > >>>>
> > > > >>>> To that endI think we should deprecate adding operators
via
> > plugins:
> > > > >>>>
> > > > >>>> from airflow.operators.my_plugin import MyOperator
> > > > >>>>
> > > > >>>> can become
> > > > >>>>
> > > > >>>> from my_plugin import MyOperator
> > > > >>>>
> > > > >>>> with no impact on functionality.
> > > > >>>>
> > > > >>>> The same could be said for hooks, with one possible feature
> > > addition:
> > > > >>>> Right now the list of connection types in the Connections
screen
> > is
> > > > >>>> hard-coded. Is it possible worth making that dynamic
somehow
> based
> > > on
> > > > >> the
> > > > >>>> Hook classes that are loaded? i.e. adding the ability
for hooks
> > > added
> > > > >> in
> > > > >>>> plguins to add items to that list?
> > > > >>>>
> > > > >>>> -ash
> > > > >>
> > > >
> > > >
> > >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> > E: jarek.potiuk@polidea.com
> >
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
E: jarek.potiuk@polidea.com

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message