airflow-dev mailing list archives

From Jarek Potiuk <Jarek.Pot...@polidea.com>
Subject Re: Outage report
Date Mon, 19 Aug 2019 23:59:49 GMT
Fantastic reading. I love this kind of detailed analysis of real-life
problems :).

I am myself guilty of some of the hook instantiations in the constructors
of some of the operators :(. When I see such problems I always think: "What
kind of system improvement can we make to avoid such problems in the
future?"
I thought that we might want to do some ... yes ... linting, or more
precisely use https://github.com/davidfraser/pyan to analyse call graphs
in Airflow and detect such problems in (yes, you guessed it) a pre-commit
hook.
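
To make the constructor problem concrete, here is a minimal sketch using hypothetical stand-in classes (FakeHook, EagerOperator and LazyOperator are made up for illustration and are not Airflow classes). It assumes that operators are instantiated whenever a DAG file is parsed, so anything done in the constructor runs at parse time, while a hook created lazily only exists once the task actually needs it:

```python
class FakeHook:
    """Hypothetical stand-in for a hook; counts how often it is constructed."""

    created = 0

    def __init__(self, conn_id):
        # A real hook might look up connection details around here.
        FakeHook.created += 1
        self.conn_id = conn_id


class EagerOperator:
    """Problematic shape: the hook is built as soon as the operator is."""

    def __init__(self, conn_id):
        self.hook = FakeHook(conn_id)


class LazyOperator:
    """Deferred shape: the constructor only stores the connection id."""

    def __init__(self, conn_id):
        self.conn_id = conn_id
        self._hook = None

    @property
    def hook(self):
        # Create the hook on first access and reuse it afterwards.
        if self._hook is None:
            self._hook = FakeHook(self.conn_id)
        return self._hook

    def execute(self, context=None):
        return self.hook.conn_id
```

Constructing LazyOperator creates no hook at all; the first access to its hook property (here, inside execute) does.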

I think it should be rather easy to look at all the operators and check
that none of the classes in the hooks packages are instantiated in the
__init__() methods of the operators.
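
As a rough illustration of such a check, here is a sketch that uses the standard-library ast module rather than pyan. It is only a name-based heuristic: it flags calls inside __init__ to anything whose name ends in "Hook", and does not resolve whether the class really lives in a hooks package (a call-graph tool would be needed for that). The function name find_hook_calls_in_init and the hook_suffix parameter are assumptions made for this example, not an existing tool:

```python
import ast


def find_hook_calls_in_init(source, hook_suffix="Hook"):
    """Return (class_name, called_name, lineno) for hook-like calls in __init__."""
    findings = []
    tree = ast.parse(source)
    for cls in ast.walk(tree):
        if not isinstance(cls, ast.ClassDef):
            continue
        for item in cls.body:
            # Only look at the constructor of each class.
            if not (isinstance(item, ast.FunctionDef) and item.name == "__init__"):
                continue
            for node in ast.walk(item):
                if not isinstance(node, ast.Call):
                    continue
                func = node.func
                if isinstance(func, ast.Name):
                    name = func.id        # e.g. SomeHook(...)
                elif isinstance(func, ast.Attribute):
                    name = func.attr      # e.g. hooks.SomeHook(...)
                else:
                    continue
                if name.endswith(hook_suffix):
                    findings.append((cls.name, name, node.lineno))
    return findings


# Hypothetical operator source used only to exercise the check.
SAMPLE = '''
class BadOperator:
    def __init__(self, conn_id):
        self.hook = SomeHook(conn_id)

class GoodOperator:
    def __init__(self, conn_id):
        self.conn_id = conn_id

    def execute(self, context):
        return SomeHook(self.conn_id)
'''

print(find_hook_calls_in_init(SAMPLE))
```

A pre-commit hook could run something like this over the operator modules and fail when the returned list is non-empty.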

I am currently on vacation, so I have no time to take a serious look at it
or build a POC, but maybe someone could take a look and see if we can have
something like that in place :)

J,

On Sun, Aug 18, 2019 at 4:17 AM Kamil Breguła <kamil.bregula@polidea.com>
wrote:

> Hi
>
> This problem also exists in GCP operators. I noticed it a long time ago
> and I want to solve it:
> https://issues.apache.org/jira/browse/AIRFLOW-4771
> This problem limits the use of Airflow in a multi-tenant environment,
> because the scheduler connects to the connection table.
>
> Greets
>
> On Sat, Aug 17, 2019, 11:17 AM Bas Harenslak <
> basharenslak@godatadriven.com>
> wrote:
>
> > Nice work! Always love reading this sort of “bug report from hell” and
> > the work required to find the cause.
> >
> > Also strongly agree we should standardize hooks in some way.
> >
> > Cheers,
> > Bas
> >
> > > On 16 Aug 2019, at 17:52, Shaw, Damian P. <
> > damian.shaw.2@credit-suisse.com> wrote:
> > >
> > > Thanks, this is really useful to know! I often write my own
> > > Operators/Sensors/Hooks and was just looking at doing the same with the
> > > SFTPSensor and Operator.
> > >
> > > I've never formalized it, but my current pattern is the following:
> > >
> > > Hooks:
> > > Set self._conn to None in __init__, and have a property "self.conn"
> > > that checks whether "self._conn" is None:
> > > * if None, create a new connection, set it to self._conn, and return it
> > > * if not None, run a check to see if the connection is still alive; if
> > >   it is alive return self._conn, otherwise create a new connection
> > >
> > > Sensors/Operators:
> > > In __init__, set self.conn_id to the conn_id string, set
> > > "self._{conn_type}_hook" to None, and have a property
> > > "self.{conn_type}_hook". In the property, check whether
> > > "self._{conn_type}_hook" is None and if so create a new Hook; if not
> > > None, return "self._{conn_type}_hook".
> > >
> > > I would really appreciate any best practices here that others could
> > > share.
> > >
> > >
> > > -----Original Message-----
> > > From: James Meickle [mailto:jmeickle@quantopian.com.INVALID]
> > > Sent: Friday, August 16, 2019 11:27 AM
> > > To: dev@airflow.apache.org
> > > Subject: Outage report
> > >
> > > We had an outage last night that was rather complex and difficult to
> > > debug. Rather than just writing up the bug, I included what we did for
> > > various debug steps. Hope some folks who are also cluster maintainers
> > > may find it interesting!
> > >
> > > https://issues.apache.org/jira/browse/AIRFLOW-5238


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
