airflow-dev mailing list archives

From Maxime Beauchemin <maximebeauche...@gmail.com>
Subject Re: AIP-12 Persist DAG into DB
Date Fri, 01 Feb 2019 06:12:35 GMT
Right, it's been discussed extensively in the past, and the main thing
needed to get to a "stateless web server" (or at least a DagBag-free web
server) is to drop template rendering in the UI. We might also need small
workarounds (we'd have to dig in to check) around deleting task instances
or force-running tasks, but nothing major, I think.

Also, the scheduler (think of it as a "supervisor", as this specific
workload has nothing to do with scheduling) would need to serialize the
DAGs periodically, likely to the database, so that the web server can read
freshly serialized metadata from the database while handling web
requests.
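
To make the idea concrete, here is a minimal sketch, not Airflow code. It
assumes a hypothetical `serialized_dag` table and hypothetical helpers
(`serialize_dag`, `persist_dag`, `load_dag`), and it uses SQLite from the
Python standard library as a stand-in for the metadata database. A
supervisor-like process would call `persist_dag` periodically; the web
server would only call `load_dag` and never parse DAG files itself.

```python
import json
import sqlite3
import time


def serialize_dag(dag_id, tasks, edges):
    """Render a (hypothetical) DAG description as a deterministic JSON string."""
    return json.dumps(
        {"dag_id": dag_id, "tasks": sorted(tasks), "edges": sorted(edges)},
        sort_keys=True,
    )


def init_db(conn):
    """Create the illustrative table; the name and columns are assumptions."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS serialized_dag ("
        "dag_id TEXT PRIMARY KEY, data TEXT NOT NULL, updated_at REAL NOT NULL)"
    )
    conn.commit()


def persist_dag(conn, dag_id, tasks, edges):
    """Supervisor side: write or overwrite the serialized form of one DAG."""
    conn.execute(
        "INSERT OR REPLACE INTO serialized_dag (dag_id, data, updated_at) "
        "VALUES (?, ?, ?)",
        (dag_id, serialize_dag(dag_id, tasks, edges), time.time()),
    )
    conn.commit()


def load_dag(conn, dag_id):
    """Web server side: read DAG metadata from the database only."""
    row = conn.execute(
        "SELECT data FROM serialized_dag WHERE dag_id = ?", (dag_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

For example, after `persist_dag(conn, "example", ["extract", "load"],
[("extract", "load")])`, a later `load_dag(conn, "example")` returns a plain
dict, so rendering a graph view would need neither a DagBag nor the DAG file.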

Max

On Thu, Jan 31, 2019 at 9:28 AM Dan Davydov <ddavydov@twitter.com.invalid>
wrote:

> Agreed on complexities (I think deprecating Jinja templates for webserver
> rendering is one thing), but I'm not sure I understand the "falling down
> on code changes" part; mind providing an example?
>
> On Thu, Jan 31, 2019 at 12:22 PM Ash Berlin-Taylor <ash@apache.org> wrote:
>
> > That sounds like a good idea at first, but falls down with possible code
> > changes in operators between one task and the next.
> >
> > (I would like this, but there are definite complexities)
> >
> > -ash
> >
> >
> > On 31 January 2019 16:56:54 GMT, Dan Davydov <ddavydov@twitter.com.INVALID> wrote:
> > >I feel the right higher-level solution to this problem (which is "Adding
> > >Consistency to Airflow") is DAG serialization, that is all DAGs should be
> > >represented as e.g. JSON (similar to the current SimpleDAGBag object used
> > >by the Scheduler). This solves the webserver issue, and also adds
> > >consistency between Scheduler/Workers (all DAGruns can be ensured to run at
> > >the same version of a DAG instead of whatever happens to live on the worker
> > >at the time).
> > >
> > >On Thu, Jan 31, 2019 at 9:44 AM Peter van ‘t Hof <
> > >petervanthof@godatadriven.com> wrote:
> > >
> > >> Hi All,
> > >>
> > >> As most of you know, Airflow has an issue when loading new DAGs where
> > >> the webserver sometimes sees them and sometimes does not.
> > >> Because of this, we wrote this AIP to solve the issue:
> > >>
> > >> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-12+Persist+DAG+into+DB
> > >>
> > >> Any feedback is welcome.
> > >>
> > >> Gr,
> > >> Peter van 't Hof
> > >> Big Data Engineer
> > >>
> > >> GoDataDriven
> > >> Wibautstraat 202
> > >> 1091 GS Amsterdam
> > >> https://godatadriven.com
> > >>
> > >>
> >
>
