airflow-dev mailing list archives

From 刘松(Brain++组) <>
Subject Re: About the DAG discovering not synced between scheduler and webserver
Date Sun, 13 May 2018 05:39:40 GMT

It seems that Airflow currently handles the two situations below:

-  DAGs discovered by the scheduler, but not yet discovered by the webserver

-  DAGs discovered by the webserver, but not yet discovered by the scheduler
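
To make the two cases concrete, here is a minimal sketch that treats each side's view as a plain set of dag_ids and compares them. It does not use Airflow itself: the function name compare_dag_views and the sample dag_ids are hypothetical, and the assumption that "discovered by the scheduler" means "marked active in the metadata DB" is taken from the warning message quoted further down.

```python
# Hypothetical illustration only: plain sets of dag_ids stand in for the
# webserver's DagBag and for the DAGs marked active in the metadata DB.
def compare_dag_views(webserver_dagbag_ids, metadb_active_ids):
    """Group dag_ids by which side currently knows about them."""
    web = set(webserver_dagbag_ids)
    db = set(metadb_active_ids)
    return {
        # Active in the DB (scheduler saw it), missing from the webserver.
        "scheduler_only": sorted(db - web),
        # Loaded by the webserver, not yet marked active by the scheduler.
        "webserver_only": sorted(web - db),
        # Known to both sides, so no mismatch.
        "both": sorted(web & db),
    }


views = compare_dag_views(
    webserver_dagbag_ids={"etl_daily", "report_weekly"},
    metadb_active_ids={"etl_daily", "new_dag"},
)
print(views["scheduler_only"])
print(views["webserver_only"])
```

In this sketch, "new_dag" would be the kind of DAG that triggers the "isn't available in the web server's DagBag object" message.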

I still don't quite understand why the scheduler and the webserver each have their own
discovery logic. As I understand it, the webserver only needs to display the orm_dags from
the metadata DB. Is there any other requirement or design consideration behind this?

Many thanks for any information.



From: Song Liu <>
Sent: Saturday, May 12, 2018 7:58:43 PM
Subject: About the DAG discovering not synced between scheduler and webserver


When adding a new DAG, we sometimes see:

This DAG isn't available in the web server's DagBag object. It shows up in this list because
the scheduler marked it as active in the metadata database.

In the, it will collect DAGs under "DAGS_FOLDER" by instantiating a DagBag object as

dagbag = models.DagBag(settings.DAGS_FOLDER)

So the webserver depends on its own timing to collect DAGs. But why not simply query the
metadata DB? If a DAG is active in the DB now, it could be visible in the web at the
Could someone share the reasoning behind this design?
