airflow-dev mailing list archives

From Alek Storm <>
Subject Re: Ignore Processing DAG Definition Python Files for Paused DAGs
Date Mon, 27 Nov 2017 19:23:23 GMT
What's the advantage of this change? Performance?


On Mon, Nov 27, 2017 at 1:11 PM, <> wrote:

> Hi all,
> I wanted to gauge community interest in this idea we have. We are
> currently running a modified version of Airflow 1.9 RC3 where we ignore
> processing DAG definition Python files for paused DAGs. By default,
> list_py_file_paths traverses the dags subdirectory to look for Python
> files, and the scheduler processes all these files, regardless of whether
> the DAGs defined in these files are paused or not. Our proposed
> modification is to query the fileloc column in the dag table, filtering
> on is_paused=1 and is_active=1, to get a list of file paths for paused
> DAGs. We can then exclude these files from known_file_paths so that the
> scheduler does not process them. This feature can be turned on and off
> via a scheduler config variable.
> If anyone is interested, we already have the code written, so we'd be
> happy to package up our changes and create a PR.
> Thanks!
> -Andy
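
For readers of the archive, the filtering described in the quoted message can be sketched roughly as follows. This is only an illustration, not the code Andy's team wrote: the helper names (paused_dag_filelocs, filter_known_file_paths), the skip_paused flag, and the in-memory SQLite table standing in for Airflow's metadata database are all assumptions. Only the dag table and its fileloc, is_paused, and is_active columns come from the message.

```python
import sqlite3


def paused_dag_filelocs(conn):
    """Return the set of file paths recorded for paused, active DAGs.

    Mirrors the query described in the message: select fileloc from the
    dag table where is_paused=1 and is_active=1.
    """
    rows = conn.execute(
        "SELECT DISTINCT fileloc FROM dag "
        "WHERE is_paused = 1 AND is_active = 1"
    ).fetchall()
    return {row[0] for row in rows if row[0] is not None}


def filter_known_file_paths(known_file_paths, paused_paths, skip_paused=True):
    """Drop paused-DAG files from the list the scheduler would process.

    skip_paused is a hypothetical stand-in for the scheduler config
    variable mentioned in the message.
    """
    if not skip_paused:
        return list(known_file_paths)
    return [path for path in known_file_paths if path not in paused_paths]


def demo():
    # Hypothetical stand-in for the metadata database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag ("
        "dag_id TEXT PRIMARY KEY, is_paused INTEGER, "
        "is_active INTEGER, fileloc TEXT)"
    )
    conn.executemany(
        "INSERT INTO dag VALUES (?, ?, ?, ?)",
        [
            ("a", 1, 1, "/dags/a.py"),  # paused and active: skipped
            ("b", 0, 1, "/dags/b.py"),  # running: kept
            ("c", 1, 0, "/dags/c.py"),  # paused but inactive: kept
        ],
    )
    # Stand-in for the paths that list_py_file_paths would return.
    known = ["/dags/a.py", "/dags/b.py", "/dags/c.py", "/dags/new.py"]
    return filter_known_file_paths(known, paused_dag_filelocs(conn))


if __name__ == "__main__":
    print(demo())
```

Note that a file defining several DAGs would be skipped here as soon as any one of them is paused; whether that is the intended behavior is not stated in the message.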
