airflow-dev mailing list archives

From Alek Storm <alek.st...@gmail.com>
Subject Re: Ignore Processing DAG Definition Python Files for Paused DAGs
Date Mon, 27 Nov 2017 19:23:23 GMT
What's the advantage of this change? Performance?

Alek

On Mon, Nov 27, 2017 at 1:11 PM, ahuynh@symphonyrm.com <
ahuynh@symphonyrm.com> wrote:

> Hi all,
>
> I wanted to gauge community interest in this idea we have. We are
> currently running a modified version of Airflow 1.9 RC3 where we ignore
> processing DAG definition Python files for paused DAGs. By default,
> list_py_file_paths traverses the dags subdirectory to look for Python
> files, and the scheduler processes all these files, regardless of whether
> the DAGs defined in these files are paused or not. Our proposed
> modification is to query the fileloc column in the dag table, filtering
> on is_paused=1 and is_active=1, to get a list of file paths for paused DAGs.
> We can then exclude these files from known_file_paths, so that the
> scheduler does not process them. This feature can be turned on and off
> via a scheduler config variable.
>
> If anyone is interested, we already have the code written, so we'd be
> happy to package up our changes and create a PR.
>
> Thanks!
> -Andy
>
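
For readers of the archive, the filtering described in the quoted message
can be sketched roughly as below. This is not the patch Andy refers to and
does not use Airflow's own code: it assumes only a dag table with fileloc,
is_paused and is_active columns (as named in the message) and uses an
in-memory SQLite database as a stand-in for the metadata database. The
function names and the skip_paused flag are hypothetical.

```python
import sqlite3


def paused_dag_file_paths(conn):
    """Return the set of fileloc values for DAGs that are paused and active."""
    rows = conn.execute(
        "SELECT DISTINCT fileloc FROM dag "
        "WHERE is_paused = 1 AND is_active = 1"
    ).fetchall()
    return {row[0] for row in rows if row[0] is not None}


def filter_known_file_paths(known_file_paths, paused_paths, skip_paused=True):
    """Drop files that belong to paused DAGs.

    skip_paused is a hypothetical stand-in for the scheduler config
    variable mentioned in the message; when False, nothing is filtered.
    """
    if not skip_paused:
        return list(known_file_paths)
    return [path for path in known_file_paths if path not in paused_paths]


# Stand-in for the metadata database (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag ("
    "dag_id TEXT PRIMARY KEY, fileloc TEXT, "
    "is_paused INTEGER, is_active INTEGER)"
)
conn.executemany(
    "INSERT INTO dag VALUES (?, ?, ?, ?)",
    [
        ("etl_daily", "/dags/etl_daily.py", 0, 1),    # running
        ("old_report", "/dags/old_report.py", 1, 1),  # paused and active
        ("retired", "/dags/retired.py", 1, 0),        # paused but inactive
    ],
)

# Stand-in for the paths found by walking the dags directory.
known_file_paths = [
    "/dags/etl_daily.py",
    "/dags/old_report.py",
    "/dags/retired.py",
    "/dags/new_dag.py",  # not yet in the dag table, so it is kept
]

to_process = filter_known_file_paths(
    known_file_paths, paused_dag_file_paths(conn)
)
print(to_process)
# ['/dags/etl_daily.py', '/dags/retired.py', '/dags/new_dag.py']
```

Read literally, this skips a file as soon as any active DAG defined in it is
paused, so a file that defines both paused and unpaused DAGs would also be
skipped. Whether that is the intended behaviour is not stated in the message.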
