airflow-dev mailing list archives

From "ahuynh@symphonyrm.com"<ahu...@symphonyrm.com>
Subject Ignore Processing DAG Definition Python Files for Paused DAGs
Date Mon, 27 Nov 2017 19:11:37 GMT
Hi all,

I wanted to gauge community interest in an idea we have. We are currently running a modified
version of Airflow 1.9 RC3 that skips processing of DAG definition Python files for paused
DAGs. By default, list_py_file_paths traverses the dags subdirectory to look for Python files,
and the scheduler processes all of these files, regardless of whether the DAGs they define
are paused. Our modification queries the fileloc column in the dag table, filtering on
is_paused=1 and is_active=1, to get a list of file paths for paused DAGs. We then exclude
these files from known_file_paths so that the scheduler does not process them. The feature
can be turned on and off via a scheduler config variable.
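
To make the idea concrete, here is a rough, standalone sketch of the filtering step. It is not
our actual patch: it uses an in-memory SQLite table standing in for the dag table, and the
function names and the skip_paused flag (a stand-in for the scheduler config variable) are
made up for illustration. Only fileloc, is_paused, is_active, and known_file_paths come from
the description above.

```python
import sqlite3


def paused_dag_file_paths(conn):
    """Return the set of fileloc values for DAGs that are paused and active."""
    rows = conn.execute(
        "SELECT DISTINCT fileloc FROM dag "
        "WHERE is_paused = 1 AND is_active = 1"
    ).fetchall()
    return {row[0] for row in rows if row[0] is not None}


def filter_known_file_paths(known_file_paths, conn, skip_paused=True):
    """Drop files that define paused DAGs from the list to be processed.

    skip_paused is a hypothetical stand-in for the scheduler config switch.
    """
    if not skip_paused:
        return list(known_file_paths)
    paused = paused_dag_file_paths(conn)
    return [path for path in known_file_paths if path not in paused]


def make_demo_db():
    """Build a toy dag table with only the columns this sketch needs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dag ("
        "dag_id TEXT PRIMARY KEY, fileloc TEXT, "
        "is_paused INTEGER, is_active INTEGER)"
    )
    conn.executemany(
        "INSERT INTO dag VALUES (?, ?, ?, ?)",
        [
            ("running_dag", "/dags/a.py", 0, 1),  # unpaused: keep
            ("paused_dag", "/dags/b.py", 1, 1),   # paused and active: skip
            ("retired_dag", "/dags/c.py", 1, 0),  # paused but inactive: keep
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    # /dags/new.py has no row in the dag table yet, so it is still processed.
    known_file_paths = ["/dags/a.py", "/dags/b.py", "/dags/c.py", "/dags/new.py"]
    print(filter_known_file_paths(known_file_paths, conn))
```

Note that in this literal form, a file that defines both a paused and an unpaused DAG is
skipped entirely, so the sketch assumes one DAG per file.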

If anyone is interested, we already have the code written, so we'd be happy to package up
our changes and create a PR.

Thanks!
-Andy
