airflow-commits mailing list archives

From "Fokko Driesprong (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (AIRFLOW-1729) Ignore whole directories in .airflowignore
Date Wed, 28 Mar 2018 21:18:00 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fokko Driesprong resolved AIRFLOW-1729.
---------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.0

Issue resolved by pull request #3171
[https://github.com/apache/incubator-airflow/pull/3171]

> Ignore whole directories in .airflowignore
> ------------------------------------------
>
>                 Key: AIRFLOW-1729
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1729
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: Airflow 2.0
>            Reporter: Cedric Hourcade
>            Assignee: Kamil Sambor
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> The .airflowignore file prevents files from being scanned for DAGs. But even if a whole directory is blacklisted, {{os.walk}} still descends into it, however deep it is, and skips its files one by one, which can be an issue when you keep big .git or virtualenv directories around.
> I suggest adding something like:
> {code}
> dirs[:] = [d for d in dirs if not any([re.findall(p, os.path.join(root, d)) for p in patterns])]
> {code}
> to prune the directories here: https://github.com/apache/incubator-airflow/blob/cfc2f73c445074e1e09d6ef6a056cd2b33a945da/airflow/utils/dag_processing.py#L208-L209 and in {{list_py_file_paths}}
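
The pruning idea in the description can be sketched as a small standalone function. This is only an illustration of the technique, not the code merged in pull request #3171: the function name list_py_files_pruned and its parameters are hypothetical, and it assumes the ignore patterns are regular expressions matched against the joined path, as in the suggested snippet. It relies on the documented behavior of os.walk that, with topdown=True, modifying the dirs list in place controls which subdirectories are visited.

```python
import os
import re
from typing import Iterable, List


def list_py_files_pruned(top: str, patterns: Iterable[str]) -> List[str]:
    """Return .py files under `top`, never descending into ignored directories.

    Hypothetical helper for illustration; `patterns` are regular expressions
    searched against each joined path, mirroring the snippet in the issue.
    """
    compiled = [re.compile(p) for p in patterns]
    found = []
    # topdown=True is required: only then does editing `dirs` affect the walk.
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment mutates the list os.walk holds, so ignored
        # directories are pruned instead of being walked and filtered later.
        dirs[:] = [
            d for d in dirs
            if not any(rx.search(os.path.join(root, d)) for rx in compiled)
        ]
        for name in files:
            path = os.path.join(root, name)
            # Individual files are still checked against the same patterns.
            if name.endswith(".py") and not any(rx.search(path) for rx in compiled):
                found.append(path)
    return sorted(found)
```

Rebinding the name instead (dirs = [...]) would have no effect on the traversal, which is why the suggestion uses dirs[:].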



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
