airflow-commits mailing list archives

From "Paul Yang (JIRA)" <>
Subject [jira] [Created] (AIRFLOW-160) Parse DAG files through child processes
Date Mon, 23 May 2016 00:58:12 GMT
Paul Yang created AIRFLOW-160:

             Summary: Parse DAG files through child processes
                 Key: AIRFLOW-160
             Project: Apache Airflow
          Issue Type: Improvement
          Components: scheduler
            Reporter: Paul Yang
            Assignee: Paul Yang

Currently, the Airflow scheduler parses all user DAG files in the same process as the scheduler
itself. We've seen issues in production where bad DAG files cause the scheduler to fail. A simple
example: if a user script calls `sys.exit(1)`, the scheduler exits as well. We've
also seen an unusual case where modules loaded by a user DAG affected the operation of the scheduler.
For better uptime, the scheduler should be resilient to these problematic user DAGs.
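
To make the `sys.exit(1)` failure mode concrete, here is a minimal sketch. It assumes a
hypothetical in-process loader, `parse_in_process`, that guards against ordinary errors with
`except Exception`; it is not Airflow's actual loading code. Because `SystemExit` derives
from `BaseException`, such a guard does not contain it.

```python
def parse_in_process(source: str) -> bool:
    """Hypothetical in-process loader: True if the DAG source loaded cleanly."""
    try:
        exec(compile(source, "<user_dag>", "exec"), {"__name__": "user_dag"})
        return True
    except Exception:
        # Ordinary errors (ImportError, SyntaxError, ValueError, ...) stop here.
        return False


# Stand-in for a user DAG file that exits at import time.
BAD_DAG = "import sys\nsys.exit(1)\n"

try:
    parse_in_process(BAD_DAG)
    survived = True
except SystemExit as exc:
    # SystemExit is a BaseException, so the guard above does not catch it.
    # Without this outer handler, the hosting process would terminate.
    survived = False
    exit_code = exc.code

print("loader survived:", survived)
```

The outer `try` exists only so the demonstration itself keeps running; a long-lived process
without it would simply exit.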

The proposed solution is to parse and schedule user DAGs through child processes. This way,
the main scheduler process is more isolated from bad DAGs. There's a side benefit as well:
since parsing is distributed among multiple processes, it's possible to parse the DAG files
more frequently, reducing the latency between when a DAG is modified and when the changes
are picked up.
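
The isolation idea can be sketched with the standard `subprocess` module. This is only an
illustration under assumptions, not the proposed implementation: the file names, the
`DAG_SOURCES` mapping, and `parse_all_in_children` are hypothetical, and each "DAG file" is
an inline string run in its own interpreter.

```python
import subprocess
import sys

# Hypothetical DAG files, inlined as source strings for the sketch.
DAG_SOURCES = {
    "good_dag.py": "x = 1\n",
    "bad_dag.py": "import sys\nsys.exit(1)\n",
}


def parse_all_in_children(sources, timeout=30.0):
    """Run each source in its own child interpreter and return the exit codes."""
    # Start every child first so the files are processed concurrently.
    children = {
        name: subprocess.Popen([sys.executable, "-c", source])
        for name, source in sources.items()
    }
    # A non-zero exit code marks a file that failed; the parent is unaffected.
    return {name: child.wait(timeout=timeout) for name, child in children.items()}


results = parse_all_in_children(DAG_SOURCES)
print(results)
print("parent process is still running")
```

Here the bad file only terminates its own child, and the parent sees the failure as an exit
code it can log or retry.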

Another issue right now is that all DAGs must be scheduled before any tasks are sent to the
executor. This means that the frequency of task scheduling is limited by the slowest DAG to
schedule. The changes needed for scheduling DAGs through child processes will also make it
easy to decouple these two steps and allow tasks to be scheduled and sent to the executor in
a more independent fashion. This way, overall scheduling won't be held back by a slow DAG.
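
The decoupling described above might look roughly like the following sketch, which hands
tasks over as soon as each DAG finishes scheduling rather than waiting for all of them. All
names and delays are hypothetical, and a thread pool stands in for the child processes to
keep the example short.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical per-DAG scheduling cost in seconds.
DAG_DELAYS = {"slow_dag": 0.5, "fast_dag_a": 0.05, "fast_dag_b": 0.1}


def schedule_dag(dag_id, delay):
    """Pretend to schedule one DAG and return the tasks that are ready to run."""
    time.sleep(delay)
    return [f"{dag_id}.task"]


def run_decoupled(dag_delays):
    """Collect tasks in the order their DAGs finish scheduling."""
    dispatched = []
    with ThreadPoolExecutor(max_workers=len(dag_delays)) as pool:
        futures = [
            pool.submit(schedule_dag, dag_id, delay)
            for dag_id, delay in dag_delays.items()
        ]
        # as_completed yields each result as it becomes available, so tasks
        # from fast DAGs are handed over without waiting for the slow one.
        for future in as_completed(futures):
            dispatched.extend(future.result())
    return dispatched


dispatched = run_decoupled(DAG_DELAYS)
print(dispatched)
```

In a real scheduler the `dispatched.extend(...)` step would be the hand-off to the executor.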
