airflow-commits mailing list archives

From "Gary Harpaz (Jira)" <j...@apache.org>
Subject [jira] [Updated] (AIRFLOW-5818) Very bad webserver performance when defining many dags with many operators
Date Sun, 03 Nov 2019 15:44:00 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Harpaz updated AIRFLOW-5818:
---------------------------------
    Attachment: Screenshot from 2019-11-03 17-39-43.png

> Very bad webserver performance when defining many dags with many operators
> --------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5818
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5818
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: webserver
>    Affects Versions: 1.9.0, 1.10.0, 1.10.1, 1.10.2, 1.10.3, 1.10.4, 1.10.5
>            Reporter: Gary Harpaz
>            Priority: Blocker
>         Attachments: Screenshot from 2019-11-03 17-39-43.png, dup_dags.py, my_dag.template
>
>
> In my scenario I have defined 500 DAGs, each with approximately 1500 operators.
> This makes the webserver impossible to work with, even when all DAGs are paused and
> nothing is running. CPU usage spikes constantly and the webserver consumes huge
> amounts of memory for no reason.
> To reproduce this, use the attached my_dag.template file and duplicate it using the
> attached dup_dags.py script.
>  
> The root cause of this issue is that the DagBag loads all DAGs into memory, which
> consumes large amounts of CPU and memory unnecessarily.
> I have already fixed this in:
> [https://github.com/gary-harpaz/airflow/tree/improve-performance]
>  
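
The reproduction steps quoted above rely on two attachments (my_dag.template and dup_dags.py) whose contents are not included in this message. As a rough sketch of what such a duplication script might look like, the following generates many DAG files from a single template. Everything here is an assumption for illustration: the template text, the `$dag_id` and `$num_tasks` placeholders, the `generate_dag_files` helper, and the file naming are hypothetical and are not taken from the reporter's attachments. At the scale described (500 DAGs of about 1500 operators each), the generated files would define roughly 750,000 operator objects in total.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical stand-in for my_dag.template; the real attachment is not shown
# in this message. The generated files import Airflow 1.10-style modules, but
# this script itself does not need Airflow installed in order to run.
DAG_TEMPLATE = Template('''\
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

dag = DAG("$dag_id", start_date=datetime(2019, 1, 1), schedule_interval=None)
for i in range($num_tasks):
    DummyOperator(task_id="task_%d" % i, dag=dag)
''')


def generate_dag_files(out_dir, num_dags=500, num_tasks=1500):
    """Write num_dags DAG definition files into out_dir and return their paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in range(num_dags):
        dag_id = f"generated_dag_{n:04d}"
        path = out / f"{dag_id}.py"
        # Each file defines one DAG with a unique id and num_tasks operators.
        path.write_text(DAG_TEMPLATE.substitute(dag_id=dag_id, num_tasks=num_tasks))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Small demo run into a temporary directory; point out_dir at the Airflow
    # DAGs folder and use the default sizes to approximate the reported load.
    with tempfile.TemporaryDirectory() as tmp:
        files = generate_dag_files(tmp, num_dags=3, num_tasks=5)
        print(len(files), "files written")
```

Writing the output into the configured DAGs folder and then opening the webserver should approximate the scenario in the report, although the actual attachments may build their DAGs differently.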



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
