airflow-commits mailing list archives

From "Bence Nagy (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-163) Running multiple LocalExecutor schedulers makes system load skyrocket
Date Mon, 23 May 2016 09:23:12 GMT
Bence Nagy created AIRFLOW-163:
----------------------------------

             Summary: Running multiple LocalExecutor schedulers makes system load skyrocket
                 Key: AIRFLOW-163
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-163
             Project: Apache Airflow
          Issue Type: Bug
    Affects Versions: Airflow 1.7.1
         Environment: EC2 t2.medium instance, 
Docker `version 1.11.1, build 5604cbe`, 
Host is `Linux ip-172-31-44-140 3.13.0-85-generic #129-Ubuntu SMP Thu Mar 17 20:50:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux`,
Docker containers are built upon the `python:3.5` image, 
LocalExecutor is used with two scheduler containers running
            Reporter: Bence Nagy
            Priority: Minor


I've been told on Gitter that this is expected currently, but thought I'd create an issue
for it anyway.

See this screenshot of a task duration chart; I launched a second scheduler for the 8:50
execution. The orange line represents a PostgresOperator task (i.e. processing happens independently
of Airflow), while the other lines represent data copying tasks that go through a temp file
on the Airflow host: https://i.imgur.com/2tDKgKj.png

While tasks are being processed, I'm seeing a system load of around 4.0-5.0 with one scheduler
running, and 20.0-30.0 with two.
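
For anyone trying to reproduce the comparison, load figures like these can be sampled from Python. The following is only a minimal sketch, assuming a Unix host; the helper name `sample_load` and its parameters are made up for illustration and are not part of Airflow.

```python
import os
import time


def sample_load(samples=3, interval=1.0):
    """Collect the 1-minute load average `samples` times, `interval` seconds apart.

    Hypothetical helper for illustration; relies on os.getloadavg(), which is
    only available on Unix-like systems.
    """
    readings = []
    for _ in range(samples):
        # getloadavg() returns the 1-, 5- and 15-minute load averages.
        one_min, _five_min, _fifteen_min = os.getloadavg()
        readings.append(one_min)
        time.sleep(interval)
    return readings


if __name__ == "__main__":
    for value in sample_load(samples=5, interval=1.0):
        print(f"1-min load: {value:.2f}")
```

Running it once with a single scheduler and once with two, while tasks are executing, should give numbers comparable to the ones quoted above.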

Running {{airflow scheduler --num_runs 3}} under yappi got me these results when ordered by
total time: http://pastebin.com/8TiEG4P3. I still have the raw profiling data; let me know
if another data extract would be useful.
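
As a rough illustration of what "ordered by total time" means, here is a sketch that uses the standard-library `cProfile` and `pstats` modules as a stand-in for yappi (it is not the setup used for the pastebin above). The function names `profile_by_total_time` and `busy_loop` are hypothetical; `busy_loop` merely takes the place of the scheduler run.

```python
import cProfile
import io
import pstats


def profile_by_total_time(func, *args, top=20, **kwargs):
    """Profile func(*args, **kwargs) and return a text report.

    The report is sorted by cumulative time, i.e. time spent in a function
    including everything it calls.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args, **kwargs)
    finally:
        profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


def busy_loop(n=100_000):
    """Hypothetical workload standing in for a scheduler run."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    print(profile_by_total_time(busy_loop, 200_000, top=10))
```

The functions at the top of such a report are the ones where the profiled run spent most of its time, which is the kind of ordering the yappi extract was sorted by.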



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
