airflow-commits mailing list archives

From "Iuliia Volkova (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-163) Running multiple LocalExecutor schedulers makes system load skyrocket
Date Fri, 21 Sep 2018 10:32:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623388#comment-16623388 ]

Iuliia Volkova commented on AIRFLOW-163:
----------------------------------------

[~bolke], [~ashb], can we close this task, given that it has not been updated in several
years and relates to version 1.7?

> Running multiple LocalExecutor schedulers makes system load skyrocket
> ---------------------------------------------------------------------
>
>                 Key: AIRFLOW-163
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-163
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.7.1
>         Environment: EC2 t2.medium instance, 
> Docker `version 1.11.1, build 5604cbe`, 
> Host is `Linux ip-172-31-44-140 3.13.0-85-generic #129-Ubuntu SMP Thu Mar 17 20:50:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux`, 
> Docker containers are built upon the `python:3.5` image, 
> LocalExecutor is used with two scheduler containers running
>            Reporter: Bence Nagy
>            Priority: Minor
>              Labels: scheduler
>
> I've been told on Gitter that this is expected currently, but thought I'd create an issue
> for it anyway.
> See this screenshot of a task duration chart; I launched a second scheduler for the
> 8:50 execution. The orange line represents a PostgresOperator task (i.e. processing happens
> independently of Airflow), while the other lines represent data copying tasks that go through
> a temp file on the Airflow host: https://i.imgur.com/2tDKgKj.png
> I'm seeing a system load of around 4.0-5.0 while processing tasks with one scheduler
> running, and 20.0-30.0 with two.
> Running {{airflow scheduler --num_runs 3}} under yappi got me these results when ordered
> by total time: http://pastebin.com/8TiEG4P3. I still have the raw profiling data, let me know
> if another data extract would be useful.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
