airflow-dev mailing list archives

From "Cieplucha, Michal" <michal.cieplu...@intel.com>
Subject slow scheduler
Date Wed, 04 Apr 2018 12:08:30 GMT
Hello all,

Our Airflow automation keeps growing (Airflow 1.8, ~150 DAGs, 3 scheduler instances).
Some of our users trigger DAG runs in response to external events, so we exposed an API
endpoint to run a DAG. These manually triggered DAGs should give the user fast feedback,
but it takes a few minutes to schedule the first task, and often another few minutes
between tasks. Most of the time is spent between tasks; the tasks themselves run for only
a few seconds. Has anybody else seen this? It looks like the scheduler often runs empty
loops, with logs like:
2018-04-04 12:05:45,004:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Starting Loop...
2018-04-04 12:05:45,005:INFO:airflow.jobs.SchedulerJob:[CT=None] Heartbeating the process manager
2018-04-04 12:05:45,005:INFO:airflow.jobs.SchedulerJob:[CT=None] Heartbeating the executor
2018-04-04 12:05:45,005:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 44 running task instances
2018-04-04 12:05:45,005:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 0 in queue
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 340 open slots
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] Calling the <class 'airflow.executors.celery_executor.CeleryExecutor'> sync method
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] Inquiring about 44 celery task(s)
2018-04-04 12:05:45,744:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Ran scheduling loop in 0.74s
2018-04-04 12:05:45,745:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Sleeping for 1.00s
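
To put a number on how much of the scheduler's time goes to such loops, a log like the one above could be summarized with a small script. This is only a rough sketch: it assumes the DEBUG message format quoted above, and `summarize_loops` is a hypothetical helper name, not part of Airflow.

```python
import re

# Patterns for the two SchedulerJob DEBUG messages quoted in the log excerpt.
LOOP_RE = re.compile(r"Ran scheduling loop in ([\d.]+)s")
SLEEP_RE = re.compile(r"Sleeping for ([\d.]+)s")


def summarize_loops(lines):
    """Return (loop_count, seconds_in_loops, seconds_sleeping) for the given log lines."""
    loops = 0
    working = 0.0
    sleeping = 0.0
    for line in lines:
        match = LOOP_RE.search(line)
        if match:
            loops += 1
            working += float(match.group(1))
            continue
        match = SLEEP_RE.search(line)
        if match:
            sleeping += float(match.group(1))
    return loops, working, sleeping


# The two timing lines from the excerpt above.
sample = [
    "2018-04-04 12:05:45,744:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Ran scheduling loop in 0.74s",
    "2018-04-04 12:05:45,745:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Sleeping for 1.00s",
]
loops, working, sleeping = summarize_loops(sample)
print(loops, working, sleeping)
```

Run over a full scheduler log (for example `summarize_loops(open(path))`, with `path` pointing at the log file), this would show how the loop time compares with the sleep time between loops.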

Maybe we need to tune our Airflow settings?
We have up to 250 unacked messages on the RabbitMQ queue, which corresponds to the number
of running task instances. There is a lot going on in our Airflow instance, but apart from
the scheduling delay everything looks fine (CPU/memory usage, etc.).
Our general settings:
6 Docker containers running workers, parallelism = 384, dag_concurrency = 128, celeryd_concurrency = 64
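
As a sanity check on these numbers, the arithmetic can be spelled out. The sketch below uses only the values quoted in this message; it assumes the executor reports open slots as parallelism minus running task instances, which is an assumption about Airflow 1.8 internals, and `open_slots` is a hypothetical helper name.

```python
# Values quoted in this message.
worker_containers = 6
celeryd_concurrency = 64
parallelism = 384

# From the CeleryExecutor DEBUG lines in the log excerpt.
running = 44


def open_slots(parallelism, running):
    """Assumed relation: slots still available to the executor."""
    return parallelism - running


# Total Celery worker processes across all containers.
worker_capacity = worker_containers * celeryd_concurrency
print(worker_capacity)                   # 384, equal to parallelism
print(open_slots(parallelism, running))  # 340, matching "340 open slots" in the log
```

If that assumption holds, the executor was far from saturated at the time of the log, so the delay does not look like a shortage of slots.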

Our scheduler config section:
job_heartbeat_sec = 5
scheduler_heartbeat_sec = 5
max_threads = 2


thanks
mC


I am an Intel employee. All comments and opinions are my own and do not represent the views
of Intel.


