airflow-dev mailing list archives

From Maxime Beauchemin <maximebeauche...@gmail.com>
Subject Sensor slots utilization
Date Fri, 28 Jul 2017 18:14:02 GMT
Thought this was interesting to bubble up to the mailing list. From:
https://github.com/apache/incubator-airflow/pull/2423#issuecomment-318723842

This is about the issue around sensors utilizing a lot of worker slots. The
context is a PR from @shaform introducing sensors that check once and give
up their slot and get rescheduled for each sensing operation (as opposed to
the current behavior of sleeping and poking while constantly using the slot
until the criteria are met or the timeout is reached).

---------------

*So this is legitimate, but shifts some of the burden of slot utilization
towards other costs like task startup costs and more communication
overhead. These costs may be preferable depending on the
scenario/environment. Starting a task can have significant overhead
depending on the size of the DAG and other factors that depend on the
executor. Say for the upcoming Kubernetes executor, startup may include
booting up a docker instance and doing a shallow clone of the repo.*
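
To make the trade-off concrete, here is a rough back-of-the-envelope model of slot time under the two behaviors. It is only a sketch: the function names, the parameters and the numbers are all made up for illustration, and it assumes every rescheduled check pays the full task startup cost.

```python
import math


def slot_seconds_blocking(wait_s, startup_s):
    """Current behavior: one task start, then the slot is held while sleeping and poking."""
    return startup_s + wait_s


def slot_seconds_reschedule(wait_s, poke_interval_s, startup_s, check_s):
    """Check-once behavior: every check is a fresh task start; the slot is free in between."""
    # Checks happen at t = 0, p, 2p, ... and the first one at or after wait_s succeeds.
    checks = math.ceil(wait_s / poke_interval_s) + 1
    return checks * (startup_s + check_s)


# Hypothetical numbers: criteria met after one hour, one check per minute, 1s per check.
cheap_start = slot_seconds_reschedule(3600, 60, startup_s=5, check_s=1)
costly_start = slot_seconds_reschedule(3600, 60, startup_s=120, check_s=1)
print(slot_seconds_blocking(3600, startup_s=5), cheap_start)
print(slot_seconds_blocking(3600, startup_s=120), costly_start)
```

With a cheap startup the rescheduling sensor occupies far less slot time, but once startup dominates (the heavy executor case) it can end up costing more than simply holding the slot.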

*Since this is a major change, I would argue that we shouldn't change the
current default since organizations have provisioned and stabilized their
environments based on the current behavior. Default behavior could be
changed when moving to 2.0, which isn't really planned or scheduled at the
moment.*

*Another idea around reducing the overall sensor slot utilization would be
to move that burden towards the scheduler (let's call it the supervisor now
since it does more than just scheduling at this point). My idea there was
to add a flag to BaseSensorOperator that would tell the scheduler to run
the poke method in line with the scheduling instead of using the executor.
In that scenario, there's no startup cost and no communication overhead.
The downside is that it can slow down the scheduler. This would be a great
option where sensing is cheap and fast.*
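
A toy sketch of what running poke in line with the scheduling loop could look like. None of this is Airflow code: `InlineSketchSensor`, the `poke_inline` flag and `scheduler_pass` are hypothetical names, and a plain list stands in for the executor.

```python
class InlineSketchSensor:
    """Stand-in sensor whose criteria are met on the `ready_after`-th poke."""

    def __init__(self, name, ready_after, poke_inline=False):
        self.name = name
        self.ready_after = ready_after
        self.poke_inline = poke_inline  # hypothetical flag read by the scheduler
        self.pokes = 0

    def poke(self):
        self.pokes += 1
        return self.pokes >= self.ready_after


def scheduler_pass(sensors, executor_queue):
    """One scheduling pass: poke flagged sensors directly, queue the others."""
    finished = []
    for sensor in sensors:
        if not sensor.poke_inline:
            # Usual path: costs a worker slot plus startup and communication overhead.
            executor_queue.append(sensor.name)
        elif sensor.poke():
            # Inline path: no slot and no startup, but poke() time is scheduler time.
            finished.append(sensor.name)
    return finished
```

Every inline `poke()` call runs inside the loop, so a slow poke delays everything scheduled after it, which is the downside mentioned above.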

*That gives us potentially 3 sensor_modes, which I would argue should be
implemented as a BaseOperator argument. Derivative classes can decide to
expose the argument or force it. Administrators could also use
the policy function to force a certain sensing mode in certain or all
contexts in their environment.*
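
A minimal sketch of how the three modes could be wired up as an argument. The mode names, the `forced_mode` class attribute and the signature of `policy` below are assumptions for illustration, not an existing Airflow API.

```python
from enum import Enum


class SensorMode(Enum):
    BLOCKING = "blocking"      # current behavior: sleep and poke while holding the slot
    RESCHEDULE = "reschedule"  # check once, give up the slot, get rescheduled
    INLINE = "inline"          # poke from within the scheduler loop


class SketchSensor:
    """Stand-in for a sensor base class that exposes the mode as an argument."""

    forced_mode = None  # derivative classes may set this to force a mode

    def __init__(self, task_id, sensor_mode=SensorMode.BLOCKING):
        self.task_id = task_id
        self.sensor_mode = self.forced_mode or sensor_mode


class CheapFlagSensor(SketchSensor):
    """Sensing is cheap and fast here, so the class forces inline poking."""

    forced_mode = SensorMode.INLINE


def policy(task):
    """Hypothetical administrator hook: no sensor may hold a slot while sleeping."""
    if task.sensor_mode is SensorMode.BLOCKING:
        task.sensor_mode = SensorMode.RESCHEDULE


tasks = [SketchSensor("wait_for_file"), CheapFlagSensor("wait_for_flag")]
for task in tasks:
    policy(task)
```

Keeping the current behavior as the default value means existing DAGs run unchanged unless a class or the policy function opts them into another mode.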

Max
