airflow-dev mailing list archives

From Arthur Purvis <apur...@lumoslabs.com>
Subject Re: Airflow kubernetes executor
Date Wed, 12 Jul 2017 18:55:45 GMT
For what it's worth, we've been running Airflow on ECS for a few years
already.

On Wed, Jul 12, 2017 at 12:21 PM, Grant Nicholas <
grantnicholas2015@u.northwestern.edu> wrote:

> Is having a static set of workers necessary? Launching a job on Kubernetes
> from a cached docker image takes a few seconds max. I think this is an
> acceptable delay for a batch processing system like Airflow.
>
> Additionally, if you dynamically launch workers you can start dynamically
> launching *any type* of worker, and you don't have to statically allocate
> pools of worker types. I.e., a single DAG could use a Scala docker image to
> do Spark calculations, a C++ docker image to use some low-level numerical
> library, and a Python docker image by default to do any generic Airflow
> stuff. Additionally, you can size workers according to their usage: maybe
> the Spark driver program only needs a few GBs of RAM but the C++ numerical
> library needs many hundreds.
>
> I agree there is a bit of extra book-keeping that needs to be done, but
> the tradeoff is an important one to explicitly make.
>
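[Editor's sketch of the per-task pod idea quoted above. The helper and all
image names are hypothetical; it builds a plain Kubernetes pod manifest as a
dict, not a call into any real Airflow or Kubernetes client API.]

```python
def make_task_pod(task_id, image, memory_gb=1, cpu="500m"):
    """Build a pod manifest (plain dict) for one Airflow task.

    Each task gets its own image and its own resource sizing, so a
    single DAG can mix Scala/Spark, C++, and Python workers without
    pre-allocated pools of worker types.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"airflow-task-{task_id}",
            "labels": {"app": "airflow-worker", "task_id": task_id},
        },
        "spec": {
            "restartPolicy": "Never",  # batch task: run once, then exit
            "containers": [{
                "name": "worker",
                "image": image,
                "resources": {
                    "requests": {"memory": f"{memory_gb}Gi", "cpu": cpu},
                },
            }],
        },
    }

# One DAG, three differently sized and differently imaged workers
# (image names are made up for illustration):
spark_pod = make_task_pod("spark-calc", "mycorp/spark-scala:latest", memory_gb=4)
cpp_pod = make_task_pod("numerics", "mycorp/cpp-numerics:latest", memory_gb=256)
py_pod = make_task_pod("generic", "mycorp/airflow-python:latest")
```

Submitting each manifest (e.g. via `kubectl apply` or a Kubernetes client)
would launch one short-lived pod per task, which is the dynamic-worker
model being discussed.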
