airflow-dev mailing list archives

From Marc Bollinger <m...@lumoslabs.com>
Subject Re: Airflow on ECS
Date Thu, 02 Nov 2017 18:27:13 GMT
We're actively following the Airflow/Kubernetes integration
<https://issues.apache.org/jira/browse/AIRFLOW-1314>, and are eventually
going to move to both running everything on k8s and using
KubernetesExecutors for many things, but we've deployed Airflow to ECS from
day one. It works mostly fine, and we're using a tool we open-sourced
called Broadside <https://github.com/lumoslabs/broadside> to simplify
configuration and deployment. Our deploy is broken up into one scheduler,
one Flower instance, a few web servers, and a number of workers, using
CeleryExecutor backed by Redis/ElastiCache (and RDS Postgres, as you're
suggesting), all running in ECS from the same private Docker image.
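
As a rough illustration of the "one image, several components" layout, here is a minimal sketch in Python. It assumes Airflow's `AIRFLOW__SECTION__KEY` environment-variable overrides and the 1.x-era CLI subcommands; the hostnames, credentials, and the `airflow_env` / `COMPONENT_COMMANDS` names are hypothetical placeholders, not our actual configuration.

```python
# Sketch only: hostnames, credentials, and helper names are hypothetical.

def airflow_env(db_host, redis_host):
    """Environment shared by every container built from the same image."""
    return {
        # Airflow reads AIRFLOW__<SECTION>__<KEY> as overrides for airflow.cfg.
        "AIRFLOW__CORE__EXECUTOR": "CeleryExecutor",
        "AIRFLOW__CORE__SQL_ALCHEMY_CONN": (
            "postgresql+psycopg2://airflow:CHANGE_ME@{}:5432/airflow".format(db_host)
        ),
        "AIRFLOW__CELERY__BROKER_URL": "redis://{}:6379/0".format(redis_host),
    }


# Each ECS service runs the same image with a different command
# (Airflow 1.x subcommand names; later major versions renamed some of these).
COMPONENT_COMMANDS = {
    "scheduler": ["airflow", "scheduler"],
    "webserver": ["airflow", "webserver"],
    "worker": ["airflow", "worker"],
    "flower": ["airflow", "flower"],
}

if __name__ == "__main__":
    env = airflow_env("example-db.internal", "example-redis.internal")
    for name, command in sorted(COMPONENT_COMMANDS.items()):
        print(name, " ".join(command), env["AIRFLOW__CORE__EXECUTOR"])
```

The point of the sketch is only that the components differ by command, not by image or by connection settings.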

Tacking on to what Bolke is saying, it is also somewhat tricky in our
experience to get deploys right in ECS with CeleryExecutors. Our first
impulse was to bake the DAG directory/repo into the docker image and run an
ECS deploy every time we added or updated DAGs, bouncing all of the
components and killing the workers. Where we wound up is that our CI system
still bakes the DAG directory into the images when we merge to master, but
for a "short" deploy we only bounce the web server and scheduler--the
worker containers all just execute `git pull` and pull down the new/updated
DAGs. Others may have different approaches that work, I'm sure, possibly
moving the DAG directory to a shared EFS mount.
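
The "short" deploy described above could be sketched roughly as follows. This is a hedged illustration, not our actual tooling: the function names, the component list, and the choice of `git pull --ff-only` are assumptions made for the example.

```python
import subprocess

# Hypothetical component names for the services described in this message.
ALL_COMPONENTS = ("scheduler", "webserver", "flower", "worker")


def components_to_restart(short_deploy):
    """Return which services get bounced for a given kind of deploy."""
    if short_deploy:
        # DAG-only change: workers keep running and pull the new DAGs instead.
        return ("scheduler", "webserver")
    return ALL_COMPONENTS


def pull_dags(dag_dir, run=subprocess.run):
    """Update the DAG checkout in dag_dir in place; True if git succeeded.

    `run` is injectable so the function can be exercised without a real repo.
    """
    completed = run(
        ["git", "-C", dag_dir, "pull", "--ff-only"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return completed.returncode == 0
```

A worker container would call something like `pull_dags("/path/to/dags")` when a short deploy is triggered; how that trigger reaches the workers is left out of the sketch.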

On Thu, Nov 2, 2017 at 11:06 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:

> Please remember that with the LocalExecutor your tasks run in
> process(group) with the scheduler. If you want to restart the scheduler, it
> will need to wait until all tasks have finished that are currently running.
> In addition, if your tasks are resource intensive (CPU, memory), this can also
> affect the scheduler. In 1.9.0 we are a little bit more robust in this
> respect, but guarding against OOM errors is very hard.
>
> Furthermore, the new logging framework in 1.9.0 will allow you to have
> logs centrally, which might be convenient. However, documentation is not up
> to date so you will have to tune it yourself.
>
> My 2 cents,
>
> Bolke.
>
> > On 2 Nov 2017, at 18:55, Shoumitra Srivastava <shoumitra362@gmail.com> wrote:
> >
> > Hi guys,
> >
> > So far we have had a lot of success testing out Airflow and we are now
> > going for a full scale deployment. To that end, we are considering
> > dockerizing airflow and deploying it on one of our ECS clusters. We are
> > planning on separating out the web server and the scheduler to separate
> > tasks and using local executor with an RDS postgres and redis backend. Does
> > anyone else have any suggestions regarding the setup? Any design patterns
> > or good practises and gotchas would be welcome.
> >
> > -Shoumitra
>
>
