airflow-dev mailing list archives

From Daniel Imberman <daniel.imber...@gmail.com>
Subject Re: Airflow on ECS
Date Mon, 06 Nov 2017 21:04:44 GMT
Hi Shoumitra,

One thing worth noting is that with the release of the kubernetes executor,
we will be using resource versions + the Kubernetes API to take care of
some of the current issues with crash handling (basically recreating state
from what tasks have been run/are pending within the cluster). The
kubernetes executor also offloads all tasks to individual pods so you will
not need to worry about the resources of any tasks affecting the scheduler.
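
To make the "recreating state" idea above concrete, here is a rough toy model in plain Python. It does not use the real Kubernetes client or reflect the actual executor PR: the `TaskStateTracker` class, the `events_since` helper, and the integer versions are all made up for illustration (real Kubernetes resource versions are opaque strings). The point is only that a restarted process which remembers the last resource version it saw can ask for the changes after it and catch up.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape: (resource_version, pod_name, phase).
# Integers are a simplification; real resource versions are opaque strings.
Event = Tuple[int, str, str]


@dataclass
class TaskStateTracker:
    """Toy stand-in for an executor that tracks pod state by resource version."""

    last_resource_version: int = 0
    task_states: Dict[str, str] = field(default_factory=dict)

    def process(self, events: Iterable[Event]) -> None:
        for version, pod_name, phase in events:
            if version <= self.last_resource_version:
                continue  # already handled before the restart
            self.task_states[pod_name] = phase
            self.last_resource_version = version


def events_since(log: List[Event], resource_version: int) -> List[Event]:
    """Stand-in for asking the cluster for changes after a given version."""
    return [event for event in log if event[0] > resource_version]


# Everything that happened in the (imaginary) cluster.
cluster_log: List[Event] = [
    (1, "task-a", "Pending"),
    (2, "task-a", "Running"),
    (3, "task-b", "Pending"),
    (4, "task-a", "Succeeded"),
]

# The tracker sees the first two events, then the process crashes.
tracker = TaskStateTracker()
tracker.process(cluster_log[:2])

# After the restart, resume from the persisted version and catch up.
restarted = TaskStateTracker(
    last_resource_version=tracker.last_resource_version,
    task_states=dict(tracker.task_states),
)
restarted.process(events_since(cluster_log, restarted.last_resource_version))
```

After the replay, `restarted.task_states` holds the latest phase for both pods even though two of the events arrived while the process was down.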

If you're available (and in SF) on Dec. 4th, we will be discussing the PR
at airbnb for the airflow meetup.

Hope to see you there!

https://www.meetup.com/Bay-Area-Apache-Airflow-Incubating-Meetup/events/244525050/

On Mon, Nov 6, 2017 at 9:39 AM Michael Erdely <mjerdely@gmail.com> wrote:

> Hi Shoumitra,
>
> As others have mentioned, there are a lot of issues when using the local
> executor in prod. However, at OfferUp, we have had success in running
> Airflow dockerized on EC2.
>
> Our current setup is the following:
>
>    - Airflow 1.8.2 dockerized similar to Matthieu's Celery example at
>    https://github.com/puckel/docker-airflow
>    - Running scheduler, webserver, flower, and 5 workers on a c4.8xlarge
>    EC2 instance
>    - RDS hosted Postgres
>    - ElastiCache hosted Redis
>
> We are close to the limits of this setup and plan on redoing our
> configuration with terraform. Not sure if we'll keep the dockerized setup
> but it's been extremely helpful thus far.
>
> -Michael
>
>
>
> On Thu, Nov 2, 2017 at 11:27 AM Marc Bollinger <marc@lumoslabs.com> wrote:
>
> > We're actively following the Airflow/Kubernetes integration
> > <https://issues.apache.org/jira/browse/AIRFLOW-1314>, and are eventually
> > going to move to both running everything on k8s and using
> > KubernetesExecutors for many things, but we've deployed Airflow to ECS from
> > day one. It works mostly fine, and we're using a tool we open-sourced
> > called Broadside <https://github.com/lumoslabs/broadside> to simplify
> > configuration and deployment. Our deploy is broken up into one scheduler,
> > one Flower instance, a few web servers, and a number of workers, using
> > CeleryExecutor backed by redis/Elasticache (and RDS postgres, as you're
> > suggesting), all in ECS from the same private docker image.
> >
> > Tacking on to what Bolke is saying, it is also somewhat tricky in our
> > experience to get deploys right in ECS with CeleryExecutors. Our first
> > impulse was to bake the DAG directory/repo into the docker image and run an
> > ECS deploy every time we added or updated DAGs, bouncing all of the
> > components and killing the workers. Where we wound up is that our CI system
> > still bakes the DAG directory into the images when we merge to master, but
> > for a "short" deploy we only bounce the web server and scheduler--the
> > worker containers all just execute `git pull` and pull down the new/updated
> > DAGs. Others may have different approaches that work, I'm sure, possibly
> > moving the DAG directory to a shared EFS mount.
> >
> > On Thu, Nov 2, 2017 at 11:06 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> >
> > > Please remember that with the LocalExecutor your tasks run in the same
> > > process (group) as the scheduler. If you want to restart the scheduler,
> > > it will need to wait until all currently running tasks have finished.
> > > In addition, if your tasks are resource intensive (CPU, memory), this can
> > > also affect the scheduler. In 1.9.0 we are a little bit more robust in
> > > this respect, but guarding against OOM errors is very hard.
> > >
> > > Furthermore, the new logging framework in 1.9.0 will allow you to collect
> > > logs centrally, which might be convenient. However, the documentation is
> > > not up to date, so you will have to tune it yourself.
> > >
> > > My 2 cents,
> > >
> > > Bolke.
> > >
> > > > On 2 Nov 2017, at 18:55, Shoumitra Srivastava <shoumitra362@gmail.com> wrote:
> > > >
> > > > Hi guys,
> > > >
> > > > So far we have had a lot of success testing out Airflow and we are now
> > > > going for a full scale deployment. To that end, we are considering
> > > > dockerizing Airflow and deploying it on one of our ECS clusters. We are
> > > > planning on separating out the web server and the scheduler into separate
> > > > tasks and using the local executor with an RDS Postgres and Redis backend.
> > > > Does anyone else have any suggestions regarding the setup? Any design
> > > > patterns or good practices and gotchas would be welcome.
> > > >
> > > > -Shoumitra
> > >
> > >
> >
>
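
For anyone trying the "short" deploy Marc describes in the quoted thread (bounce only the web server and scheduler, let workers run `git pull`), a minimal sketch of the worker-side refresh might look like the following. It assumes the DAG directory inside the container is a git checkout; the function names and the component names in the lists are illustrative, not taken from Broadside or from Airflow itself.

```python
import subprocess
from typing import List


def build_pull_command(dags_dir: str) -> List[str]:
    """Build the git command a worker would run to refresh its DAGs.

    `git -C <dir>` runs git as if it had been started in <dir>, and
    `--ff-only` refuses to create a merge commit on the worker.
    """
    return ["git", "-C", dags_dir, "pull", "--ff-only"]


def refresh_dags(dags_dir: str) -> bool:
    """Run the pull and report whether git exited successfully."""
    result = subprocess.run(
        build_pull_command(dags_dir), capture_output=True, text=True
    )
    return result.returncode == 0


def components_to_restart(short_deploy: bool) -> List[str]:
    """Illustrative mapping of deploy type to the components that get bounced."""
    if short_deploy:
        # Workers keep running and pick up DAG changes via git pull.
        return ["webserver", "scheduler"]
    # A full deploy rebuilds the image and bounces everything.
    return ["webserver", "scheduler", "flower", "worker"]
```

`refresh_dags` only reports success or failure; a real setup would probably want to log git's output and decide what to do when a fast-forward is not possible.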
