airflow-dev mailing list archives

From Jeremiah Lowin <jlo...@gmail.com>
Subject Re: Sync Git code among Airflow workers
Date Wed, 13 Jul 2016 18:42:16 GMT
I have a little module for this that was designed to facilitate syncing a
git repo in Kubernetes: https://github.com/jlowin/git-sync. The idea is to
sync a volume that is then shared to all containers (webserver, scheduler,
workers, etc). It also works locally.

However, I want to stress that this is absolutely 100% unsupported (by me)!
It's an experiment that works well enough for my use case. Maybe it's a
useful jumping-off point?

Best,
J
On Wed, Jul 13, 2016 at 2:35 PM Chris Riccomini <criccomini@apache.org>
wrote:

> Most folks follow a push-based approach (puppet, chef, etc).
>
> Our approach is CRON-based pull, described here:
>
> https://wecode.wepay.com/posts/airflow-wepay
>
> On Wed, Jul 13, 2016 at 11:10 AM, Fernando San Martin <fernando@turo.com>
> wrote:
> > At Turo we have our data pipeline organized as a set of Python & SQL jobs
> > orchestrated by Jenkins. We are evaluating Airflow as an alternative and
> > we have managed to get quite far, but we have some questions that we were
> > hoping to get help with from the community.
> >
> > We have a set-up with a master node and two workers. Our code was
> > deployed on all three boxes by retrieving it from a git repo. The code in
> > the repo changes on a regular basis and we need to keep the boxes on the
> > latest version of the code.
> >
> > We first thought of adding to the top of our DAGs a BashOperator task
> > that simply runs `git pull origin master`, but since this task only gets
> > executed on the workers, the code on the master node will eventually
> > differ from the code that is on the workers.
> >
> > Another option is to run a cron job that executes `git pull origin
> > master` on each box every 5 minutes or so.
> >
> > Are there recommendations or best practices on how to handle this
> > situation?
> >
> > Thank you!
>
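
For illustration, the periodic-pull idea discussed in this thread (run `git pull origin master` on every box every few minutes) could be sketched in Python roughly as follows. This is not code from any of the posters, nor how git-sync or the WePay setup is implemented. The function names (`build_pull_command`, `sync_once`, `sync_loop`), the `--ff-only` flag, and the 300-second default are all assumptions made for the sketch.

```python
import subprocess
import time
from typing import Callable, List, Optional


def build_pull_command(repo_dir: str, remote: str = "origin",
                       branch: str = "master") -> List[str]:
    """Build the git command that updates repo_dir from remote/branch."""
    # `git -C <dir>` runs git as if it had been started in <dir>.
    # --ff-only (an assumption, not from the thread) makes the pull fail
    # instead of creating a merge commit if the local checkout has diverged.
    return ["git", "-C", repo_dir, "pull", "--ff-only", remote, branch]


def sync_once(repo_dir: str, run: Callable = subprocess.run) -> bool:
    """Pull once; return True if git exited with status 0."""
    result = run(build_pull_command(repo_dir), capture_output=True, text=True)
    return result.returncode == 0


def sync_loop(repo_dir: str, interval_seconds: float = 300,
              iterations: Optional[int] = None,
              run: Callable = subprocess.run,
              sleep: Callable = time.sleep) -> int:
    """Pull repeatedly, sleeping between attempts.

    Runs forever when iterations is None; otherwise stops after that many
    pulls. Returns the number of failed pulls (only reachable when
    iterations is set).
    """
    failures = 0
    count = 0
    while iterations is None or count < iterations:
        if not sync_once(repo_dir, run=run):
            failures += 1  # keep going; the next pull may succeed
        count += 1
        sleep(interval_seconds)
    return failures
```

Running something like `sync_loop("/path/to/dags")` (a hypothetical path) on the master node and on each worker would keep all boxes pulling from the same branch, although each box may still lag the repo by up to one interval.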
