airflow-dev mailing list archives

From Joe Schaefer <j...@apache.org>
Subject Re: troubleshooting hung docker airflow containers on 1.9.0
Date Tue, 20 Feb 2018 22:33:27 GMT


On 2018/02/20 21:57:50, Matthew Housley <matthew.housley@gmail.com> wrote: 
> Hi Joe,
> can you provide a few more details? Are you using the latest version of
> puckel/docker-airflow? What combination of database and executor are you

I'm not sure, but will get back to you tomorrow on the version.  The database
is postgres 9.6, with celery-something.  I'm sort of parachuting into this problem,
and only recently gained access to the airflow UI for it so I can do some digging
the next time it hangs.

> using, i.e. postgres with celery-redis? Do you know which container is
> hanging?

Basically our dags are just glorified crontab wrappers around shell scripts.  What's
happening is that the dags stop launching (we know this because the shell scripts
leave timestamps on the logs they write to when they launch and exit), and bad things
start piling up with the rest of the cluster.

We don't have a lot of dags and they are very simple, but one of them runs every minute
and others run every 15 minutes or every hour.
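
A minimal sketch of how the timestamp check described above could be automated, so a stall is noticed before things pile up. It assumes each shell script touches its own log file on every run; the log paths, the intervals, and the grace factor are hypothetical placeholders, and it looks at file modification times rather than parsing the timestamps inside the logs.

```python
import os
import time


def find_stale_logs(expected_intervals, now=None, grace=2.0):
    """Return the log paths that have not been modified recently enough.

    expected_intervals maps a log path to the schedule interval, in seconds,
    of the job that writes it. A log counts as stale when its age exceeds
    grace * interval, or when the file cannot be read at all.
    """
    now = time.time() if now is None else now
    stale = []
    for path, interval in expected_intervals.items():
        try:
            age = now - os.path.getmtime(path)
        except OSError:
            # A missing or unreadable log is treated as stale.
            stale.append(path)
            continue
        if age > grace * interval:
            stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical paths and schedules (in seconds), for illustration only.
    schedules = {
        "/var/log/jobs/every_minute.log": 60,
        "/var/log/jobs/every_15_minutes.log": 15 * 60,
        "/var/log/jobs/hourly.log": 60 * 60,
    }
    for path in find_stale_logs(schedules):
        print("stale:", path)
```

Run from plain cron outside the airflow containers, a check like this would keep working while the scheduler is hung, and the time of the first stale report would narrow down when the dags stopped launching.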

> best,
> Matt
> 
> On 2018/02/20 16:43:29, Joe Schaefer <j...@apache.org> wrote:
> > Any tips for my situation?  I'm new to airflow and not sure what's going
> > on in my rancher environment, but my current airflow stack requires
> > frequent reboots just to maintain a working cluster.
> >
> > If there are some docs I can learn from to do a deeper dive I'd
> > appreciate it.
> >
> > Thanks!
> >
> 

