airflow-dev mailing list archives

From Ali Naqvi <ali.na...@conornash.com>
Subject Re: large number of messages from celeryev pile up in rabbitmq
Date Tue, 20 Jun 2017 19:36:02 GMT
Interesting.

The webserver doesn't use the celery broker to determine whether a task is
complete; rather, it reads task status from the metadata database.
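
For what it's worth, a minimal sketch of that lookup, assuming a reachable
metadata DB (the dag/task ids below are made up):

```
from airflow import settings
from airflow.models import TaskInstance

# the webserver renders task state from this table, not from the broker
session = settings.Session()
row = (session.query(TaskInstance.state)
       .filter(TaskInstance.dag_id == 'example_dag',
               TaskInstance.task_id == 'example_task')
       .order_by(TaskInstance.execution_date.desc())
       .first())
print(row)  # e.g. ('success',) once the worker has committed its update
session.close()
```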

So it seems that in your case the worker, for some reason, wasn't updating
the database.

Anyway, good to know that you resolved the issue another way.



On 20 Jun 2017 9:06 am, "Georg Walther" <georg.walther@markovian.com> wrote:

I have tried using "CELERY_IGNORE_RESULT = True", but my experience with
this was that while task instances would be executed, the Airflow webserver
would not show them as having executed successfully.
I.e. I lacked the visual feedback in the webserver that tasks / dags had
actually run successfully.

For me the winning setup appears to be:

- RabbitMQ as message broker ("broker_url" in airflow.cfg)
- Redis as Celery result backend ("celery_result_backend")
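
Roughly, in airflow.cfg (hosts and credentials below are placeholders):

```
[celery]
broker_url = amqp://guest:guest@rabbit-host:5672//
celery_result_backend = redis://redis-host:6379/0
```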


Best,

Georg


On Mon, Jun 19, 2017 at 8:12 PM, Ali Naqvi <ali.naqvi@conornash.com> wrote:

> We had this issue before, where the total number of messages in rabbitmq
> would reach 1000 and no dags or tasks would run.
>
> I was able to resolve it by modifying one of the installed files,
> celery_executor.py, and adding:
>
> ```
>
> CELERY_IGNORE_RESULT = True
>
> ```
>
> under the CeleryConfig class in the airflow/executors/celery_executor.py
> file:
> https://github.com/apache/incubator-airflow/blob/master/airflow/executors/celery_executor.py#L39
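>
> For reference, the class then looks roughly like this (the surrounding
> settings are abbreviated from memory, so treat them as illustrative):
>
> ```
> class CeleryConfig(object):
>     CELERY_ACCEPT_CONTENT = ['json', 'pickle']
>     CELERYD_PREFETCH_MULTIPLIER = 1
>     CELERY_ACKS_LATE = True
>     # the added line: workers stop publishing result messages, so
>     # nothing accumulates in the amqp result queues
>     CELERY_IGNORE_RESULT = True
> ```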
>
> Why 1000? Because the amqp result backend we had for celery hard-stops at
> 1000 messages:
> https://github.com/celery/celery/blob/master/celery/backends/amqp.py#L152
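>
> As a toy illustration of that pattern (not the verbatim celery source;
> fetch_message below is a stand-in for the real consumer):
>
> ```
> class BacklogLimitExceeded(Exception):
>     pass
>
> def drain_states(fetch_message, backlog_limit=1000):
>     # read at most backlog_limit pending messages for a task; past
>     # that, the backend gives up instead of fast-forwarding forever
>     for _ in range(backlog_limit):
>         if fetch_message() is None:
>             return
>     raise BacklogLimitExceeded('more than %d pending messages'
>                                % backlog_limit)
> ```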
>
> Cheers,
> Ali
>
> On Sun, Jun 18, 2017 at 1:30 PM, Dmitry Smirnow <dmitry@yousician.com>
> wrote:
>
> > Hi Walther,
> >
> > Thank you for suggestion!
> >
> > No, I use mysql as the result backend, but it seems that flower uses the
> > same queue as the result backend for its monitoring purposes.
> > The issue seemed to resolve itself after I restarted all the workers and
> > changed the rabbitmq setup to remove the queues that had no consumers.
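> >
> > Concretely, it was something along these lines (commands from memory, so
> > double-check them against your rabbitmq version):
> >
> > ```
> > # list queues together with their consumer and message counts
> > rabbitmqctl list_queues name consumers messages
> >
> > # or let the broker expire idle queues itself: any celeryev queue
> > # unused for 60s (60000 ms) gets dropped
> > rabbitmqctl set_policy expiry "^celeryev\." '{"expires":60000}' --apply-to queues
> > ```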
> >
> > Best, Dima
> >
> > On Sun, Jun 18, 2017 at 4:55 PM, Georg Walther
> > <georg.walther@markovian.com> wrote:
> >
> > > Hi Dima,
> > >
> > >
> > > Do you use RabbitMQ as the Celery result backend?
> > > If so, try using e.g. Redis as the result backend (parameter
> > > "celery_result_backend" in airflow.cfg) while keeping RabbitMQ as the
> > > message broker (broker_url).
> > >
> > >
> > > Best,
> > >
> > > Georg
> > >
> > > On Mon, May 29, 2017 at 1:59 PM, Dmitry Smirnow <dmitry@yousician.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I've noticed that in the rabbitmq which is used as a broker for
> > > > Airflow, there are thousands of heartbeat messages from workers
> > > > piling up (type: "worker-heartbeat"). The version I use is 1.7.1.3.
> > > >
> > > > I googled around and it seems that those are the events used by
> > > > celery flower for monitoring.
> > > > I may have misunderstood something, but it seemed that to stop those
> > > > messages I should, for example, set some celery settings to make the
> > > > unused queues expire.
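> > > >
> > > > For the record, the settings I had in mind are celery's event queue
> > > > options, e.g. (the values below are guesses on my part):
> > > >
> > > > ```
> > > > # let the broker expire celeryev.* queues nobody consumes from,
> > > > # and let individual event messages die quickly
> > > > CELERY_EVENT_QUEUE_EXPIRES = 60  # seconds
> > > > CELERY_EVENT_QUEUE_TTL = 5       # seconds
> > > > ```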
> > > >
> > > > What would be the right way to deal with this? I'm really not sure
> > > > which config I should touch. Any ideas are welcome, and if I need to
> > > > provide more info about my configuration, please suggest which one.
> > > >
> > > > Thank you in advance,
> > > > Best regards, Dima
> > > >
> > > > --
> > > >
> > > > Dmitry Smirnov (MSc.)
> > > > Data Engineer @ Yousician
> > > > mobile: +358 50 3015072
> > > >
> > >
> >
> >
> >
> > --
> >
> > Dmitry Smirnov (MSc.)
> > Data Engineer @ Yousician
> > mobile: +358 50 3015072
> >
>
