airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Craig Rodrigues <rodr...@crodrigues.org>
Subject Re: celery problem: cannot override celery_broker_transport_options
Date Thu, 24 May 2018 15:32:44 GMT
It looks like in Celery, the documentation for sqla broker was removed:

https://github.com/celery/celery/commit/79810a26a116e9881c42a14d856fa94c40fefcd8#diff-29ccf8c96d521253467909a652e6ded2

I cannot find the pull request or release notes which document this.

--
Craig


On Thu, May 24, 2018 at 8:19 AM Ash Berlin-Taylor <
ash_airflowlist@firemirror.com> wrote:

> Sounds like
> https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py#L31
> <
> https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py#L31>
> should be guarded in some way to only do that for a redis:// and sqs://
> backends.
>
>
> > On 24 May 2018, at 16:13, Craig Rodrigues <rodrigc@crodrigues.org>
> wrote:
> >
> > Ash,
> >
> > According to this:
> >
> http://docs.celeryproject.org/en/latest/userguide/configuration.html#broker-settings
> > visibility_timeout is supported by Redis and SQS.
> >
> > --
> > Craig
> >
> >
> > On Thu, May 24, 2018 at 8:07 AM Craig Rodrigues <rodrigc@crodrigues.org>
> > wrote:
> >
> >> Ash,
> >>
> >> Thanks again.  You are leading me on the right path!
> >>
> >> I can prepare a patch to move the ssl_ options into the celery section.
> >>
> >> What about visbility_timeout?  The error I am getting is:
> >>
> >> File
> "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py",
> >> line 160, in create
> >>    engineclass.__name__))
> >> TypeError: Invalid argument(s)
> >> 'ssl_key','ssl_cert','ssl_active','visibility_timeout','ssl_cacert'
> sent to
> >> create_engine(), using configuration
> >> MySQLDialect_mysqldb/QueuePool/Engine.  Please check that the keyword
> >> arguments are appropriate for this combination of components.
> >>
> >>
> >> So it looks like visibility_timeout does not work with sqla as well.
> >> --
> >> Craig
> >>
> >>
> >> On Thu, May 24, 2018 at 2:17 AM Ash Berlin-Taylor <
> >> ash_airflowlist@firemirror.com> wrote:
> >>
> >>> Yes, you would need to duplicate a chunk of the default_celery in your
> >>> copy right now. But you can just make it have the values you want - so
> it
> >>> would be about 10 lines in total.
> >>>
> >>> It seems that between AIRFLOW-966 and AIRFLOW-1840 things got a little
> >>> bit out of sync in the default .cfg and the celery .py - the .py is
> looking
> >>> for celery->ssl_* but the default config puts it in
> >>> celery_broker_transport_options->ssl_*. Looking at the celery config
> option
> >>> it looks like they aren't actually options for the other transports,
> it's
> >>> just that they don't complain about the extra options. I think the fix
> is
> >>> just to move them up to the celery section.
> >>>
> >>> -ash
> >>>
> >>>
> >>>> On 24 May 2018, at 07:21, Craig Rodrigues <rodrigc@crodrigues.org>
> >>> wrote:
> >>>>
> >>>> Ash,
> >>>>
> >>>> Thanks!   You put me on the right track.
> >>>> Unfortunately, there is a lot of logic in
> >>>> airflow/config_templates/default_celery.py that I need,
> >>>> and if I was to come up with my own class to replace:
> >>>>
> >>>> celery_config_options =
> >>>> airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG
> >>>>
> >>>> then I would basically have to rewrite my own version of
> >>> default_celery.py
> >>>>
> >>>> Also, in airflow/config_templates/default_airflow.cfg, there is this:
> >>>>
> >>>> [celery_broker_transport_options]
> >>>> # The visibility timeout defines the number of seconds to wait for the
> >>>> worker
> >>>> # to acknowledge the task before the message is redelivered to another
> >>>> worker.
> >>>> # Make sure to increase the visibility timeout to match the time of
> the
> >>>> longest
> >>>> # ETA you're planning to use. Especially important in case of using
> >>> Redis
> >>>> or SQS
> >>>> visibility_timeout = 21600
> >>>>
> >>>> # In case of using SSL
> >>>> ssl_active = False
> >>>> ssl_key =
> >>>> ssl_cert =
> >>>> ssl_cacert =
> >>>>
> >>>>
> >>>> None of those options work if using an sqla backend for celery
> >>> broker_url.
> >>>>
> >>>> I need to think about this, but this needs to be cleaned up before
> >>>> Airflow 1.10 is released.  In
> >>> airflow/config_templates/default_airflow.cfg
> >>>> there is this:
> >>>>
> >>>> broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
> >>>>
> >>>>
> >>>> So sqla is specified as the default broker_url for celery.
> >>>> A lot of people (including where I work) have used this template to
> set
> >>> up
> >>>> Airflow + Celery, and even though sqla is an "experimental"
> broker_url,
> >>>> it actually works pretty well.
> >>>>
> >>>> Now in airflow 1.10, something that was very easy to set up is now
> >>> really
> >>>> complicated and unintuitive.
> >>>>
> >>>> Would it be OK to change the airflow code so that
> >>>> in airflow/config_templates/default_airflow.cfg, all the options in
> >>>> [celery_broker_transport_options] are commented out?
> >>>> And if someone is running Redis, they would have to add those
> >>>> options in their own airflow.cfg file?
> >>>>
> >>>> Bolke, do you have any comments?
> >>>>
> >>>> --
> >>>> Craig
> >>>>
> >>>> On Tue, May 22, 2018 at 1:50 AM Ash Berlin-Taylor <
> >>>> ash_airflowlist@firemirror.com> wrote:
> >>>>
> >>>>> To use with the SQLA backend to celery you need to override the
> options
> >>>>> Airflow passes to Celery. Those come from
> >>>>>
> >>>
> https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py
> >>>>>
> >>>>> Since you don't want most/all of those options (and there is no
way
> in
> >>> the
> >>>>> config file to _remove_ a setting) you will have to point airflow
to
> a
> >>>>> different file for the celery config:
> >>>>>
> >>>>> This line in the config is what you will need to change:
> >>>>>
> >>>>>   # Import path for celery configuration options
> >>>>>   celery_config_options =
> >>>>> airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG
> >>>>>
> >>>>> If you create something like config/celery_config.py containing:
> >>>>>
> >>>>>
> >>>>>   CELERY_CONFIG = {
> >>>>>       # Just the options you want to set
> >>>>>   }
> >>>>>
> >>>>>
> >>>>> (config/ should exist along side your dags/ folder, and I think
it
> >>> should
> >>>>> be added to the python path already). You can then set this in the
> >>> config:
> >>>>>
> >>>>>   celery_config_options = celery_config.CELERY_CONFIG
> >>>>>
> >>>>> That should give you complete control
> >>>>>
> >>>>>
> >>>
> >>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message