airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Craig Rodrigues <rodr...@crodrigues.org>
Subject Re: celery problem: cannot override celery_broker_transport_options
Date Thu, 24 May 2018 15:34:35 GMT
Removal of sqla as a backend is mentioned in these release notes for celery
4.0:

http://docs.celeryproject.org/en/latest/whatsnew-4.0.html#features-removed-for-lack-of-funding

--
Craig


On Thu, May 24, 2018 at 8:32 AM Craig Rodrigues <rodrigc@crodrigues.org>
wrote:

> It looks like in Celery, the documentation for sqla broker was removed:
>
>
> https://github.com/celery/celery/commit/79810a26a116e9881c42a14d856fa94c40fefcd8#diff-29ccf8c96d521253467909a652e6ded2
>
> I cannot find the pull request or release notes which document this.
>
> --
> Craig
>
>
> On Thu, May 24, 2018 at 8:19 AM Ash Berlin-Taylor <
> ash_airflowlist@firemirror.com> wrote:
>
>> Sounds like
>> https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py#L31
>> <
>> https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py#L31>
>> should be guarded in some way to only do that for a redis:// and sqs://
>> backends.
>>
>>
>> > On 24 May 2018, at 16:13, Craig Rodrigues <rodrigc@crodrigues.org>
>> wrote:
>> >
>> > Ash,
>> >
>> > According to this:
>> >
>> http://docs.celeryproject.org/en/latest/userguide/configuration.html#broker-settings
>> > visibility_timeout is supported by Redis and SQS.
>> >
>> > --
>> > Craig
>> >
>> >
>> > On Thu, May 24, 2018 at 8:07 AM Craig Rodrigues <rodrigc@crodrigues.org
>> >
>> > wrote:
>> >
>> >> Ash,
>> >>
>> >> Thanks again.  You are leading me on the right path!
>> >>
>> >> I can prepare a patch to move the ssl_ options into the celery section.
>> >>
>> >> What about visbility_timeout?  The error I am getting is:
>> >>
>> >> File
>> "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/strategies.py",
>> >> line 160, in create
>> >>    engineclass.__name__))
>> >> TypeError: Invalid argument(s)
>> >> 'ssl_key','ssl_cert','ssl_active','visibility_timeout','ssl_cacert'
>> sent to
>> >> create_engine(), using configuration
>> >> MySQLDialect_mysqldb/QueuePool/Engine.  Please check that the keyword
>> >> arguments are appropriate for this combination of components.
>> >>
>> >>
>> >> So it looks like visibility_timeout does not work with sqla as well.
>> >> --
>> >> Craig
>> >>
>> >>
>> >> On Thu, May 24, 2018 at 2:17 AM Ash Berlin-Taylor <
>> >> ash_airflowlist@firemirror.com> wrote:
>> >>
>> >>> Yes, you would need to duplicate a chunk of the default_celery in your
>> >>> copy right now. But you can just make it have the values you want -
>> so it
>> >>> would be about 10 lines in total.
>> >>>
>> >>> It seems that between AIRFLOW-966 and AIRFLOW-1840 things got a little
>> >>> bit out of sync in the default .cfg and the celery .py - the .py is
>> looking
>> >>> for celery->ssl_* but the default config puts it in
>> >>> celery_broker_transport_options->ssl_*. Looking at the celery config
>> option
>> >>> it looks like they aren't actually options for the other transports,
>> it's
>> >>> just that they don't complain about the extra options. I think the
>> fix is
>> >>> just to move them up to the celery section.
>> >>>
>> >>> -ash
>> >>>
>> >>>
>> >>>> On 24 May 2018, at 07:21, Craig Rodrigues <rodrigc@crodrigues.org>
>> >>> wrote:
>> >>>>
>> >>>> Ash,
>> >>>>
>> >>>> Thanks!   You put me on the right track.
>> >>>> Unfortunately, there is a lot of logic in
>> >>>> airflow/config_templates/default_celery.py that I need,
>> >>>> and if I was to come up with my own class to replace:
>> >>>>
>> >>>> celery_config_options =
>> >>>> airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG
>> >>>>
>> >>>> then I would basically have to rewrite my own version of
>> >>> default_celery.py
>> >>>>
>> >>>> Also, in airflow/config_templates/default_airflow.cfg, there is
this:
>> >>>>
>> >>>> [celery_broker_transport_options]
>> >>>> # The visibility timeout defines the number of seconds to wait for
>> the
>> >>>> worker
>> >>>> # to acknowledge the task before the message is redelivered to
>> another
>> >>>> worker.
>> >>>> # Make sure to increase the visibility timeout to match the time
of
>> the
>> >>>> longest
>> >>>> # ETA you're planning to use. Especially important in case of using
>> >>> Redis
>> >>>> or SQS
>> >>>> visibility_timeout = 21600
>> >>>>
>> >>>> # In case of using SSL
>> >>>> ssl_active = False
>> >>>> ssl_key =
>> >>>> ssl_cert =
>> >>>> ssl_cacert =
>> >>>>
>> >>>>
>> >>>> None of those options work if using an sqla backend for celery
>> >>> broker_url.
>> >>>>
>> >>>> I need to think about this, but this needs to be cleaned up before
>> >>>> Airflow 1.10 is released.  In
>> >>> airflow/config_templates/default_airflow.cfg
>> >>>> there is this:
>> >>>>
>> >>>> broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
>> >>>>
>> >>>>
>> >>>> So sqla is specified as the default broker_url for celery.
>> >>>> A lot of people (including where I work) have used this template
to
>> set
>> >>> up
>> >>>> Airflow + Celery, and even though sqla is an "experimental"
>> broker_url,
>> >>>> it actually works pretty well.
>> >>>>
>> >>>> Now in airflow 1.10, something that was very easy to set up is now
>> >>> really
>> >>>> complicated and unintuitive.
>> >>>>
>> >>>> Would it be OK to change the airflow code so that
>> >>>> in airflow/config_templates/default_airflow.cfg, all the options
in
>> >>>> [celery_broker_transport_options] are commented out?
>> >>>> And if someone is running Redis, they would have to add those
>> >>>> options in their own airflow.cfg file?
>> >>>>
>> >>>> Bolke, do you have any comments?
>> >>>>
>> >>>> --
>> >>>> Craig
>> >>>>
>> >>>> On Tue, May 22, 2018 at 1:50 AM Ash Berlin-Taylor <
>> >>>> ash_airflowlist@firemirror.com> wrote:
>> >>>>
>> >>>>> To use with the SQLA backend to celery you need to override
the
>> options
>> >>>>> Airflow passes to Celery. Those come from
>> >>>>>
>> >>>
>> https://github.com/apache/incubator-airflow/blob/v1-10-test/airflow/config_templates/default_celery.py
>> >>>>>
>> >>>>> Since you don't want most/all of those options (and there is
no way
>> in
>> >>> the
>> >>>>> config file to _remove_ a setting) you will have to point airflow
>> to a
>> >>>>> different file for the celery config:
>> >>>>>
>> >>>>> This line in the config is what you will need to change:
>> >>>>>
>> >>>>>   # Import path for celery configuration options
>> >>>>>   celery_config_options =
>> >>>>> airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG
>> >>>>>
>> >>>>> If you create something like config/celery_config.py containing:
>> >>>>>
>> >>>>>
>> >>>>>   CELERY_CONFIG = {
>> >>>>>       # Just the options you want to set
>> >>>>>   }
>> >>>>>
>> >>>>>
>> >>>>> (config/ should exist along side your dags/ folder, and I think
it
>> >>> should
>> >>>>> be added to the python path already). You can then set this
in the
>> >>> config:
>> >>>>>
>> >>>>>   celery_config_options = celery_config.CELERY_CONFIG
>> >>>>>
>> >>>>> That should give you complete control
>> >>>>>
>> >>>>>
>> >>>
>> >>>
>>
>>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message