airflow-dev mailing list archives

From Maxime Beauchemin <maximebeauche...@gmail.com>
Subject Re: Airflow 1.8.0 BETA 1
Date Fri, 20 Jan 2017 21:36:09 GMT
Hi all,

I need some input around this progressive upgrade idea I had recently.

At Airbnb we have many queues of workers, and I was entertaining the idea
of rolling out 1.8.0beta in production on a per-worker or per-queue basis
to minimize the risks around upgrading. This of course assumes that
heterogeneous versions of Airflow can live in the same cluster. Knowing that
the contract between the scheduler and the worker is pretty simple, this
may work for most upgrades where that contract isn't altered.

I'm reaching out to the community to ask whether people can think of
reasons why this would not work based on the change set between 1.7 and
1.8.  I also want to share this idea to try to prevent modifying the
scheduler/worker contract as much as possible to allow for this progressive
rollout-type deployment in the future.

Let me try to detail the scheduler/worker contract here as I understand it:
* a bash command is sent from the scheduler to the worker; that command is
of the `airflow run --local` flavor, and this format hasn't changed in
ages (shouldn't be a problem)
* both parties should agree on `are_dependencies_met`, or at least the
worker needs to answer True wherever the scheduler says True (shouldn't be
a problem)
* DAG files need to be compatible across versions (shouldn't be a problem
as we're committed to supporting backwards compatibility for DAG definitions)
* the TaskInstance model, especially around state handling, needs to be
compatible (maybe a problem). If any alembic migration has taken place, the
new table structure needs to work with the previous model; this works if we
add a column, for instance, but may not work if a column is removed. (With
the introduction of new states like SCHEDULED and changes in the dependency
engine, I'm unclear whether it's an issue.)
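
To illustrate the last point, here is a minimal sketch of why an added column is harmless to an older model while a removed one is not. It uses an in-memory SQLite table as a stand-in; the table layout, the `new_col` column and the helper names are made up for illustration and are not Airflow's actual schema or code.

```python
import sqlite3

# Columns the *old* model knows about. Hypothetical, trimmed-down stand-in
# for the real task_instance table.
OLD_MODEL_COLUMNS = ["task_id", "dag_id", "state"]


def make_table(columns):
    """Create an in-memory task_instance table with the given columns."""
    conn = sqlite3.connect(":memory:")
    ddl = ", ".join("{} TEXT".format(col) for col in columns)
    conn.execute("CREATE TABLE task_instance ({})".format(ddl))
    return conn


def old_model_can_read(conn):
    """Return True if a SELECT naming only the old model's columns succeeds."""
    query = "SELECT {} FROM task_instance".format(", ".join(OLD_MODEL_COLUMNS))
    try:
        conn.execute(query).fetchall()
        return True
    except sqlite3.OperationalError:
        # SQLite raises this when a named column no longer exists.
        return False


# Schema after a migration that only adds a column: the old model is unaffected.
after_add = make_table(OLD_MODEL_COLUMNS + ["new_col"])
# Schema after a migration that drops "state": the old model's query breaks.
after_drop = make_table(["task_id", "dag_id"])

print(old_model_can_read(after_add))   # True
print(old_model_can_read(after_drop))  # False
```

The same reasoning would apply to writes, where a new column without a default could still trip up an older worker; that case is not covered by this sketch.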

As for upgrading the scheduler in a progressive way, we may want to add
a dag_id regex matching to the scheduler subcommand so that we could have
two schedulers running on either version, but each one would be in charge
of scheduling a subset of the DAGs.
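
A rough sketch of that regex split, to make the idea concrete. Nothing like this exists in the scheduler subcommand today; the function name, the scheduler labels and the DAG ids below are all hypothetical. The one property worth checking is that every DAG is claimed by exactly one scheduler.

```python
import re


def partition_dags(dag_ids, patterns):
    """Assign each dag_id to the single scheduler whose regex matches it.

    `patterns` maps a scheduler label to a regex string. Raises ValueError
    if a dag_id is matched by no scheduler or by more than one.
    """
    compiled = {name: re.compile(rx) for name, rx in patterns.items()}
    assignment = {name: [] for name in patterns}
    for dag_id in dag_ids:
        owners = [name for name, rx in compiled.items() if rx.match(dag_id)]
        if len(owners) != 1:
            raise ValueError(
                "{!r} is matched by {} schedulers: {}".format(
                    dag_id, len(owners), owners
                )
            )
        assignment[owners[0]].append(dag_id)
    return assignment


# Hypothetical split: the 1.8 scheduler takes only DAGs prefixed "canary_",
# the 1.7 scheduler takes everything else.
patterns = {
    "scheduler_1_8": r"^canary_",
    "scheduler_1_7": r"^(?!canary_)",
}
result = partition_dags(["canary_etl", "core_metrics", "daily_report"], patterns)
print(result)
```

Failing loudly on overlaps or gaps seems safer than letting two schedulers pick up the same DAG, or letting one silently go unscheduled.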

Thoughts?

Max



On Fri, Jan 20, 2017 at 12:47 AM, Bolke de Bruin <bdbruin@gmail.com> wrote:

> 1. Always do backups
> 2. Your airflow.cfg will work, but you might want to adjust some settings
> that are new
> 3. `pip install https://dist.apache.org/repos/dist/dev/incubator/airflow/airflow-1.8.0b1+apache.incubating.tar.gz`
> should work.
>
> > On 19 Jan 2017, at 23:25, Boris Tyukin <boris@boristyukin.com> wrote:
> >
> > I'd like to test it on my VM with the code I am working on but I do not
> > know how to upgrade from 1.7. Can I use pip to pull it from GitHub? Maybe
> > someone can give me directions; I am very new to Python. Also, will it
> > mess up my airflow.cfg or something else I need to back up?
> >
> > On Wed, Jan 18, 2017 at 4:38 PM, Chris Riccomini <criccomini@apache.org>
> > wrote:
> >
> >> We are switching to 1.8.0b1 this week--both dev and prod. Will keep you
> >> posted.
> >>
> >> On Wed, Jan 18, 2017 at 11:51 AM, Alex Van Boxel <alex@vanboxel.be> wrote:
> >>
> >>> Hey Max,
> >>>
> >>> As I'm missing the 1.7.2 labels, I compared to the 172 branch. Can you have
> >>> a look at PR 2000? It's also sanitised, removing some of the commits that
> >>> don't bring value to the users.
> >>>
> >>> On Wed, Jan 18, 2017, 02:51 Maxime Beauchemin <maximebeauchemin@gmail.com> wrote:
> >>>
> >>>> Alex, for the CHANGELOG.md, I've been using `github-changes`, a JS app that
> >>>> makes changelog generation flexible and easy.
> >>>>
> >>>> https://www.npmjs.com/package/github-changes
> >>>>
> >>>> Command looks something like:
> >>>> `github-changes -o apache -r incubator-airflow --token <YOUR GH API TOKEN>
> >>>> --between-tags 1.7.2...1.8.0beta` (tags may differ; it's easy to get a
> >>>> token on your GH profile page)
> >>>>
> >>>> This will write a `CHANGELOG.md` in your cwd that you can just add on top
> >>>> of the existing one
> >>>>
> >>>> Max
> >>>>
> >>>> On Jan 17, 2017 3:37 PM, "Dan Davydov" <dan.davydov@airbnb.com.invalid> wrote:
> >>>>
> >>>>> So it is, my bad. Bad skills with ctrl-f :).
> >>>>>
> >>>>> On Tue, Jan 17, 2017 at 3:31 PM, Bolke de Bruin <bdbruin@gmail.com> wrote:
> >>>>>
> >>>>>> Arthur's change is already in!
> >>>>>>
> >>>>>> B.
> >>>>>>
> >>>>>> Sent from my iPhone
> >>>>>>
> >>>>>>> On 17 Jan 2017, at 22:20, Dan Davydov <dan.davydov@airbnb.com.INVALID> wrote:
> >>>>>>>
> >>>>>>> Would be good to cherrypick Arthur's fix into here if possible:
> >>>>>>> https://github.com/apache/incubator-airflow/pull/1973/files (commit
> >>>>>>> 43bf89d)
> >>>>>>>
> >>>>>>> The impersonation stuff should be wrapping up shortly pending Bolke's
> >>>>>>> comments.
> >>>>>>>
> >>>>>>> Also agreed with Max on the thanks. Thanks Alex too for the change log!
> >>>>>>>
> >>>>>>> On Tue, Jan 17, 2017 at 10:05 AM, Maxime Beauchemin <
> >>>>>>> maximebeauchemin@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Bolke, I couldn't thank you enough for driving the release process!
> >>>>>>>>
> >>>>>>>> I'll coordinate with the Airbnb team around impersonation/CGROUPs and on
> >>>>>>>> making sure we put this release in our staging ASAP. We have our employee
> >>>>>>>> conference this week so things are slower, but we'll be back at full speed
> >>>>>>>> Friday.
> >>>>>>>>
> >>>>>>>> Max
> >>>>>>>>
> >>>>>>>>> On Mon, Jan 16, 2017 at 3:51 PM, Alex Van Boxel <alex@vanboxel.be> wrote:
> >>>>>>>>>
> >>>>>>>>> Hey Bolke, thanks, great work. I'll handle the CHANGELOG, and add some
> >>>>>>>>> documentation about triggers with branching operators.
> >>>>>>>>>
> >>>>>>>>> About the Google Cloud Operators: I wouldn't call it feature complete...
> >>>>>>>>> it never is.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Jan 16, 2017 at 11:24 PM Bolke de Bruin <bdbruin@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Dear All,
> >>>>>>>>>>
> >>>>>>>>>> I have made the first BETA of Airflow 1.8.0 available at:
> >>>>>>>>>> https://dist.apache.org/repos/dist/dev/incubator/airflow/ , public keys
> >>>>>>>>>> are available at
> >>>>>>>>>> https://dist.apache.org/repos/dist/release/incubator/airflow/ . It is
> >>>>>>>>>> tagged with a local version “apache.incubating” so it allows upgrading from
> >>>>>>>>>> earlier releases. This beta is available for testing in a more
> >>>>>>>>>> production-like setting (acceptance environment?).
> >>>>>>>>>>
> >>>>>>>>>> I would like to encourage everyone to try it out and to report back any
> >>>>>>>>>> issues so we get to a rock-solid release of 1.8.0. When reporting issues,
> >>>>>>>>>> a test case or even a fix is highly appreciated.
> >>>>>>>>>>
> >>>>>>>>>> By moving to beta, we are also in feature freeze mode, meaning no major
> >>>>>>>>>> adjustments or additions can be made to the v1-8-test branch. There is one
> >>>>>>>>>> exception: the cgroups+impersonation patch. I was assured that it will be
> >>>>>>>>>> thoroughly tested before we merge it, so it can still enter 1.8 if
> >>>>>>>>>> within a reasonable time frame. A lot of work has gone into it and it would
> >>>>>>>>>> be a shame if we lost momentum.
> >>>>>>>>>>
> >>>>>>>>>> Finally, it would also be really nice to have some updates to the
> >>>>>>>>>> documentation. In order of importance:
> >>>>>>>>>>
> >>>>>>>>>> * UPDATING.md: What does a user need to think of when upgrading to 1.8?
> >>>>>>>>>> MySQL 5.6.4 is now the minimum required version; the scheduler now has
> >>>>>>>>>> separate logs per file processor.
> >>>>>>>>>> * docs/configuration.rst: We have many new options, especially in the
> >>>>>>>>>> scheduler area
> >>>>>>>>>> * docs/faq.rst
> >>>>>>>>>> * CHANGELOG.txt (compiled from git log)
> >>>>>>>>>> * swagger definitions for the API
> >>>>>>>>>> HIGHLIGHTS of the beta:
> >>>>>>>>>>
> >>>>>>>>>> * DAG catchup: If False the scheduler does not fill in the gaps between
> >>>>>>>>>> the start_date and the current_date. Can be specified per dag or globally
> >>>>>>>>>> * Per DAG multi processing: More robust and faster DAG processing. A
> >>>>>>>>>> faulty DAG should not take down the scheduler any more
> >>>>>>>>>> * Google Cloud Operators: Feature complete I have heard.
> >>>>>>>>>> * Time units now dynamic UI
> >>>>>>>>>> * Better SMTP handling and attachment support
> >>>>>>>>>> * Operational metrics for the scheduler
> >>>>>>>>>> * MSSQL Improvements
> >>>>>>>>>> * Experimental Rest API with Kerberos support
> >>>>>>>>>> * Auto alignment of start_date to interval
> >>>>>>>>>> * Better support for sub second scheduling
> >>>>>>>>>> * Rolling restart of web workers
> >>>>>>>>>> * nvd3.js instead of highcharts
> >>>>>>>>>> * New dependency engine making debugging why my task is running easier
> >>>>>>>>>> * Many UI updates
> >>>>>>>>>> * Many new operators
> >>>>>>>>>> * Many, many, many bugfixes
> >>>>>>>>>>
> >>>>>>>>>> RELEASE PLANNING
> >>>>>>>>>>
> >>>>>>>>>> Beta 2: 20 Jan
> >>>>>>>>>> Beta 3: 25 Jan
> >>>>>>>>>> RC1:  2 Feb
> >>>>>>>>>>
> >>>>>>>>>> Cheers
> >>>>>>>>>> Bolke
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>> _/
> >>>>>>>>> _/ Alex Van Boxel
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>
