airflow-dev mailing list archives

From Jakob Homan <>
Subject Re: 1.10.0beta1 now available for download
Date Thu, 03 May 2018 14:03:01 GMT
Hey Bolke-
   I appreciate your patience; these types of discussions can
certainly drag on.  And again, I appreciate your effort to move this
release forward.

   I'd recommend bringing this discussion to the general@incubator.
There will be lots of folks there who will be ready to share an
opinion.  You're looking for a lightweight mechanism to package and
distribute some work for wider consumption in the hopes of the wider
community using it and improving it.  As a mentor, this looks a lot
to me like a release or a release process (announcing it on the
mailing list, using to host the artifacts, the
artifact being given a version and modifier without a community vote,
etc.).  I'm all for finding the way forward with this; again this is a
great effort and just needs to be done within the ASF framework.


On 2 May 2018 at 15:33, Bolke de Bruin <> wrote:
> Hi Jakob,
> I’m having the feeling we are on different wavelengths and we are not getting closer.
> Remarks inline.
>> On 2 May 2018, at 22:56, Jakob Homan <> wrote:
>> Hey Bolke-
>>  Stabilizing the tree has nothing to do with getting a release
>> through IPMC.  The IPMC doesn't test the code - it only verifies that
>> the required licenses and legal obligations are met, that the release
>> artifacts meet the requirements to be processed through ASF's
>> publishing infra, etc.  Minor issues like a couple missing headers are
>> also now generally let through (since it's an incubator release), to
>> be fixed in the next go around.  And honestly, if a podling that
>> believes it's ready to graduate is still having trouble making sure
>> correct license headers are applied, that's a giant red flag that
>> there may be large remaining Apache Way issues to address.  Podlings
>> demonstrating their ability to follow ASF process and operate in the
>> Apache Way is the crux of the incubation process.
> The header XYZ was just an example. I think we are doing reasonably
> fine on the process with a couple of hiccups here and there. And some
> discussions like this of course ;-)
>>   Also, I entirely agree with you that it's much harder than it
>> should be to get the requisite votes from the IPMC.  I do quite a lot
>> of vote munging for all the podlings with which I'm involved and it's
>> always annoying.  The IPMC, like all of ASF, is made up of volunteers
>> and is not always as responsive as it should be.
> Fully understood and thank you for the effort!
>>  The process you describe as not having merit is a large part of the
>> ASF.  Specifically, preparing a release candidate is pretty well
>> documented across ASF [1,2,3,4].  The goals you describe (stabilizing
>> a new feature like the Kubernetes executor, and rallying people to try
>> the code and fix bugs) are exactly what the RC process does as well.
>> And once the community gets experience in rolling RCs and running
>> release votes, it's actually not that much work.  Lots of projects
>> have multiple releases (main branches and bug fixes) going nearly
>> constantly.
> Here, I don’t follow you anymore. I don’t agree that the release process
> as described mentions Release Candidates at all. It mentions shepherding
> from an initial consensus to a final distribution. It doesn’t mention how to
> do this (e.g. by having a release candidate) it just mentions output criteria.
> So it is up to the release manager in consensus with the community
> how to get to a release. It is not stipulated that we need release candidates.
> The different projects seem to have different ways of shepherding.
>>  I'm not suggesting that we vote on alpha/beta releases.  I'm
>> pointing out that the goals of the alpha release as you describe them
>> match up very well with the goals of the RC process - which will need
>> to be done subsequently anyway.  I'm also saying that announcing
>> 'betas' doesn't really jibe with how ASF expects artifacts to be
>> released or voted on for release (which, if an artifact is up on
>>, it most definitely appears to be).  My suggestion
>> would be to take this very meritorious effort and make it official.
>> Pick a release manager, create a branch, roll an RC, ask for people to
>> stabilize it, merge bug fixes but not features to the branch, repeat
>> until an RC passes a vote.
> I don’t think they match. Firstly, the beta is not a release and we don’t
> intend to make it one. Secondly, a Release Candidate has a meaning
> attached to it in our community. It says: We think it is Ready for Production
> but we are not entirely sure yet (we don’t give you a support contract).
> Together with Fokko, I am Release Manager for the upcoming 1.10.0 release.
> I’m not prepared (as Release Manager) to create an RC. I think we will expose
> people to too many risks and it requires more `shepherding` before we can
> put up a vote.
> We have branched v1-10-test. When we are ready to go to Release Candidate
> we will branch off v1-10-stable from v1-10-test. We called the current state of
> v1-10-test “beta” and we made a convenience tarball. How is this different from
> nightlies and snapshots?
> So we got all your boxes ticked. Except ‘roll an RC’. We are not in that
> phase yet. Sure the process looks a bit like it (hey we practice some steps
> of the Apache Way), but it is not. I don’t think anyone is confused by that.
> Nevertheless, I’m pretty surprised (astonished even) that some projects are
> indeed voting on Alphas and Betas. They call them releases as well. That’s a lot of
> effort for an artefact that will get little exposure [1]. Tomcat votes on Alphas
> as well [2]. Still, JMeter publishes snapshots in an Apache repo [3], as do many
> others [5], but they are all Java projects.
> I still think it is a matter of semantics. The Python community at large
> does not consider an “alpha” tag or “beta” tag to be a release but a phase
> (PEP-0440; a PEP has an RFC status), although the PEP meanders a little bit
> in its wording (“Final Release”) [4].
> Apache itself does not seem to stipulate it, but its projects seem to consider an
> “alpha” or a “beta” to be a release, and thus it needs to be voted upon. However,
> those projects do have snapshots and nightlies published. These types of
> non-releases are, however, very Java-esque, as in Maven you cannot include a snapshot
> dependency when you do a release. This, however, also goes for “beta”, “alpha” and
> “rc” with Python’s package managers, as they are not considered to be a release.
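The PEP 440 point above can be illustrated with a minimal Python sketch. It assumes a simplified subset of the PEP 440 grammar (no epochs, no local version labels), and the helper name `is_prerelease` is hypothetical rather than part of any packaging tool. Under PEP 440, installers are expected to skip versions like these by default; with pip they are only considered when `--pre` is passed or a pre-release is requested explicitly.

```python
import re

# Simplified PEP 440 pattern (assumption: epochs and local versions are ignored).
# Alternate spellings such as "alpha", "beta", "c", "pre" and "preview" are
# accepted because PEP 440 normalizes them to a / b / rc.
_VERSION = re.compile(
    r'^(?P<release>\d+(?:\.\d+)*)'
    r'(?P<pre>[-_.]?(?:alpha|a|beta|b|rc|c|preview|pre)\d*)?'
    r'(?:[-_.]?post\d*)?'
    r'(?P<dev>[-_.]?dev\d*)?$',
    re.IGNORECASE,
)


def is_prerelease(version):
    """Return True if the version carries an alpha/beta/rc or dev marker."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError("not a version this sketch understands: %r" % version)
    return match.group('pre') is not None or match.group('dev') is not None


for candidate in ('1.10.0beta1', '1.10.0rc1', '1.10.0'):
    print(candidate, is_prerelease(candidate))
```

In this sketch "beta" and "rc" fall into the same bucket, which matches the observation that Python tooling treats both as phases before the final release.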
> As mentioned I am not going to tag a RC and create the artefacts for it now. I
> would feel irresponsible in doing so. We are pretty close but not there yet.
> I’m also not going to call out a vote for a pre-release beta, because before
> the vote has ended we will have another pre-release.
> So what to do? I have the feeling that we are stuck between a rock and a hard place.
> Obviously if the community thinks otherwise and I am not seeing it correctly
> I’ll step down as release manager so someone else can pick it up.
> - Bolke
> [1]
> [2]
> [3]
> [4]
> [5]
>> Thanks,
>> Jakob
>> [1]
>> [2]
>> [3]
>> [4]
>> On 2 May 2018 at 10:40, Bolke de Bruin <> wrote:
>>> Hi Jakob,
>>> This ‘release’ is not effectively a RC. We want to have the kubernetes
>>> executor stabilised or at least passing its own tests before we like to move
>>> to RC status. People also tend to rally to have some extra bugfixes in or
>>> some extra features when we announce “beta” status. Given the fact that
>>> going from 1.9 to 1.10 is a big leap I think it is important to have a
>>> period to funnel towards a RC/Release.
>>> Gotcha on httpd. However it still seems semantics to me. I would equate
>>> a Spark nightly somewhat to an Airflow alpha. A snapshot somewhat
>>> to a beta. I.e. for Airflow ‘alphas’ and ‘betas’ are not releases, not from a
>>> process perspective and not from a technical perspective.
>>> Practically, I think we need a way to stabilise the tree so we have
>>> a reasonable confidence we can pass a vote for a ‘real’ release, which is a
>>> technical vote of confidence and a process vote of confidence. Voting
>>> on alphas (equivalent to a nightly) and betas would make this a very
>>> cumbersome process. Particularly as a podling: getting 3 votes at the IPMC
>>> is a tough process (I’ve been physically going around at a conference to
>>> obtain votes last year). If we then get a “no you can’t have an alpha because
>>> header XYZ is missing” it kind of defeats the purpose of having alphas
>>> from the process side (which you are basically saying). However, it still
>>> has a technical merit.
>>> What would your suggestion be? I’m really afraid of getting stuck
>>> in process and the process, to me currently, does not seem to have the merit
>>> we are looking for*. We might have a different understanding
>>> what we consider to be a ‘release’ though. So open to suggestions
>>> (also from the wider community here :) ).
>>> Cheers
>>> Bolke
>>> * don’t misunderstand me here please, for Releases (e.g. 1.10.0 with no extra
>>> label) I’m quite okay.
>>>> On 1 May 2018, at 23:51, Jakob Homan <> wrote:
>>>> Hey-
>>>>  Correct, we can publish nightlies and SNAPSHOTs, but those are not
>>>> releases.  Also, if a community votes to consider a release alpha or
>>>> beta, it may do so (From the httpd link, "Based on the community's
>>>> confidence in the code, the potential release is tagged as alpha, beta
>>>> or general availability (GA) and the candidate and is voted in that
>>>> manner."), but this is an indicator of the technical quality of the
>>>> actual release, not the point in the release's lifecycle.
>>>>  My question is - if this  release is effectively an RC, why not
>>>> make it officially so? What's the goal of the beta compared to an RC?
>>>> As a mentor, I see an invitation for users to come and test some work
>>>> that could potentially be a release.  That's what we ask for during a
>>>> release process, along with the release manager activity, publishing
>>>> to specified locations, etc.  It would be good to demonstrate we can
>>>> do that well.
>>>> Thanks,
>>>> Jakob
>>>> On 1 May 2018 at 14:31, Bolke de Bruin <> wrote:
>>>>> Hi Jakob,
>>>>> To be honest I’m confused now. In software land (and I assume you know)
>>>>> Alpha -> Beta -> RC -> Release is well known and so well established that I would
>>>>> be surprised if anyone got confused by that. Even the oldest projects from Apache
>>>>> have alpha-s and beta-s ( and
>>>>> called GA which is equal to a release I guess.
>>>>> If you would expect people to pick up from a git tag and build from there and then report back
>>>>> to us, that doesn’t really happen. We are always having a challenge to have enough test surface,
>>>>> that would diminish that surface.
>>>>> Other projects also “publish” other than voted upon artefacts. E.g. Spark has nightly builds and SNAPSHOTS.
>>>>> A snapshot clearly has a different state than a nightly. Apache Flink states that 1.4.2 is their latest stable release.
>>>>> So there seems to be a “non-stable” release as well. I did see that their git repositories only mention “RC-X” tags
>>>>> or branches.
>>>>> Reading through
>>>>> it does not mention anywhere
>>>>> that we need to have RCs. It just states that if you want to do a release you need to call a vote and for distribution
>>>>> it must be at a certain location. As mentioned this is a “beta” which is not a “release”. We haven’t released it either as
>>>>> it wasn’t voted upon and no vote was called. It was just made available for convenience of the community.
>>>>> So I am not sure what is expected from us here. How do we go through the dev -> test -> acc -> prod release process
>>>>> together with the community? The release process you seem to be referring to is only part of the last state imho. Or
>>>>> do we need to call a vote on every state change?
>>>>> Cheers
>>>>> Bolke
>>>>>> On 1 May 2018, at 22:47, Jakob Homan <> wrote:
>>>>>> Hey Bolke-
>>>>>> To be clear, I'm not suggesting anyone is trying to do anything
>>>>>> wrong.  Release wasn't mentioned, but a new tar ball with a new
>>>>>> version number with a 'beta' tag is published in some way for people
>>>>>> to come and test.  How is that different than the expected release/RC
>>>>>> process (specify a git point, offer a tar ball, add an RCx tag and
>>>>>> invite people to test that)?  Seems like a parallel process with lots
>>>>>> of similarities that could confuse both our end users and the IPMC.
>>>>>> Thanks,
>>>>>> Jakob
>>>>>> On 1 May 2018 at 13:08, Bolke de Bruin <>
>>>>>>> Hi Jakob,
>>>>>>> Understood. But isn’t that in this case not just wording? I.e. this is a
>>>>>>> tar-ball that we think is beyond just developer testing (alpha) but more
>>>>>>> towards the enthusiasts (beta) but not a version of the tarball that is for
>>>>>>> the general public to test (RC) and not a Release (release)? I.e. is the
>>>>>>> issue in calling it a ‘release’ which in this case is just meta for a
>>>>>>> tarball? In the original email I never mentioned the word release in
>>>>>>> conjunction with the beta I think.
>>>>>>> Cheers
>>>>>>> Bolke
>>>>>>>> On 1 May 2018, at 22:01, Jakob Homan <>
>>>>>>>> Hey all-
>>>>>>>> With my Mentor hat on, I need to point out that ASF doesn't
>>>>>>>> have beta releases.  This work is awesome, but really needs to go
>>>>>>>> through the proper steps.  The Release Candidate process is pretty
>>>>>>>> well described:  This is
>>>>>>>> particularly important since, as was mentioned, graduation should be
>>>>>>>> imminent and this process will be heavily scrutinized.
>>>>>>>> -Jakob
>>>>>>>> On 1 May 2018 at 12:41, James Meickle <>
>>>>>>>>> Thanks for the pointer! I went through and set this up today, using Google
>>>>>>>>> OAuth as the RBAC provider. Overall I'm quite enthusiastic about this move,
>>>>>>>>> but I thought that it might be helpful to collect feedback as someone who
>>>>>>>>> hasn't been following the overall process and is therefore coming at it
>>>>>>>>> with fresh eyes.
>>>>>>>>> - The Flask appbuilder security documentation is poor quality (e.g.,
>>>>>>>>> there are some broken sentences); if Airflow is to send people there, it
>>>>>>>>> might be worth PRing some of the docs to at least look more professional.
>>>>>>>>> - There's not much documentation out there on how to properly set up an
>>>>>>>>> OAuth app in Google (in my case, using the G+ API). From an adoption POV,
>>>>>>>>> it would be good to screenshot the (current) steps in the process, and
>>>>>>>>> point out which values should be used in which fields on Google. For
>>>>>>>>> example, I had to grep the code base to find the callback
>>>>>>>>> - The initial login UI seems over-complex: you have to click the provider
>>>>>>>>> icon, and then click either login or register. The standard for this
>>>>>>>>> workflow is that you log in by clicking the desired provider's icon, and
>>>>>>>>> doing so will register you automatically if you aren't already. In my case
>>>>>>>>> I only have one provider, so this menu was even more
>>>>>>>>> - It was not clear to me that the "Public" role has absolutely no
>>>>>>>>> permissions. When I set this as the default role and registered, I could no
>>>>>>>>> longer access the site until I cleared cookies. I thought it was an OAuth
>>>>>>>>> error at first, but it turns out the Public role has fewer effective
>>>>>>>>> permissions than an anonymous user; this resulted in a redirect loop
>>>>>>>>> because I could not even view the homepage. I had to correct this in the
>>>>>>>>> database to be able to log in.
>>>>>>>>> - The roles list (at roles/list/ ) is intimidatingly large and hard to
>>>>>>>>> parse. For instance, I couldn't tell at a glance what "user" allows
>>>>>>>>> relative to "viewer". It would be good to have a narrative description of
>>>>>>>>> what each of these roles is intended for, and to present the list of
>>>>>>>>> permissions in a more clustered or diffable way. Permissions lists tend to
>>>>>>>>> only grow, after all.
>>>>>>>>> - A "Viewer" currently lacks enough access to see their own profile.
>>>>>>>>> - "User Statistics" (userstatschartview/chart/) uses the internal name,
>>>>>>>>> rather than firstname/lastname - which in my case is a `google_idnumber`
>>>>>>>>> name. Should probably show both names.
>>>>>>>>> Unrelatedly to RBAC (I think), on this branch on my sandbox instance, tasks
>>>>>>>>> appear to be failing with the only logs present in the UI as:
>>>>>>>>> [{'end_of_log': True}, {'end_of_log': True}, {'end_of_log':
>>>>>>>>> {'end_of_log': True}, {'end_of_log': True}, {'end_of_log':
>>>>>>>>> Finally, in case anyone else wanted to test run a similar setup, here is
>>>>>>>>> the that I ended up using (note that it has Jinja
>>>>>>>>> templating via Ansible):
>>>>>>>>> import os
>>>>>>>>> from airflow import configuration as conf
>>>>>>>>> from import AUTH_OAUTH
>>>>>>>>> basedir = os.path.abspath(os.path.dirname(__file__))
>>>>>>>>> # The SQLAlchemy connection string.
>>>>>>>>> SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')
>>>>>>>>> # Flask-WTF flag for CSRF
>>>>>>>>> CSRF_ENABLED = True
>>>>>>>>> # The name to display, e.g. "Airflow Staging Sandbox"
>>>>>>>>> APP_NAME = "Airflow {{ env }} {{ app_config | capitalize
>>>>>>>>> # Use OAuth
>>>>>>>>> # Will allow user self registration
>>>>>>>>> # The default user self registration role
>>>>>>>>> AUTH_USER_REGISTRATION_ROLE = "{{ airflow_rbac_registration_role
>>>>>>>>> default('Viewer') }}"
>>>>>>>>> # Google OAuth:
>>>>>>>>> OAUTH_PROVIDERS = [{
>>>>>>>>> # The name of the provider
>>>>>>>>> 'name': 'google',
>>>>>>>>> # The icon to use
>>>>>>>>> 'icon': 'fa-google',
>>>>>>>>> # The name of the key that the provider sends
>>>>>>>>> 'token_key': 'access_token',
>>>>>>>>> # Just in case, whitelist to only emails
>>>>>>>>> 'whitelist': [''],
>>>>>>>>> # Define the remote app:
>>>>>>>>> 'remote_app': {
>>>>>>>>> 'base_url': '',
>>>>>>>>> 'access_token_url': '',
>>>>>>>>> 'authorize_url': '',
>>>>>>>>> 'request_token_url': None,
>>>>>>>>> 'request_token_params': {
>>>>>>>>> # Uses the Google+ API, requesting the 'email' and 'profile'
>>>>>>>>> 'scope': 'email profile'
>>>>>>>>> },
>>>>>>>>> 'consumer_key': '{{ vault_airflow_google_oauth_key }}',
>>>>>>>>> 'consumer_secret': '{{ vault_airflow_google_oauth_secret
>>>>>>>>> }
>>>>>>>>> }]
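The provider block quoted above lost its indentation, several line endings and all of its URLs in the archive. As a rough sketch of the same structure in plain Python (without the Jinja/Ansible templating), the hypothetical helper below rebuilds a list shaped like `OAUTH_PROVIDERS`. Every concrete value passed to it is a placeholder, not something recovered from the original message. In a real Flask-AppBuilder config the `AUTH_OAUTH` constant is normally imported from `flask_appbuilder.security.manager`; that module path is an assumption here because the archive dropped it.

```python
def build_oauth_providers(consumer_key, consumer_secret, allowed_emails,
                          base_url, access_token_url, authorize_url):
    """Return a list shaped like the OAUTH_PROVIDERS setting quoted above.

    Hypothetical helper: the dictionary keys come from the quoted config,
    the argument values are supplied by the caller.
    """
    return [{
        # The name of the provider
        'name': 'google',
        # The icon to use
        'icon': 'fa-google',
        # The name of the key that the provider sends
        'token_key': 'access_token',
        # Restrict sign-in to the given email patterns
        'whitelist': list(allowed_emails),
        # Define the remote app
        'remote_app': {
            'base_url': base_url,
            'access_token_url': access_token_url,
            'authorize_url': authorize_url,
            'request_token_url': None,
            'request_token_params': {'scope': 'email profile'},
            'consumer_key': consumer_key,
            'consumer_secret': consumer_secret,
        },
    }]


# Placeholder values only (assumptions for illustration).
OAUTH_PROVIDERS = build_oauth_providers(
    consumer_key='example-key',
    consumer_secret='example-secret',
    allowed_emails=['@example.invalid'],
    base_url='https://oauth.example.invalid/api/',
    access_token_url='https://oauth.example.invalid/token',
    authorize_url='https://oauth.example.invalid/authorize',
)

# Mirrors the template's default of 'Viewer'; per the feedback above, a
# "Public" default led to a redirect loop after registration.
AUTH_USER_REGISTRATION_ROLE = 'Viewer'
```

Keeping the secrets as function arguments plays the same role as the vault-backed template variables in the quoted version: the structure stays in the file while the values come from elsewhere.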
>>>>>>>>> On Mon, Apr 30, 2018 at 12:54 PM, Jørn A Hansen <>
>>>>>>>>> wrote:
>>>>>>>>>> On Mon, 30 Apr 2018 at 15.56, James Meickle <>
>>>>>>>>>> wrote:
>>>>>>>>>>> Installed this off of the branch, and I do get the Kubernetes executor
>>>>>>>>>>> (incl. demo DAG) and some bug fixes - but I don't see any RBAC feature
>>>>>>>>>>> anywhere I'd think to look. Do I need to set up some config to get that to
>>>>>>>>>>> show up?
>>>>>>>>>> See
>>>>>>>>>> test/
>>>>>>>>>> It had me left wondering as well - so I decided to go hunt for it in the
>>>>>>>>>> RBAC PR. And there it was :-)
>>>>>>>>>> Cheers,
>>>>>>>>>> JornH
>>>>>>>>>>> On Mon, Apr 23, 2018 at 2:06 PM, Bolke de Bruin
>>>>>>>>>> wrote:
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>> I am really happy that Fokko and I have created the v1-10-test branch and
>>>>>>>>>>>> subsequently built the first beta of Apache Airflow 1.10!
>>>>>>>>>>>> It is available for testing here:
>>>>>>>>>>>> Highlights include:
>>>>>>>>>>>> * New RBAC web interface in beta
>>>>>>>>>>>> * Timezone support
>>>>>>>>>>>> * First class kubernetes operator
>>>>>>>>>>>> * Experimental kubernetes executor
>>>>>>>>>>>> * Documentation improvements
>>>>>>>>>>>> * Performance optimizations for large DAGs
>>>>>>>>>>>> * many GCP and S3 integration improvements
>>>>>>>>>>>> * many new operators
>>>>>>>>>>>> * many many many bug fixes
>>>>>>>>>>>> We are aiming for a fully compliant Apache release so we should be able to
>>>>>>>>>>>> kick off the graduation process after this release. I hope you help us out
>>>>>>>>>>>> getting there!
>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>> Bolke & Fokko
