hadoop-general mailing list archives

From Roman Shaposhnik <...@apache.org>
Subject Re: Streamlining the Hadoop release process
Date Thu, 25 Apr 2013 15:44:37 GMT
On Wed, Apr 24, 2013 at 8:17 PM, Konstantin Shvachko
<shv.hadoop@gmail.com> wrote:
> There was and is a number of discussions about Hadoop version
> compatibility, feature porting, stability. I think that many problems of
> Hadoop are the result of our flawed release processes and can be solved by
> streamlining the releases.

In my opinion this would be extremely useful, especially
when it comes to getting the maximum value out of the feedback that
downstream projects provide. Here's my favorite case in point:
    https://git-wip-us.apache.org/repos/asf?p=giraph.git;a=blob;f=pom.xml;h=6d2bdd1cf2399db119ab50e3dfb7a825b2930691;hb=HEAD

Just count how many Hadoop 2.X'ish profiles there are.
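
For anyone who would rather not count by hand, here is a minimal sketch of
one way to list the Hadoop-related Maven profiles in a pom.xml. It assumes
a standard Maven POM that uses the usual POM 4.0.0 namespace. The function
name, the "hadoop" substring match, and the sample profile ids below are
illustrative assumptions, not the actual contents of Giraph's pom.xml.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard Maven POM namespace; "m" is just a local prefix for the lookups.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical sample POM; the profile ids are made up for illustration.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <profiles>
    <profile><id>hadoop_example_a</id></profile>
    <profile><id>hadoop_example_b</id></profile>
    <profile><id>unrelated_profile</id></profile>
  </profiles>
</project>
"""


def hadoop_profile_ids(pom_xml: str) -> List[str]:
    """Return the ids of top-level profiles whose id mentions 'hadoop'."""
    root = ET.fromstring(pom_xml)
    ids = []
    for profile in root.findall("m:profiles/m:profile", POM_NS):
        profile_id = profile.findtext("m:id", default="", namespaces=POM_NS)
        if "hadoop" in profile_id.lower():
            ids.append(profile_id)
    return ids


if __name__ == "__main__":
    found = hadoop_profile_ids(SAMPLE_POM)
    print(len(found), "Hadoop-related profiles:", found)
```

To run it against a real file, read the file's text and pass it to
hadoop_profile_ids() in place of SAMPLE_POM.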

> This destabilizes the release canceling former efforts to fix bugs and
> provide working environment for the upstream projects. I mean stabilizing
> and adding features are mutually exclusive activities. This is in part why
> Hadoop 2 stabilization effort is perpetual.

I have a lot of sympathy for this (hence me starting that other thread).

At this point, I completely agree with you that "release early, release often"
would work well for Hadoop. I guess what I'm trying to say is that instead
of constantly trying to cram every last feature into branch-2, we should pause,
decide where we draw the line, and say that new development
from now on will be known as Hadoop 3.X (or something) and that we're willing
to spend time to get to our *first* ever officially stable Hadoop 2.X, if for
nothing else than to benefit the downstream developers.

> My practical suggestions are:
> 1. Produce a series of feature releases to catch up branch-2 with trunk.
>    We can prioritize features in general or let the release manager
> decide which feature to pick up from trunk.
>    Version numbering is also up for discussion. I would call them 2.x and
> reserve the minor numbers for subsequent stabilization bug-fix-releases.
> 2. Build new features in dev-branches until they are done. We do it now,
> but should enforce more.

Here's a practical question: suppose we do the above. IOW, from now on
all features will be strictly confined to release branches. How long do you
think it will take to unload things from trunk into branch-2? What's your
personal estimate?

> P.S. This is not to preempt discussion on stabilizing 2.0.5 started by
> Roman. I am just not sure why we call it stabilization and 2.0.5 and beta,
> if incompatible and new features have already been committed to the branch.
> BTW as an illustration to my observations above.

I believe the two are related -- basically what I'm trying to address in that
other thread is getting us to a point where we can at least agree on
the *criteria* for what beta means. Your proposal, in fact, laid out a few
things that seem to add up to those criteria (e.g. catching up with trunk).

I really wish others would chime in, since without this type of understanding
it is really difficult to make forward progress on the downstream front.

Thanks,
Roman.
