hadoop-hdfs-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Release numbering for branch-2 releases
Date Tue, 05 Feb 2013 04:50:07 GMT
Disclaimer: personal opinions only. I just can't be bothered to subscribe
with @apache.org right now.

On 4 February 2013 14:36, Todd Lipcon <todd@cloudera.com> wrote:

> - Quality/completeness: for example, missing docs, buggy UIs, difficult
> setup/install, etc

Par for the course. Have you ever used Linux?

> - Safety: for example, potential bugs which may risk data loss

Anything that threatens data loss is a blocker, at least for data you care
about.

> - Stability: for example, potential bugs which may risk uptime

Less critical for most people, though it can cost lots of $$.

> - End-user API compatibility: will user-facing APIs change in this version?
> (affecting those who write MR jobs)

> - Framework-developer API compatibility: will YARN-internal APIs change in
> this version? (affecting those who write non-MR YARN frameworks)

Things aren't stable in 2.x there yet. YARN-117 is on my todo list, and
without that I consider it broken. The ASF hasn't shipped a non-alpha
version of this, and I don't think anyone else has made any stability
claims either. That includes CDH 4.x, where YARN was a "play if you want"
feature. Or "wide-alpha", as I viewed it.

> - Binary compatibility: can I continue to use my application (or YARN)
> framework compiled against an old version with this version, without a
> recompile?

This is one thing Computer Science has never addressed fully. The entire
computing stack has to be considered "best-effort". If there is one thing
we can do here, it is hooking up the entire set of OSS apps to the nightly
build, in a nice DAG including things like Cascading, Spring Data, etc.,
the way Apache Gump did to act as the regression test for Ant (before
Maven broke it).

> - Intra-cluster wire compatibility: can I rolling-upgrade from A to B?

The presence of the 2.0.2 alpha stuff in the field complicates things. I
know you want upgrades, and I'm sure others do too, but if that became an
approved version, there's the conflict with the "-1 version supported" rule
of wire compatibility. Does that rule get changed?

> - Client-server wire compatibility: can I use old clients to talk to an
> upgraded cluster?

IMO we should move clients off the intra-cluster protocol, get them on
WebHDFS, the hcat job APIs, and have a hard split between public and
private. That includes distcp. As WebHDFS is in 1.x+, that's the one to care
about.

> Depending on the user's expectations and needs, different factors above may
> be significantly more or less important. And different portions of the
> software may have different levels of stability in each of the areas. As
> I've mentioned in previous threads, my experiences supporting production
> Hadoop 1.x and Hadoop 2.x HDFS clusters has led me to believe that 2.x,
> while being "alpha" is significantly less prone to data loss bugs than 1.x
> in Hadoop.

I hope you are right; it's where everything is going.

> But, with some of the changes in the proposed 2.0.3-alpha, it
> wouldn't be wire-protocol-stable.

I don't know of anyone who wanted that, anyone who said "let's create chaos
and confusion"; it was just a consequence of fixing things against an alpha.

> How can we best devise a scheme that explains the various factors above in
> a more detailed way than one big red warning sticker? What of the above
> factors does the community think would be implied by "GA?"

Let's see

 $ ant -version
 Apache Ant(TM) version 1.9.0alpha compiled on November 12 2012

Yes, Ant says "anything you build locally is an alpha release".

In that context, it's no different from -SNAPSHOT except that it's easier to
field bug reports against, because they are at least replicable; things
downstream can be updated to work with the alpha and test it.

I view beta as the transition to "feature complete: bugs and regression
only", with some triage: "patches that don't cause visible regressions".

Shipping is pretty much bugs only, with serious triage; only the widely
visible things happen after that. Critical integrity and performance merit
new updates.

Security fixes: out-of-band emergency updates. This is a good reason for
leaving security out of anything: a simpler support model. Unlike Oracle, I
don't think security plugins should have side effects other than fixing the
security hole.

Maven complicates things as you can't ever undeclare a release there, not
even for security reasons. That's why ops-managed RPM and deb updates are
preferred by ops groups for rolling out new binaries of any form to a pool
of boxes, at the expense of the application having control of its classpath
(Ant has some special classpath setup to support OS-based installations).

The way I've always viewed alpha and beta tags in apache projects is this:

   - you don't care about regressions of behaviour from features that
   weren't in the previous full release
   - the way you field all bug reports is to ask "is it gone from the latest
   release on that branch?" (*)

The big change in Hadoop is the filesystem: nobody wants to lose their
data, so you do need a story to help people migrate from alpha to next
alpha, beta to next beta. What I don't see being needed is

   1. Support for upgrades from 2.x.x-alpha to anything 3.x-
   2. Freezing changes to the semantics of the user level APIs that weren't
   in the previous version.

I don't want to gratuitously break anything. It's just that releasing stuff
with the alpha tag doesn't mean "here is something that is stable and
supported by having its own branch maintained", it's "please play with this
and tell us what didn't work".

