hadoop-common-dev mailing list archives

From	Allen Wittenauer <...@altiscale.com>
Subject Re: Looking to a Hadoop 3 release
Date Tue, 03 Mar 2015 19:18:03 GMT
Between:

	* removing -finalize
	* breaking HDFS browsing
	* changing du’s output (in the 2.7 branch)
	* changing various names of metrics (either intentionally or otherwise)
	* changing the JDK release

	… and probably lots of other stuff in branch-2 I haven’t seen or don’t know about, our best course of action is to:

$ git rm hadoop-common-project/hadoop-common/src/site/markdown/Compatibility.md

	At least this way we as caretakers don’t come across as hypocrites.  The direction we’ve taken makes it pretty clear that we only care about API compatibility; the rest gets ignored whenever it isn’t “convenient”.  [The next time someone tells you that Hadoop is hard to operate, I want you to think about this email.]  (1)

	Making 2.7 build with JDK7 led to the *exact* situation I figured it would:  now we have
a precedent where we just say to the community “You know those guarantees?  Yeah, you might
as well ignore them because we’re going to change the core component any damn time we feel
like it.”

	We haven’t made a release branch off of trunk since branch-0.23.  If anyone thinks that’s
healthy, there is some beach property in Alberta you might be interested in as well. Our release
cycle came to a screeching halt after 0.20 and we’ve never recovered.

	However, I offer an alternative.

	This same circular argument comes up all the time: (2)

	* There aren’t enough changes in trunk to make a new branch. 
	* We can’t upgrade/change component X because there is no plan to make a new major release.

	To quote Frozen:  Let It Go

	We’re probably at the point where there aren’t likely to be many more earth-shattering changes to the Hadoop code base.  The community has instead decided to push these types of changes as separate projects via the Incubator, to avoid the committer paralysis this community suffers from.

	Because of this, I don’t think the “enough changes” argument works anymore.  Instead, we need to pick a new metric that builds a cadence and forces regular updates.  I’d offer that the roughly-every-two-years JDK EOL sets the perfect cadence, one matched by many other enterprise and OSS projects, and it gives us an opportunity to reflect in the version number that a critical component of our software has changed.

	This cadence lets people plan appropriately and know what our roadmap and direction actually are.  Folks are more likely to build “real” solutions rather than compromises that sacrifice quality in the name of compatibility simply because they don’t know when their work will actually show up.  We’ll have a normal, regular opportunity to update dependencies (regardless of the state of HADOOP-11656).

	Now, if you’ll excuse me, I have more contributors’ patches to go through.

(1) FWIW, I made the decision not to worry about backward compatibility in the shell code rewrite when I realized that the jsvc log and pid file names had been chosen too poorly to allow for certain capabilities.  Did anyone actually touch them from outside the software?  Probably not.  But they are still effectively an interface, so off to trunk it went.
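
To illustrate why a pid file name is effectively an interface, here’s a minimal sketch of the kind of init-script fragment operators end up writing against us.  The file name pattern and the pid directory default are assumptions for illustration, not something our scripts are guaranteed to emit:

	# Hypothetical operator-maintained health check; the pid file
	# naming pattern below is an assumption, not a documented contract.
	PID_FILE="${HADOOP_PID_DIR:-/tmp}/hadoop-${USER}-datanode.pid"
	if [ -f "${PID_FILE}" ] && kill -0 "$(cat "${PID_FILE}")" 2>/dev/null; then
	    echo "datanode is running (pid $(cat "${PID_FILE}"))"
	else
	    echo "datanode is not running"
	fi

Rename the pid file and every fragment like this silently breaks, which is exactly why the change went to trunk instead of branch-2.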

(2) … and that’s before we even get to the “Version numbers are cheap” arguments that
were made during the Great Renames of 0.20 and 0.23.