hadoop-general mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: [DISCUSS] Apache Hadoop 1.0?
Date Wed, 16 Nov 2011 19:57:15 GMT
On 11/16/2011 10:15 AM, Scott Carey wrote:
> IMO what is important from the development and maintenance perspective is
> the _meaning_ of the major.minor.patch numbers as described in my previous
> message.
> If a minor version number bump means that it is a superset of the previous
> release and is backwards compatible, then that requirement on its own
> answers whether 0.22 can become 1.1, or if it must be a 2.0 release.
> Whether hadoop starts using a new meaning for major.minor.patch is what is
> of interest to me; starting at 1.x.y or 20.x.y or 999.x.y is marketing.

Scott, this is a great point.  Thanks for making it.

> The version number is completely meaningless on its own, pure marketing.
> However, if the numbers gain meaning through a clear definition of what
> the major.minor.patch numbers signify, then there is meaning and structure
> going forward.
> The current state of affairs seems to be:
> major:  always 0
> minor:  potentially big changes; almost always breaks wire compatibility;
> occasionally breaks API backwards compatibility
> patch:  typically bug fixes only; 'bug fix' not well defined; almost never
> breaks API or wire compatibility

Long ago I proposed such rules for Hadoop releases at:


These state that pre-1.0 releases behave roughly as above.

> I think the community can decide two things independently:
> - Should 0.20.20x be renamed 1.0.y ?  (perhaps not, perhaps 0.23 should be
> 1.0 and the others left alone).
> - Should hadoop adopt a new clear definition of major.minor.patch number
> significance?

Would you care to call a vote on one or both of these?

> example proposal:
> * major version number increment: signifies breaks in API backwards
> compatibility and/or major architecture overhauls.
> * minor version number increment: signifies possible API changes, but
> maintains API backwards compatibility.  Wire compatibility may break (see
> release notes).  Included functionality is a superset of previous minor
> release.
> * patch version number increment: signifies a release where all
> improvements are fully backwards compatible with the previous patch
> version, including wire format.

This is also similar to what the Roadmap wiki page indicates for
post-1.0 releases.
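
To make the example proposal above concrete, here is a minimal sketch
of how its three rules would pick the next version number.  The names
(Version, next_version, and the keyword flags) are hypothetical and
exist only for illustration; nothing like this is part of Hadoop or
its release tooling, and the rules encoded are the example proposal
as quoted, not an adopted policy.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical major.minor.patch triple (illustration only)."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def next_version(current: Version, *, breaks_api: bool = False,
                 architecture_overhaul: bool = False,
                 adds_api: bool = False,
                 breaks_wire: bool = False) -> Version:
    """Pick the next version under the quoted example proposal."""
    # Major: API backwards compatibility breaks and/or a major
    # architecture overhaul.
    if breaks_api or architecture_overhaul:
        return Version(current.major + 1, 0, 0)
    # Minor: API may grow but stays backwards compatible; wire
    # compatibility is allowed to break.
    if adds_api or breaks_wire:
        return Version(current.major, current.minor + 1, 0)
    # Patch: fully backwards compatible, including wire format.
    return Version(current.major, current.minor, current.patch + 1)


if __name__ == "__main__":
    base = Version(1, 0, 3)
    print(next_version(base, breaks_wire=True))
    print(next_version(base, breaks_api=True))
    print(next_version(base))
```

Under these assumed rules, a release that only breaks wire
compatibility can follow 1.0.x as 1.1.0, while one that breaks API
backwards compatibility has to be 2.0.0.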

Renaming things after the fact to try to make them consistent when the
prior rules weren't consistently followed is not easy.  Instead we might
better focus on rules that we intend to obey for releases going forward
and then obey them.

> Whatever the meaning of the numbers turns out to be will dictate whether
> releases after a 1.0.x need to be 2.0.x or can be 1.1.x

Good point.  The most accurate approach would probably be to call each
existing branch a distinct major release.  Dropping the leading zero
would reduce confusion and avoid marketing, but would still combine
0.20.x and 0.20.20x, which perhaps ought to be considered separate major
releases.  For me, however, this is a reasonable tradeoff, since we're
better off focusing on improving things in the future than arguing about
marketing and how to hide our past versioning mistakes.

