hadoop-general mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: [DISCUSS] Apache Hadoop 1.0?
Date Wed, 16 Nov 2011 18:15:14 GMT

On 11/16/11 9:24 AM, "Konstantin Boudnik" <cos@apache.org> wrote:

>On Wed, Nov 16, 2011 at 09:15AM, Doug Cutting wrote:
>> On 11/15/2011 06:06 PM, Konstantin Boudnik wrote:
>> > Are you suggesting to drop 0.22 out of the picture all together? Any
>> > reason for that?
>> By no means.  I thought that we might, as Scott Carey said, treat 0.22
>> as a minor release in the 1.x series.  I'd prefer that we consistently
>> rename branches (0.20.x becomes 1.0.x, 0.21.x becomes 1.1.x, etc.).
>Thanks for the explanation. I see your point in 1.?.x renames. My only
>concern is that it might suggest to the users that 1.2.0 (e.g. current
>0.22) is a sort of natural continuation from 1.0.0 (current 0.20.x) and
>that the upgrade will be easy and automatic. Which isn't necessarily the
>case, IMO.

IMO what is important from the development and maintenance perspective is
the _meaning_ of the major.minor.patch numbers as described in my previous
message.

If a minor version number bump means that it is a superset of the previous
release and is backwards compatible, then that requirement on its own
answers whether 0.22 can become 1.1, or if it must be a 2.0 release.

Whether hadoop starts using a new meaning for major.minor.patch is what is
of interest to me; starting at 1.x.y or 20.x.y or 999.x.y is marketing.

The version number is completely meaningless on its own, pure marketing.
However, if the numbers gain meaning through a clear definition of what
the major.minor.patch numbers signify, then there is meaning and structure
going forward.
The current state of affairs seems to be:
major:  always 0
minor:  potentially big changes; almost always breaks wire compatibility;
occasionally breaks API backwards compatibility
patch:  typically bug fixes only; 'bug fix' not well defined; almost never
breaks API or wire compatibility

I think the community can decide two things independently:

- Should 0.20.20x be renamed 1.0.y ?  (perhaps not, perhaps 0.23 should be
1.0 and the others left alone).
- Should Hadoop adopt a new, clear definition of the major.minor.patch
numbers?

example proposal:
* major version number increment: signifies breaks in API backwards
compatibility and/or major architecture overhauls.
* minor version number increment: signifies possible API changes, but
maintains API backwards compatibility.  Wire compatibility may break (see
release notes).  Included functionality is a superset of the previous minor
release.
* patch version number increment: signifies a release where all
improvements are fully backwards compatible with the previous patch
version, including wire format.
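
The example proposal above can be sketched in a few lines of Python. This is
only an illustration of the rule, not anything Hadoop ships: the names
`Version` and `next_version` and the three boolean flags are hypothetical,
and the sketch assumes a release manager has already decided which
compatibility class a release falls into.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A major.minor.patch version number (hypothetical helper type)."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def next_version(current: Version,
                 breaks_api: bool,
                 breaks_wire: bool,
                 architecture_overhaul: bool = False) -> Version:
    """Return the smallest version bump the example proposal would allow.

    The flags are assumptions of this sketch: they describe the release
    relative to `current`, not anything detected automatically.
    """
    if breaks_api or architecture_overhaul:
        # Breaks API backwards compatibility and/or is a major
        # architecture overhaul: major increment.
        return Version(current.major + 1, 0, 0)
    if breaks_wire:
        # API stays backwards compatible, but wire compatibility may
        # break: minor increment.
        return Version(current.major, current.minor + 1, 0)
    # Fully backwards compatible, including wire format: patch increment.
    return Version(current.major, current.minor, current.patch + 1)
```

Under these assumptions, if 0.20.x were renamed 1.0.0, a following release
that only broke wire compatibility would come out as 1.1.0, while one that
broke API backwards compatibility would have to be 2.0.0.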

Any release may contain new features or improvements, provided they don't
break the compatibility rules and the release manager approves of the
inclusion.  It is not worth defining whether a change is a 'bug fix' 'new
feature' or 'improvement' and dictating any rules based on that -- these
can often blur together and can be dealt with on a case by case basis
instead of through version rules.  IMO guiding the meaning of version
numbers by compatibility class makes the most sense.

Whatever the meaning of the numbers turns out to be will dictate whether
releases after a 1.0.x need to be 2.0.x or can be 1.1.x.

>Separating them in two major versions won't be sending such a message.
>> We're rapidly falling into the trap of putting too much significance in
>> a version number, seeking some sort of marketing boost by declaring 1.0.
>>  We can sidestep this by simply dropping the leading 0. and henceforth
>> referring to things as 20, 21, 22, etc.  This minimizes confusion, since
>> there's no significant renaming, it gets us around the marketing issue
>> of still being pre-1.0, and it keeps us from putting too much importance
>> into version numbers.
>I guess this might work too.
