hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: abandoning 22 - was: Content request for 0.20.205 Sustaining Release
Date Fri, 09 Sep 2011 17:03:07 GMT
On Fri, Sep 9, 2011 at 3:40 AM, Steve Loughran <stevel@apache.org> wrote:
> On 07/09/11 10:04, Konstantin Shvachko wrote:
>>
>> Eric,
>>
>> It would take the same amount of resources to fix 0.22 as to merge
>> append and security branches aka 0.20.205.
>> Although I understand that Hortonworks needs to support its
>> customer(s) and is eager to bridge the gap in functionality with its
>> competitor(s), I think continuing with 0.20
>> a-three-years-old-technology is not the best place to invest
>> resources. In the past you advocated for 0.21 and 0.22, both now
>> abandoned by your team(s) in favor of enhancing 0.20. It will be sad
>> to see this backward/forward porting going on forever, diverging the
>> Apache Hadoop project from natural evolutionary process.
>>
>> I think 0.22 has all the functionality required to run Hadoop for most
>> production tasks. I see enough momentum and involvement in the
>> community with 0.22 testing. I think there will be enough resources to
>> get it stabilized in near future.
>
> this is interesting.
>
> 1. I've been doing some 0.20.x work and hitting bugs that I know have been
> fixed in trunk a while ago but never backported as they were things that
> weren't critical enough to the people using the 20.x branches (i.e. problems
> related to my home network, issues w/ embedding the JARs, etc).
>
> This is why I have to disagree with eric14's "age is irrelevant" claim. The
> APIs show their age, so do other quirks. It's just a known set of quirks
> -like WindowsXP is today.
>
> 2. 0.23 will take a while to stabilise; a big barrier is that projects on
> top of hadoop need to test it. Bigtop can help here, but it will still take
> time.
>
> 3. Where is all maintenance of MR 1.0 code going to go? It can't go in trunk
> or 0.23, as that's on MR2.x. Should all changes to MR1.0 be backported to
> 0.20.x, and new stuff put in there?

MR1 is being maintained on 20x. In fact 20x is the only MR1 code that
supports security and disk failure handling. The MR1 code in 22 is a
regression in some significant aspects (features, performance, bugs)
from the latest stable MR1 (204).

You might find this discussion relevant:

http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-dev/201107.mbox/%3CCAPn_vTsdiiqfCB2G0HfsOr3W_4PKoocPcTf2VB93Y3MZrzRczQ@mail.gmail.com%3E

>
> Or are we declaring a complete block on all upgrade paths that don't involve
> a migration to 0.23 or staying on the 0.20.x branch -with only a subset of
> fixes and an aging API- available?

My understanding of the way Apache operates is that you can't do
things like "declare blocks on upgrade paths". People can try to
release updates to 21 or 22 (or some new tree). I.e., the decisions are
made implicitly by where people invest cycles.

Thanks,
Eli

>
> -Steve
>
