hadoop-general mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: [DISCUSSION] Proposal for making core Hadoop changes
Date Wed, 26 May 2010 16:13:31 GMT
> No, but I'd estimate the cost of merging at 1-2 days' work a week just to
> pull in the code *and identify why the tests are failing*. Git may be better
> at merging in changes, but if Hadoop doesn't work on my machine after the
> merge, I need to identify whether it's my code, the merged code, some machine
> quirk, etc. It's the testing that is the problem for me, not the
> merge effort. That's Hadoop's own tests and my own functional test suites,
> the ones that bring up clusters and push work through. Those are the
> trouble spots, as they do things that Hadoop's own tests don't do, like ask
> for all the JSP pages.

I've lived off a git branch of common/hdfs for half a year with a big
uncommitted patch, and it's nowhere near 1-2 days of effort per week to
merge in changes from trunk. If the tests are passing on trunk and
they fail after your merge, then those are real test failures due to
your change (and therefore should require effort). The issue of
your internal tests failing due to changes on trunk is the same
whether you merge or just do an update - you have to update before
checking in the patch anyway - so that issue is about the state of
trunk when you merge or update, rather than about being on a branch.

>> Might find the
>> following interesting:
>> http://incubator.apache.org/learn/rules-for-revolutionaries.html
> There's a long story behind JDD's paper. I'm glad you have read it; it does
> lay out what is effectively the ASF process for effecting significant change,
> but it doesn't imply that's the only process for having changes.

Just to be clear, I don't mean to imply that branches are the only process
for making changes. It's interesting that this is considered the effective
ASF process; it hasn't seemed to me that recent big features on Hadoop
have used it, only one I'm aware of that was done on a branch was

> I think gradual evolution in trunk is good, it lets people play with what's
> coming in. Having lots of separate branches and everyone's private release
> being a merge of many patches that you choose is bad.

Agreed. Personally, I don't think people should release from branches.
And in practice I don't think you'll see lots of branches; people can
and would still develop on trunk. Getting changes merged from a branch
back to trunk before the whole branch is merged is a good thing; the
whole branch may never be merged, and that's OK too. Branches are a
mechanism, releases are policy.

