hadoop-general mailing list archives

From Jeff Hammerbacher <ham...@cloudera.com>
Subject Re: [DISCUSSION] Proposal for making core Hadoop changes
Date Mon, 31 May 2010 17:16:50 GMT
A far more lightweight example of multi-issue feature planning in an open
source project comes from Drizzle and their "blueprints":

Each "spec" has a drafter, an approver, and an assignee; declares the other
specs on which it depends; points to the relevant branches in the source
tree and issues in the issue tracker; and has a priority, definition state,
and implementation state.
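
As a rough illustration of that structure, here is a minimal sketch of a spec record. The field names, the state values ("approved", "implemented", etc.), and the helper unmet_dependencies are my own assumptions for illustration, not Drizzle's or their tracker's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Spec:
    name: str
    drafter: str
    approver: str
    assignee: str
    priority: str               # e.g. "high" (hypothetical value)
    definition_state: str       # e.g. "approved" (hypothetical value)
    implementation_state: str   # e.g. "started", "implemented" (hypothetical)
    depends_on: List[str] = field(default_factory=list)  # names of other specs
    branches: List[str] = field(default_factory=list)    # source tree branches
    issues: List[str] = field(default_factory=list)      # issue tracker ids


def unmet_dependencies(spec: Spec, specs: Dict[str, Spec]) -> List[str]:
    """Return the dependencies of `spec` that are unknown or not yet implemented."""
    return [
        dep for dep in spec.depends_on
        if dep not in specs or specs[dep].implementation_state != "implemented"
    ]


# Hypothetical example data: spec-b depends on spec-a, which is still in progress.
specs = {
    "spec-a": Spec("spec-a", "alice", "bob", "carol", "high", "approved", "started",
                   branches=["branches/spec-a"], issues=["ISSUE-1"]),
    "spec-b": Spec("spec-b", "alice", "bob", "dave", "medium", "drafting", "not started",
                   depends_on=["spec-a"]),
}
print(unmet_dependencies(specs["spec-b"], specs))  # ['spec-a']
```

The point is only that the dependency declarations make it cheap to ask which specs are blocked on unfinished work.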

I don't know how it's working out for them in practice, but on paper it
looks quite nice.

On Wed, May 26, 2010 at 9:13 AM, Eli Collins <eli@cloudera.com> wrote:

> > No, but I'd estimate the cost of merging at 1-2 days' work a week just to
> > pull in the code *and identify why the tests are failing*. Git may be better
> > at merging in changes, but if Hadoop doesn't work on my machine after the
> > merge, I need to identify whether it's my code, the merged code, some machine
> > quirk, etc. It's the testing that is the problem for me, not the
> > merge effort. That's Hadoop's own tests and my own functional test suites,
> > the ones that bring up clusters and push work through. Those are the
> > trouble spots, as they do things that Hadoop's own tests don't do, like ask
> > for all the JSP pages.
>
> I've lived off a git branch of common/hdfs for half a year with a big
> uncommitted patch; it's nowhere near 1-2 days of effort per week to
> merge in changes from trunk. If the tests are passing on trunk, and
> they fail after your merge, then those are real test failures due to
> your change (and therefore should require effort). The issue of
> your internal tests failing due to changes on trunk is the same
> whether you merge or you just do an update - you have to update before
> checking in the patch anyway - so that issue is about the state of
> trunk when you merge or update, rather than about being on a branch.
> >
> >> Might find the
> >> following interesting:
> >> http://incubator.apache.org/learn/rules-for-revolutionaries.html
> >
> > There's a long story behind JDD's paper, I'm glad you have read it, it does
> > lay out what is effectively the ASF process for effecting significant change,
> > but it doesn't imply that's the only process for having changes.
> >
> Just to be clear, I don't mean to imply that branches are the only process
> for making changes. Interesting that this is considered the effective
> ASF process; it hasn't seemed to me that recent big features on Hadoop
> have used it. The only one I'm aware of that was done on a branch was
> append.
>
> > I think gradual evolution in trunk is good, it lets people play with what's
> > coming in. Having lots of separate branches and everyone's private release
> > being a merge of many patches that you choose is bad.
>
> Agreed.  Personally I don't think people should release from branches.
> And in practice I don't think you'll see lots of branches, people can
> and would still develop on trunk. Getting changes merged from a branch
> back to trunk before the whole branch is merged is a good thing, the
> whole branch may never be merged and that's OK too. Branches are a
> mechanism, releases are policy.
> Thanks,
> Eli
