hadoop-general mailing list archives

From Steve Loughran <steve.lough...@gmail.com>
Subject Re: Large feature development
Date Sun, 02 Sep 2012 14:58:14 GMT
On 1 September 2012 09:20, Todd Lipcon <todd@cloudera.com> wrote:

> Thanks for starting this thread, Steve. I think your points below are
> good. I've snipped most of your comment and will reply inline to one
> bit below:
> On Fri, Aug 31, 2012 at 10:07 AM, Steve Loughran
> <steve.loughran@gmail.com> wrote:
> >
> > How then do we get (a) more dev projects working and integrated by the
> > current committers, and (b) a process in which people who are not yet
> > contributors/committers can develop non-trivial changes to the project in a
> > way that it is done with the knowledge, support and mentorship of the rest
> > of the community?

Both HDFS2 and MRv2 are in trunk, therefore I consider them successes.

> Here's one proposal, making use of git as an easy way to allow
> non-committers to "commit" code while still tracking development in
> the usual places:

This is effectively what people do. I'm less worried about the code side of
things than the integration and mentoring.

> - Upon anyone's request, we create a new "Version" tag in JIRA.

-1. There are enough versions. There is a "tag" field in JIRA for precisely
this purpose.

> - The developers create an umbrella JIRA for the project, and file the
> individual work items as subtasks (either up front, or as they are
> developed if using a more iterative model)

as today

> - On the umbrella, they add a pointer to a git branch to be used as
> the staging area for the branch. As they develop each subtask, they
> can use the JIRA to discuss the development like they would with a
> normally committed JIRA, but when they feel it is ready to go (not
> requiring a +1 from any committer) they commit to their git branch
> instead of the SVN repo.

Some integration with Jenkins and pull testing would be good here.

> - When the branch is ready to merge, they can call a merge vote, which
> requires +1 from 3 committers, same as a branch being proposed by an
> existing committer. A committer would then use git-svn to merge their
> branch commit-by-commit, or if it is less extensive, simply generate a
> single big patch to commit into SVN.
> My thinking is that this would provide a low-friction way for people
> to collaborate with the community and develop in the open, without
> having to work closely with any committer to review every individual
> subtask.
> Another alternative, if people are reluctant to use git, would be to
> add a "sandbox/" repository inside our SVN, and hand out commit bit to
> branches inside there without any PMC vote. Anyone interested in
> contributing could request a branch in the sandbox, and be granted
> access as soon as they get an apache SVN account.

I don't see the technical issues with how the merge is done as the main
problem.

The barriers to getting your stuff in are:
1. Getting people to care enough to help develop the feature: mentorship,
collaborative development.
2. Getting incremental parts in, to avoid the continual
merge-regression-test hell that you go through if you are trying to keep a
separate branch alive. It's not the technical aspects of the merge so much
as the need to run all the Hadoop tests and your own test suite, and track
down whether a failure is a regression in trunk or something in your code.

Jun's patch is an example of this situation. We haven't seen the effort he
and his colleagues have put into merging and testing, but I'm confident it's
been there. What they now have is a "big bang" class of patch, one so
big that anyone reviewing it would have to spend a couple of weeks going
through the codebase trying to understand it. Which, as we all know, means
two weeks not doing all the things you are committed to doing.

We know it's there, we know it's current, so how do we use this as an
exercise in pulling something in incrementally?

