hadoop-hdfs-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Todd Lipcon <t...@cloudera.com>
Subject Re: Branch for HDFS-1073 and related work
Date Wed, 30 Mar 2011 01:51:29 GMT
Hey folks,

There's been a bit of off-list discussion about this, so I wanted to
clarify a few points about the development style for this branch,
since I think there was some confusion.

First, a little bit about the motivation for following something
closer to CTR rather than RTC on the branch:

As mentioned by Konstantin and Jakob both, the HDFS-1073 project is
very complicated. To me, this means the following things:

- The changes should be reviewed by the experts in this area of the
code (in particular Konstantin, Sanjay, and Dhruba have done a lot of
work in this area, though others of course know it well too)
- The changes should be split into small patches to make the reviews
manageable. Last October I had a working prototype of the project
(albeit with a slightly different design) and it was a 200+KB patch.
It's unreasonable to expect someone to review that (in fact no one
did).

Trying to make progress on this project on the main branch over the
last several months has shown that it's very hard to achieve both of
the above, for the following reasons:

- The experts on this area of the code are very busy, so review
turnaround can be 3-4 days.
- Splitting into small patches limits how far "ahead of reviews" I can
get on development. I've been working on this on a git branch to try
to "stay ahead", but found that responding to even small review
feedback on patch #1 in a series of 8 causes patch conflicts on the
remaining 7. Rather than spending most of my time actually making
progress, the cycle has been:

1. post patch
2. wait 3 days for review
3. page back in the project, respond with simple changes (5 minutes of
actual work),
4. spend several hours rebasing remaining patches on top of changed base patch
5. rinse and repeat

Proposing a modified CTR process is an effort to cut step 2 down, and
hopefully move the "development wave-front" to be just ahead of the
reviewer, so that step 4 is completely removed.

Of course I understand that people are busy and may not get to review
every patch within a day of being posted (I myself am probably
guiltier than most on this). So, people are encouraged to review
patches even after they've been put on the branch, and the review
feedback will be addressed prior to the merge. Addressing review
feedback in a separate followup commit means that it won't invalidate
the patch series in between the two, and I think will speed up
development a lot.


To put this another way, I'm neither proposing Commit-Then-Review or
Review-Then-Commit, but rather Stage-Then-Review-Then-Merge. The
branch can be seen as a "staging ground" for a patch series eventually
targeted at trunk, just like many people use local git branches today.
The difference here is that it's making use of ASF infrastructure like
SVN and JIRA so that review and history is easier. Additionally, prior
to merge, it will be easier to do some QA since there will be an SVN
branch to point folks to.

I understand this breaks with "tradition" a bit, but we need to get
this project done, and the way it was going before just wasn't
progressing fast enough. Let's try this as an experiment, and if it
fails miserably, or the other committers don't trust the code quality,
we can always vote not to merge it into trunk.

Thanks
-Todd

On Mon, Mar 28, 2011 at 11:18 PM, Todd Lipcon <todd@cloudera.com> wrote:
> I've created this branch "HDFS-1073" as discussed. In this branch there's a
> new CHANGES.HDFS-1073.txt file -- please add entries into this file rather
> than CHANGES.txt if you commit stuff to the branch. This will simplify our
> merges later on, I think. We'll transfer the changes back into CHANGES.txt
> and delete this file when we do the final merge into trunk.
> I've also created an associated Fix Version on JIRA.
>
> Regarding review process, turnaround time in reviews has made this work
> progress slower than I'd like. For example, HDFS-1538 has had patches
> available with no comments for several months. Of course, I've had a few
> bouts where I've been quite slow to respond to feedback as well -- sorry for
> that. This is now my top priority so I should be making rapid progress on
> development from here on out.
> I agree with Jakob that this is tricky code and deserves thorough review,
> and also that it's easier to review small patches than a giant merge. So, I
> will make sure that the work going into the branch is associated with
> coherent changes, each associated with a JIRA, to aid reviewers in
> understanding the changes so much as is practical. But, I think something
> closer to commit-then-review will allow us to make progress much faster. As
> a compromise, I'll be following the following policy:
> - A patch will be uploaded to the JIRA for review like usual
> - If another committer provides a +1, it may be committed at that point,
> just like usual.
> - If no committer provides +1 (or a review asking for changes) within 24
> business hours, it will be committed to the branch under "commit then
> review" policy.
> Of course if any committer feels that code needs to be amended, he or she
> should feel free to open a new JIRA against the branch including the review
> comments, and they will be addressed before the merge into trunk. And just
> like with any branch merge, ample time will be given for the community to
> review both the large merge commit as well as the individual historical
> commits of the branch, before it goes into trunk.
> Thanks!
> -Todd
> On Mon, Mar 28, 2011 at 9:24 PM, Konstantin Shvachko <shv.hadoop@gmail.com>
> wrote:
>>
>> +1 for creating a branch.
>> Agree with Jakob this should not mean less intensive reviewing.
>> --Konstantin
>>
>> On Mon, Mar 28, 2011 at 4:39 PM, Jakob Homan <jghoman@gmail.com> wrote:
>>
>> > Doing this work on a branch makes sense.  +1.
>> >
>> > However, the patches that have been committed so far required
>> > extensive review and revision, and in one case, an addendum patch.
>> > Additionally, the patches in related JIRAs such as HDFS-1557 and
>> > HDFS-1572 have required multiple revisions and corrections.  While
>> > it's up to each committer which +1 they're willing to accept, I don't
>> > see any harm in keeping our current standards for review, considering
>> > the importance and difficulty of this work.  In fact, since moving the
>> > work to a branch will also move it off of many reviewers' radar, it
>> > may even be reasonable to increase the scrutiny.  Reviewing giant
>> > branch merges is difficult and, I believe, more error-prone than
>> > reviewing smaller packets of work.  So far these patches have received
>> > timely reviews, if this does not turn out to be the case, let other
>> > committers know so we can do our part.
>> > -jg
>> >
>> >
>> > On Mon, Mar 28, 2011 at 1:40 PM, Dhruba Borthakur <dhruba@gmail.com>
>> > wrote:
>> > > +1. I think this will be very helpful in moving the design forward
>> > quickly.
>> > >
>> > > -dhruba
>> > >
>> > >
>> > > On Mon, Mar 28, 2011 at 1:14 PM, Todd Lipcon <todd@cloudera.com>
>> > > wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >> I discussed this with a couple folks over the weekend who are
>> > >> involved
>> > in
>> > >> the project, but wanted to let the dev community at large know:
>> > >>
>> > >> I'm planning on creating a new SVN branch for HDFS-1073 and its
>> > subtasks.
>> > >> For those not aware, HDFS-1073 is a rethinking of how the NN, 2NN,
>> > >> and
>> > >> BackupNode store images and edit logs on disk. This will help make
HA
>> > more
>> > >> manageable down the line and has a lot of operational benefits as
>> > described
>> > >> in the JIRA. The "related work" is the addition of transaction IDs
to
>> > the
>> > >> persistent storage of the NN, and some refactoring in the edit log
>> > >> subsystem.
>> > >>
>> > >> The reasoning behind creating a branch is that, since this is a
>> > >> fairly
>> > >> large
>> > >> change, it is easier to develop through a number of subtasks. But at
>> > some
>> > >> of
>> > >> the intermediate points, various components will be temporarily
>> > >> broken.
>> > >> Developing on a branch will allow us to make incremental progress
>> > without
>> > >> worrying about keeping all tests green after every change. We will
of
>> > >> course
>> > >> make sure all tests pass before merging back into trunk. There will
>> > >> also
>> > be
>> > >> another opportunity to review before the merge into trunk. This is
>> > >> the
>> > same
>> > >> development methodology as was done for the 0.21 Append work and is
>> > >> now
>> > >> being used for Federation.
>> > >>
>> > >> Given that there will be another opportunity to review these changes
>> > before
>> > >> merging into trunk, I would also like to propose that Ivan Kelly be
>> > granted
>> > >> the ability to "+1" patches on this branch despite not being an HDFS
>> > >> committer. Ivan is actively contributing on this project and
>> > >> understands
>> > >> the
>> > >> code well.
>> > >>
>> > >> Unless there are any objections, I will create this branch and an
>> > >> associated
>> > >> Fix Version on JIRA tonight.
>> > >>
>> > >> Thanks!
>> > >> -Todd
>> > >> --
>> > >> Todd Lipcon
>> > >> Software Engineer, Cloudera
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Connect to me at http://www.facebook.com/dhruba
>> > >
>> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Mime
View raw message