hadoop-yarn-dev mailing list archives

From Chris Nauroth <cnaur...@hortonworks.com>
Subject Re: [DISCUSS] Developing features in branches
Date Thu, 30 Apr 2015 22:46:25 GMT
In HDFS, our recent feature branches tried to keep large portions of their
new code in new classes (e.g.
org.apache.hadoop.hdfs.server.namenode.CacheManager) or even new Java
packages (e.g. org.apache.hadoop.hdfs.server.namenode.snapshot).  We tried
to make minimal changes in existing code: just enough to hook into the new
code.  If hooking into the new code isn't easy for some reason, then
sometimes you can submit a non-impactful refactoring patch to trunk to
help make it easier.  By submitting straightforward refactorings to trunk
first, you can reduce some of the difficulty of reviewing a large
consolidated patch at merge time.  Reviewers can focus on the new logic.

This tends to minimize the impact of merge conflicts coming from either
trunk or a sibling feature branch.  This is only possible if it's a
logically distinct new feature and this kind of code organization makes
sense for that feature, but it's something to keep in mind.
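
The code organization described above can be sketched roughly as follows. This is only an illustration under stated assumptions: AuditFeature and ExistingService are hypothetical names, not Hadoop classes, and the real hook points in HDFS are more involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical new-feature class (illustrative name, not a Hadoop API).
 * All of the feature's logic lives here, so trunk changes to existing
 * classes rarely touch it.
 */
final class AuditFeature {
  private final List<String> seen = new ArrayList<>();

  /** New logic: record every path the existing code reports. */
  void onPathAccessed(String path) {
    seen.add(path);
  }

  /** Read-only view of the recorded paths, in call order. */
  List<String> accessedPaths() {
    return Collections.unmodifiableList(seen);
  }
}

/**
 * Stand-in for a pre-existing class. The feature branch adds only the
 * field, the constructor argument, and one hook call.
 */
final class ExistingService {
  private final AuditFeature audit; // added by the feature branch

  ExistingService(AuditFeature audit) {
    this.audit = audit;
  }

  String open(String path) {
    audit.onPathAccessed(path); // the single hook into the new code
    return "opened " + path;    // existing behavior, unchanged
  }
}
```

In this sketch a merge from trunk that rewrites the body of open() conflicts on at most one line, while AuditFeature itself merges cleanly because nothing on trunk knows about it.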

--Chris Nauroth

On 4/30/15, 3:23 PM, "Zhijie Shen" <zshen@hortonworks.com> wrote:

>Exactly. Branch development is good, but I am concerned about too many
>concurrent branches. In terms of code management, good candidates for
>branch development are features like the registry, shared cache and
>timeline service. Most of their changes are incremental code in a new
>sub-module, are less likely to conflict with trunk/branch-2, and are
>rarely depended on by other parallel development.
>From: Bikas Saha <bikas@hortonworks.com>
>Sent: Thursday, April 30, 2015 12:52 PM
>To: yarn-dev@hadoop.apache.org
>Subject: RE: [DISCUSS] Developing features in branches
>I think what Zhijie is talking about is a little different. Work
>happening in parallel in two branches has no clue about the other,
>since the branches don't get each other's updates via master. If a bunch
>of these branches are merged close to a release, there are likely to be
>a lot of surprises. As an example, let's say support for speculation and
>node labels were being developed in separate branches. It is very likely
>that >50% of the code would conflict - not just in code but also in
>semantics.
>-----Original Message-----
>From: Ray Chiang [mailto:rchiang@cloudera.com]
>Sent: Thursday, April 30, 2015 10:35 AM
>To: yarn-dev@hadoop.apache.org
>Subject: Re: [DISCUSS] Developing features in branches
>Following up on Zhijie's comments, there's nothing to prevent
>periodically pulling updates from the "main" branch (e.g. branch-2 or
>trunk) into the feature branch, is there?  Or cherry-picking some changes
>to alleviate conflict management during branch merging?
>I've seen other projects use one of the two techniques above.
>On Wed, Apr 29, 2015 at 9:43 PM, Zhijie Shen <zshen@hortonworks.com>
>> My 2 cents:
>> Branch maintenance cost should be fine if we have few features being
>> developed in branches. However, if there are too many, each branch
>> may be blind to most of the latest code changes in the others, and
>> trunk/branch-2 becomes stale. That is, as branch development is
>> adopted more widely, the cost of merging each branch back is likely
>> to increase.
>> Some features may span more than one release, such as RM restart
>> previously and timeline service now. Even if such a feature is
>> developed in a branch, we may want to merge its milestones, such as
>> phase 1 and phase 2, back to trunk/branch-2 to align with some release
>> before it's completely done.
>> Moreover, my experience is that the longer a feature stays in the
>> branch, the more conflicts we have to resolve. Hence, it may not be a
>> good idea to hold a feature in the branch too long before merging it.
>> Thanks,
>> Zhijie
>> ________________________________________
>> From: Subramaniam V K <subru.vk@gmail.com>
>> Sent: Wednesday, April 29, 2015 7:16 PM
>> To: yarn-dev@hadoop.apache.org
>> Subject: Re: [DISCUSS] Developing features in branches
>> Karthik, thanks for starting the thread.
>> Here's my $0.02 based on the experience of working on a feature branch
>> while adding reservations (YARN-1051).
>> Overall a +1 for the approach.
>> The couple of pain points we faced were:
>> 1) Merge cost with trunk
>> 2) Lack of CI in the feature branch
>> The migration to git and keeping the feature branch in continuous sync
>> with trunk mitigated (1), and with Allen's new test-patch.sh addressing
>> (2), feature branches, especially if used for all major features,
>> seem like an excellent choice.
>> -Subru
>> On Tue, Apr 28, 2015 at 5:47 PM, Sangjin Lee <sjlee0@gmail.com> wrote:
>> > Ah, I missed that part (obviously). Fantastic!
>> >
>> > On Tue, Apr 28, 2015 at 5:31 PM, Sean Busbey <busbey@cloudera.com> wrote:
>> >
>> > > On Apr 28, 2015 5:59 PM, "Sangjin Lee" <sjlee0@gmail.com> wrote:
>> > > >
>> > >
>> > > > That said, in a way we're deferring the cost of cleaning things up
>> > > > towards the end of the branch. For example, we don't get the same
>> > > > treatment of the hadoop jenkins in a branch development. It's left
>> > > > up to the group or the individuals to make sure to run
>> > > > test-patch.sh to ensure tech debt does not accumulate.
>> > >
>> > > As Allen previously mentioned, the QA bot will run test-patch
>> > > against feature branches so long as you name the patch file
>> > >
>> >
