Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
MIME-Version: 1.0
In-Reply-To: <7AEF2640-75AA-4688-8788-70234915916A@gmail.com>
References: 
 <CA+RK=_D1YtO9s4KrWLPdyj0aiWd2ZReJD-HfJbSrafmbZ98O-w@mail.gmail.com>
 <CA+RK=_Agd9TgpWOap6eCzFdbktu8P=aDDf2gb8J0VFjbSAZChA@mail.gmail.com>
 <CADY20s5xZ3GXrG62w4m8bq0JsepZO1Ajm18cDMZhAjnDEPqJ4w@mail.gmail.com>
 <CADfYSXQ5TvUFiFT4m6APKJY91L2+ZnsmUr304H736_9mA5KZ+g@mail.gmail.com>
 <CAN5cbe4PPpKZPR4M6ynrr07gZLohMQb9UfbTJsHN21f0rQN4Tg@mail.gmail.com>
 <D8BA4D47-DB49-48E0-872B-789B43AE54F3@gmail.com>
 <CAHxLZBUrcwHk-+__VYPOsC2anXgYUzdB7StbRT1U0fFHop8vZg@mail.gmail.com>
 <E5AEA35C-2122-4196-A68A-1EE18B179792@gmail.com>
 <CA+RK=_D7H9voDwiTKeYV_GZkjynD3Oni_06AKzJ5Y2XGjv1DAA@mail.gmail.com>
 <CADcMMgHpQLBB+WYWUqUPFFo-xA-RZ6NQLdem8VZbMF8+6xKGZw@mail.gmail.com>
 <7AEF2640-75AA-4688-8788-70234915916A@gmail.com>
From: Jonathan Hsieh <jon@cloudera.com>
Date: Mon, 10 Aug 2015 09:53:37 -0700
Message-ID: 
 <CAAha9a0t4ZE7GbriwQm-TMthE4A=BUJja=F1-4QtRvyF7wdAxg@mail.gmail.com>
Subject: Re: [DISCUSSION] Switching from RTC to CTR
To: "private@hbase.apache.org" <private@hbase.apache.org>
Cc: HBase Dev List <dev@hbase.apache.org>
Content-Type: multipart/alternative; boundary=001a11c3675eee5cda051cf7d4ed

--001a11c3675eee5cda051cf7d4ed
Content-Type: text/plain; charset=UTF-8

Sorry to digress from the title main topic but is the concern is about
quality of the system as code enters the system or about trust?

I lean towards "trust but verify".  I'd actually argue that we should have
more infrastructure in place to catch the bugs beyond just our unit test
infrastructure.  Ideally our integration tests would be run frequently
upstream on our mainline branches.   With those in place we could find
quality issues sooner rather than later.  With integration test run infra
in place we'd have a chance at catching errors even if unit tests are
lacking.

Having more of this in place would help improve quality regardless of
which CTR or RTC schema we use because we'll have evidence of how much we
should trust and the quality of code based on execution.
Without that I think the current RTC on release branches and trunk, and CTR
for feature branches scheme the enforces "trust but verify".

Jon.

On Sun, Aug 2, 2015 at 11:59 AM, Andrew Purtell <andrew.purtell@gmail.com>
wrote:

> Agreed, committers should be spending more time reviewing others' work. We
> can ask. It may happen. It may work for a short while. It may not. Shrug.
> People will do what they want.
>
> I'm looking to make one substantial change that will allow committers to
> make progress even if there's nobody else around or interested for one
> week. It happens sometimes. I've already talked about my concerns on
> assuming a certain level of available volunteer bandwidth. Let me just say
> that things are great now, it's fantastic.
>
> We are pretty much on the honor system already. I don't buy the argument
> we can't trust that CTR or CTR after one week can work because committers,
> even if asked to customarily get a review before commit, may decide to
> check in unreviewed untested destabilizing changes. At least, In that case
> I'd argue we have a different problem. If you go back to my original mail,
> I do say we shouldn't undertake any change if we as PMC are unwilling to
> revoke the commit bit from committers who sadly demonstrate themselves
> unworthy of trust through checkins of toxic waste without review. Merit
> _can_ be un-earned. Commits can be reverted. Just because something is
> checked in does not mean it will go out in a release. We could put a
> nightly suite of regression tests in place. Both proposals I have made,
> especially the latter, are not a binary release of any attempt at quality
> control. We would have all of the consensus expectations of good committer
> behavior still in place.
>
> Let's assume someone checks in something without getting a review today.
> What would happen? Someone else would revert it and we'd have a discussion.
> If we were operating under CTR-after-one-week (or plain CTR) but with
> documented expectations that someone get a review first, what changes, in
> terms of quality control and project discipline? Maybe the difference is we
> could more easily justify revoking commit privileges? Maybe. I think every
> discussion like that would be strictly case by case though, a conversation
> between that person and the PMC, justification for revoking committer
> status wouldn't be rule based.
>
> What definitely would change is my well tested good faith change waiting
> for review after one week could go in no matter who's on vacation or
> whatever. Not saying this is a problem for me today. Like I said, things
> are great around here today. Awesome.
>
>
> > On Aug 2, 2015, at 10:52 AM, Stack <stack@duboce.net> wrote:
> >
> > On Fri, Jul 31, 2015 at 3:15 AM, Andrew Purtell <
> andrew.purtell@gmail.com>
> > wrote:
> >
> >> I appreciate very much the earlier feedback about switching from RTC to
> >> CTR. It helped me think about the essential thing I was after.
> >>
> >> I'm thinking of making a formal proposal to adopt this, with a VOTE:
> >>
> >>> After posting a patch to JIRA, after one week if there is no review or
> >> veto, a committer can commit their own work.
> >>
> >> (On a bit of a lag...)
> >
> > What is the problem we are trying to solve? Blocked committers unable to
> > get a review?
> >
> > Any commit can destabilize. Commits without peer review will destabilize
> > more than those that have been reviewed. As a community, I'd think that
> > anything we can do to act against entropy in the system would be primary
> > above all other considerations especially given our project a mature
> > datastore.
> >
> > In my own case, it seems to take me tens of patch postings to arrive at
> > something a fellow reviewer thinks worthy of commit when working with a
> > peer on an issue of substance. Often the final patch is radically
> different
> > from first posting. If auto-commit, the first version would just go in.
> > Unless review, I'd be 'done'. If review, each iteration would be in the
> > code base? Churn would go up. Commits would be more and smaller. Our
> > project would be made of less coherent 'change sets'.
> >
> > I'd be with LarsH suggestion that committers should be allowed go ahead
> > with one-liners or more substantial, non-controversial changes in site or
> > doc but beyond this, I'd be against auto-commit.
> >
> >>
> >>
> >> Also, looking at https://hbase.apache.org/book.html#_decisions, I don't
> >> think the "patch +1 policy" should remain because the trial OWNERS
> concept
> >> hasn't worked out, IMHO. The OWNERS concept requires a set of constantly
> >> present and engaged owners, a resource demand that's hard to square with
> >> the volunteer nature of our community. The amount of time any committer
> or
> >> PMC member has on this project is highly variable day to day and week to
> >> week.  I'm also thinking of calling a VOTE to significantly revise or
> >> strike this section.
> > Yeah. OWNERS project failed. Its ghost is in effect in that we each go to
> > the domain expert when its clear (e.g. Matteo or Stephen on pv2).
> >
> >
> >> Both of these things have a common root: Volunteer time is a very
> precious
> >> commodity. Our community's supply of volunteer time fluctuates. I would
> >> like to see committers be able to make progress with their own work
> even in
> >> periods when volunteer time is in very short supply, or when they are
> >> working on niche concerns that simply do not draw sufficient interest
> from
> >> other committers. (This is different from work that people think isn't
> >> appropriate - in that case ignoring it so it will go away would no
> longer
> >> be an option, a veto would be required if you want to stop something.)
> > Committers should be spending more time reviewing their peers work (and
> > just as importantly, the work of contributors)?
> > St.Ack
> >
> >
> >
> >> On Wed, Jul 29, 2015 at 3:56 PM, Andrew Purtell <
> andrew.purtell@gmail.com>
> >> wrote:
> >>
> >>> Had this thought after getting back on the road. As an alternative to
> any
> >>> sweeping change we could do one incremental but very significant thing
> >> that
> >>> acknowledges our status as trusted and busy peers: After posting a
> patch
> >> to
> >>> JIRA, after one week if there is no review or veto, a committer can
> >> commit
> >>> their own work.
> >>>
> >>>
> >>>>> On Jul 29, 2015, at 2:20 PM, Mikhail Antonov <olorinbant@gmail.com>
> >>>> wrote:
> >>>>
> >>>> Just curious, I assume if this change is made, would it only apply to
> >>>> master branch?
> >>>>
> >>>> -Mikhail
> >>>>
> >>>> On Wed, Jul 29, 2015 at 2:09 PM, Andrew Purtell
> >>>> <andrew.purtell@gmail.com> wrote:
> >>>>> @dev is now CCed
> >>>>>
> >>>>> I didn't want to over structure the discussion with too much detail
> up
> >>> front. I do think CTR without supporting process or boundaries can be
> >> more
> >>> problematic than helpful. That could take the form of customarily
> waiting
> >>> for reviews before commit even under a CTR regime. I think this
> >> discussion
> >>> has been great so far. I would also add that CTR moves 'R' from a
> gating
> >>> requirement per commit (which can be hard to overcome for niche areas
> or
> >>> when volunteer resources are less available) more to RMs. will be back
> >>> later with more.
> >>>>>
> >>>>>
> >>>>>> On Jul 29, 2015, at 1:36 PM, Sean Busbey <sean.busbey@gmail.com>
> >>> wrote:
> >>>>>>
> >>>>>> I'd also favor having this discussion on dev@.
> >>>>>>
> >>>>>>> On Wed, Jul 29, 2015 at 2:29 PM, Gary Helmling <
> ghelmling@gmail.com
> >>>
> >>> wrote:
> >>>>>>>
> >>>>>>> This is already a really interesting and meaningful discussion, and
> >> is
> >>>>>>> obviously important to the community.
> >>>>>>>
> >>>>>>> Is there any reason not to move this straight to the dev@ list?
> >>>>>>>
> >>>>>>>> On Wed, Jul 29, 2015 at 11:40 AM Todd Lipcon <todd@cloudera.com>
> >>> wrote:
> >>>>>>>>
> >>>>>>>> I'm not very active in HBase these days, so I won't cast a
> non-zero
> >>> vote,
> >>>>>>>> but I'm -0 on this idea, for basically two reasons:
> >>>>>>>>
> >>>>>>>> 1) In my experience at a past job which used CTR, the reality was
> >>> that it
> >>>>>>>> was more like "Commit and maybe review" rather than "Commit then
> >>> review".
> >>>>>>>> It's always more fun (and often easier) to write new code than to
> >>> review
> >>>>>>>> other people's code, so if there isn't a requirement that all code
> >>> gets
> >>>>>>>> reviewed before commit, it's easy for the ratio of code written to
> >>> code
> >>>>>>>> reviewed to tend towards values significantly greater than 1:1
> over
> >>> time.
> >>>>>>>> At my past job, this meant that a lot of code made it into
> >>> production (it
> >>>>>>>> was a website) that hadn't ever been reviewed, and in many cases
> we
> >>> found
> >>>>>>>> bugs (or other issues) that would have definitely been caught by a
> >>> good
> >>>>>>>> code reviewer.
> >>>>>>>>
> >>>>>>>> 2) CTR has both the advantage and disadvantage of allowing areas
> of
> >>> code
> >>>>>>> to
> >>>>>>>> be evolved quickly by a single person. That could be seen as a
> >> plus,
> >>> in
> >>>>>>>> that there are some areas which tend to get ignored because we
> >> don't
> >>>>>>> have a
> >>>>>>>> critical mass of people reviewing patches in the area -- patches
> >>> languish
> >>>>>>>> in these areas currently. However, that could also be seen as a
> >> good
> >>>>>>> thing
> >>>>>>>> from a "community over code" perspective -- it's better to not
> have
> >>> any
> >>>>>>>> areas of code with bus-factor 1. Feeling the pain of not getting
> >>> reviews
> >>>>>>> in
> >>>>>>>> these areas with only a single active committer encourages us to
> >> find
> >>>>>>>> solutions - either by deprecating "niche" features (as we once did
> >>> with
> >>>>>>>> various 'contrib' projects) or by recruiting new committers who
> >> have
> >>>>>>>> interest in maintaining that code area. Lifting review
> restrictions
> >>> would
> >>>>>>>> allow us to sweep bus-factor issues under the rug.
> >>>>>>>>
> >>>>>>>> That said, I think CTR can be a valuable tool for "move fast on
> new
> >>>>>>>> experimentation" type projects -- I've participated in several
> >>> feature
> >>>>>>>> branches in HDFS where we operated on CTR on the branch. The
> >>> assumption
> >>>>>>> was
> >>>>>>>> that, prior to merging the branch into trunk, all of the patches
> >> (or
> >>> a
> >>>>>>>> consolidated mega-patch) would be thoroughly reviewed by several
> >>> other
> >>>>>>>> committers. I found this to work very well during early
> >> development,
> >>>>>>> since
> >>>>>>>> I could hack on things and even commit pieces of code that had
> >> known
> >>>>>>>> issues/TODOs. For trickier patches on the CTR branch, I still
> >> tended
> >>> to
> >>>>>>>> ping experienced contributors for their opinions and feedback
> >> before
> >>>>>>>> committing.
> >>>>>>>>
> >>>>>>>> Perhaps such a hybrid policy would work well in the context of
> >> HBase
> >>>>>>>> feature development as well?
> >>>>>>>>
> >>>>>>>> -Todd
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Jul 29, 2015 at 11:27 AM, Andrew Purtell <
> >>> apurtell@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Just thought of branch merge votes. Sorry for that omission. I
> >>> think it
> >>>>>>>>> makes sense to keep those, but let's discuss that too in this
> >>> context.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, Jul 29, 2015 at 11:26 AM, Andrew Purtell <
> >>> apurtell@apache.org>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> As Greg Stein said on a thread over at general@incubator
> >>>>>>>> ("[DISCUSSION]
> >>>>>>>>>> Graduate Ignite from the Apache Incubator"):
> >>>>>>>>>>
> >>>>>>>>>> I always translate RTC as "we don't trust you, so somebody else
> >>> must
> >>>>>>>>>> approve anything you do." To me, that is a lousy basis for
> >>> creating a
> >>>>>>>>>> community. Trust and peer respect should be the basis, which
> >>> implies
> >>>>>>>> CTR.
> >>>>>>>>>>
> >>>>>>>>>> I happen to agree with this sentiment and would like to propose
> >>>>>>>> switching
> >>>>>>>>>> our consensus on commit procedure from RTC to CTR. Concurrently,
> >> I
> >>>>>>>> would
> >>>>>>>>>> like to propose this consensus also include the following:
> >>>>>>>>>> - Checkins should pass precommit or the committer should explain
> >> on
> >>>>>>> the
> >>>>>>>>>> JIRA why precommit failures are in their best judgement not
> >> related
> >>>>>>>>>> - The PMC should be willing to, ultimately, revoke committership
> >>>>>>> should
> >>>>>>>>>> trust in any individual committer be discovered to be misplaced.
> >> I
> >>>>>>>> would
> >>>>>>>>>> expect such a last resort to be exceedingly rare and likely
> never
> >>> to
> >>>>>>>>> happen
> >>>>>>>>>> because of course long before that we would be setting correct
> >>> public
> >>>>>>>>>> examples in the first place and respectfully counseling well
> >>>>>>>> intentioned
> >>>>>>>>>> committers who might stray.
> >>>>>>>>>>
> >>>>>>>>>> Depending on how the conversation proceeds here I would like to
> >>>>>>> include
> >>>>>>>>>> dev@ in this conversation at the earliest opportunity as well.
> >>>>>>>>>>
> >>>>>>>>>> Thoughts?
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best regards,
> >>>>>>>>>>
> >>>>>>>>>> - Andy
> >>>>>>>>>>
> >>>>>>>>>> Problems worthy of attack prove their worth by hitting back. -
> >> Piet
> >>>>>>>> Hein
> >>>>>>>>>> (via Tom White)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Best regards,
> >>>>>>>>>
> >>>>>>>>> - Andy
> >>>>>>>>>
> >>>>>>>>> Problems worthy of attack prove their worth by hitting back. -
> >> Piet
> >>>>>>> Hein
> >>>>>>>>> (via Tom White)
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Todd Lipcon
> >>>>>>>> Software Engineer, Cloudera
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Sean
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Thanks,
> >>>> Michael Antonov
> >>
>


-- 
// Jonathan Hsieh (shay)
// HBase Tech Lead, Software Engineer, Cloudera
// jon@cloudera.com // @jmhsieh

--001a11c3675eee5cda051cf7d4ed--