Return-Path: X-Original-To: apmail-hbase-dev-archive@www.apache.org Delivered-To: apmail-hbase-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A44A917476 for ; Mon, 10 Aug 2015 22:50:22 +0000 (UTC) Received: (qmail 73729 invoked by uid 500); 10 Aug 2015 22:50:21 -0000 Delivered-To: apmail-hbase-dev-archive@hbase.apache.org Received: (qmail 73640 invoked by uid 500); 10 Aug 2015 22:50:21 -0000 Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hbase.apache.org Delivered-To: mailing list dev@hbase.apache.org Received: (qmail 73620 invoked by uid 99); 10 Aug 2015 22:50:21 -0000 Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 10 Aug 2015 22:50:21 +0000 Received: from mail-la0-f41.google.com (mail-la0-f41.google.com [209.85.215.41]) by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id EC01A1A01A8; Mon, 10 Aug 2015 22:50:20 +0000 (UTC) Received: by lahi9 with SMTP id i9so31217983lah.2; Mon, 10 Aug 2015 15:50:19 -0700 (PDT) X-Received: by 10.112.148.162 with SMTP id tt2mr6849866lbb.121.1439247019600; Mon, 10 Aug 2015 15:50:19 -0700 (PDT) MIME-Version: 1.0 Received: by 10.25.28.138 with HTTP; Mon, 10 Aug 2015 15:49:39 -0700 (PDT) In-Reply-To: References: <7AEF2640-75AA-4688-8788-70234915916A@gmail.com> From: Andrew Purtell Date: Mon, 10 Aug 2015 15:49:39 -0700 Message-ID: Subject: Re: [DISCUSSION] Switching from RTC to CTR To: "private@hbase.apache.org" Cc: HBase Dev List Content-Type: multipart/alternative; boundary=047d7b3440aa58f0a2051cfccfda --047d7b3440aa58f0a2051cfccfda Content-Type: text/plain; charset=UTF-8 I don't think that's a digression from the discussion. I also think frequent integration tests will be useful whether RTC or CTR, just as a culture of code reviews is useful whether RTC or CTR. We could also look at resurrecting the static checking work. (Well, I guess that would be on me, HBASE-13365, etc.) When I was working on HBASE-13825 it occurred to me it should be easy to make a static check for invocation of a builder's mergeFrom or mergeDelimitedFrom, and make that a warning. This is one of those numerous details code reviewers should have in mind but won't manage it consistently because they change over time and vary by code line. On Mon, Aug 10, 2015 at 9:53 AM, Jonathan Hsieh wrote: > Sorry to digress from the title main topic but is the concern is about > quality of the system as code enters the system or about trust? > > I lean towards "trust but verify". I'd actually argue that we should have > more infrastructure in place to catch the bugs beyond just our unit test > infrastructure. Ideally our integration tests would be run frequently > upstream on our mainline branches. With those in place we could find > quality issues sooner rather than later. With integration test run infra > in place we'd have a chance at catching errors even if unit tests are > lacking. > > Having more of this in place would help improve quality regardless of > which CTR or RTC schema we use because we'll have evidence of how much we > should trust and the quality of code based on execution. > Without that I think the current RTC on release branches and trunk, and CTR > for feature branches scheme the enforces "trust but verify". > > Jon. > > On Sun, Aug 2, 2015 at 11:59 AM, Andrew Purtell > wrote: > > > Agreed, committers should be spending more time reviewing others' work. > We > > can ask. It may happen. It may work for a short while. It may not. Shrug. > > People will do what they want. > > > > I'm looking to make one substantial change that will allow committers to > > make progress even if there's nobody else around or interested for one > > week. It happens sometimes. I've already talked about my concerns on > > assuming a certain level of available volunteer bandwidth. Let me just > say > > that things are great now, it's fantastic. > > > > We are pretty much on the honor system already. I don't buy the argument > > we can't trust that CTR or CTR after one week can work because > committers, > > even if asked to customarily get a review before commit, may decide to > > check in unreviewed untested destabilizing changes. At least, In that > case > > I'd argue we have a different problem. If you go back to my original > mail, > > I do say we shouldn't undertake any change if we as PMC are unwilling to > > revoke the commit bit from committers who sadly demonstrate themselves > > unworthy of trust through checkins of toxic waste without review. Merit > > _can_ be un-earned. Commits can be reverted. Just because something is > > checked in does not mean it will go out in a release. We could put a > > nightly suite of regression tests in place. Both proposals I have made, > > especially the latter, are not a binary release of any attempt at quality > > control. We would have all of the consensus expectations of good > committer > > behavior still in place. > > > > Let's assume someone checks in something without getting a review today. > > What would happen? Someone else would revert it and we'd have a > discussion. > > If we were operating under CTR-after-one-week (or plain CTR) but with > > documented expectations that someone get a review first, what changes, in > > terms of quality control and project discipline? Maybe the difference is > we > > could more easily justify revoking commit privileges? Maybe. I think > every > > discussion like that would be strictly case by case though, a > conversation > > between that person and the PMC, justification for revoking committer > > status wouldn't be rule based. > > > > What definitely would change is my well tested good faith change waiting > > for review after one week could go in no matter who's on vacation or > > whatever. Not saying this is a problem for me today. Like I said, things > > are great around here today. Awesome. > > > > > > > On Aug 2, 2015, at 10:52 AM, Stack wrote: > > > > > > On Fri, Jul 31, 2015 at 3:15 AM, Andrew Purtell < > > andrew.purtell@gmail.com> > > > wrote: > > > > > >> I appreciate very much the earlier feedback about switching from RTC > to > > >> CTR. It helped me think about the essential thing I was after. > > >> > > >> I'm thinking of making a formal proposal to adopt this, with a VOTE: > > >> > > >>> After posting a patch to JIRA, after one week if there is no review > or > > >> veto, a committer can commit their own work. > > >> > > >> (On a bit of a lag...) > > > > > > What is the problem we are trying to solve? Blocked committers unable > to > > > get a review? > > > > > > Any commit can destabilize. Commits without peer review will > destabilize > > > more than those that have been reviewed. As a community, I'd think that > > > anything we can do to act against entropy in the system would be > primary > > > above all other considerations especially given our project a mature > > > datastore. > > > > > > In my own case, it seems to take me tens of patch postings to arrive at > > > something a fellow reviewer thinks worthy of commit when working with a > > > peer on an issue of substance. Often the final patch is radically > > different > > > from first posting. If auto-commit, the first version would just go in. > > > Unless review, I'd be 'done'. If review, each iteration would be in the > > > code base? Churn would go up. Commits would be more and smaller. Our > > > project would be made of less coherent 'change sets'. > > > > > > I'd be with LarsH suggestion that committers should be allowed go ahead > > > with one-liners or more substantial, non-controversial changes in site > or > > > doc but beyond this, I'd be against auto-commit. > > > > > >> > > >> > > >> Also, looking at https://hbase.apache.org/book.html#_decisions, I > don't > > >> think the "patch +1 policy" should remain because the trial OWNERS > > concept > > >> hasn't worked out, IMHO. The OWNERS concept requires a set of > constantly > > >> present and engaged owners, a resource demand that's hard to square > with > > >> the volunteer nature of our community. The amount of time any > committer > > or > > >> PMC member has on this project is highly variable day to day and week > to > > >> week. I'm also thinking of calling a VOTE to significantly revise or > > >> strike this section. > > > Yeah. OWNERS project failed. Its ghost is in effect in that we each go > to > > > the domain expert when its clear (e.g. Matteo or Stephen on pv2). > > > > > > > > >> Both of these things have a common root: Volunteer time is a very > > precious > > >> commodity. Our community's supply of volunteer time fluctuates. I > would > > >> like to see committers be able to make progress with their own work > > even in > > >> periods when volunteer time is in very short supply, or when they are > > >> working on niche concerns that simply do not draw sufficient interest > > from > > >> other committers. (This is different from work that people think isn't > > >> appropriate - in that case ignoring it so it will go away would no > > longer > > >> be an option, a veto would be required if you want to stop something.) > > > Committers should be spending more time reviewing their peers work (and > > > just as importantly, the work of contributors)? > > > St.Ack > > > > > > > > > > > >> On Wed, Jul 29, 2015 at 3:56 PM, Andrew Purtell < > > andrew.purtell@gmail.com> > > >> wrote: > > >> > > >>> Had this thought after getting back on the road. As an alternative to > > any > > >>> sweeping change we could do one incremental but very significant > thing > > >> that > > >>> acknowledges our status as trusted and busy peers: After posting a > > patch > > >> to > > >>> JIRA, after one week if there is no review or veto, a committer can > > >> commit > > >>> their own work. > > >>> > > >>> > > >>>>> On Jul 29, 2015, at 2:20 PM, Mikhail Antonov > > > >>>> wrote: > > >>>> > > >>>> Just curious, I assume if this change is made, would it only apply > to > > >>>> master branch? > > >>>> > > >>>> -Mikhail > > >>>> > > >>>> On Wed, Jul 29, 2015 at 2:09 PM, Andrew Purtell > > >>>> wrote: > > >>>>> @dev is now CCed > > >>>>> > > >>>>> I didn't want to over structure the discussion with too much detail > > up > > >>> front. I do think CTR without supporting process or boundaries can be > > >> more > > >>> problematic than helpful. That could take the form of customarily > > waiting > > >>> for reviews before commit even under a CTR regime. I think this > > >> discussion > > >>> has been great so far. I would also add that CTR moves 'R' from a > > gating > > >>> requirement per commit (which can be hard to overcome for niche areas > > or > > >>> when volunteer resources are less available) more to RMs. will be > back > > >>> later with more. > > >>>>> > > >>>>> > > >>>>>> On Jul 29, 2015, at 1:36 PM, Sean Busbey > > >>> wrote: > > >>>>>> > > >>>>>> I'd also favor having this discussion on dev@. > > >>>>>> > > >>>>>>> On Wed, Jul 29, 2015 at 2:29 PM, Gary Helmling < > > ghelmling@gmail.com > > >>> > > >>> wrote: > > >>>>>>> > > >>>>>>> This is already a really interesting and meaningful discussion, > and > > >> is > > >>>>>>> obviously important to the community. > > >>>>>>> > > >>>>>>> Is there any reason not to move this straight to the dev@ list? > > >>>>>>> > > >>>>>>>> On Wed, Jul 29, 2015 at 11:40 AM Todd Lipcon > > > >>> wrote: > > >>>>>>>> > > >>>>>>>> I'm not very active in HBase these days, so I won't cast a > > non-zero > > >>> vote, > > >>>>>>>> but I'm -0 on this idea, for basically two reasons: > > >>>>>>>> > > >>>>>>>> 1) In my experience at a past job which used CTR, the reality > was > > >>> that it > > >>>>>>>> was more like "Commit and maybe review" rather than "Commit then > > >>> review". > > >>>>>>>> It's always more fun (and often easier) to write new code than > to > > >>> review > > >>>>>>>> other people's code, so if there isn't a requirement that all > code > > >>> gets > > >>>>>>>> reviewed before commit, it's easy for the ratio of code written > to > > >>> code > > >>>>>>>> reviewed to tend towards values significantly greater than 1:1 > > over > > >>> time. > > >>>>>>>> At my past job, this meant that a lot of code made it into > > >>> production (it > > >>>>>>>> was a website) that hadn't ever been reviewed, and in many cases > > we > > >>> found > > >>>>>>>> bugs (or other issues) that would have definitely been caught > by a > > >>> good > > >>>>>>>> code reviewer. > > >>>>>>>> > > >>>>>>>> 2) CTR has both the advantage and disadvantage of allowing areas > > of > > >>> code > > >>>>>>> to > > >>>>>>>> be evolved quickly by a single person. That could be seen as a > > >> plus, > > >>> in > > >>>>>>>> that there are some areas which tend to get ignored because we > > >> don't > > >>>>>>> have a > > >>>>>>>> critical mass of people reviewing patches in the area -- patches > > >>> languish > > >>>>>>>> in these areas currently. However, that could also be seen as a > > >> good > > >>>>>>> thing > > >>>>>>>> from a "community over code" perspective -- it's better to not > > have > > >>> any > > >>>>>>>> areas of code with bus-factor 1. Feeling the pain of not getting > > >>> reviews > > >>>>>>> in > > >>>>>>>> these areas with only a single active committer encourages us to > > >> find > > >>>>>>>> solutions - either by deprecating "niche" features (as we once > did > > >>> with > > >>>>>>>> various 'contrib' projects) or by recruiting new committers who > > >> have > > >>>>>>>> interest in maintaining that code area. Lifting review > > restrictions > > >>> would > > >>>>>>>> allow us to sweep bus-factor issues under the rug. > > >>>>>>>> > > >>>>>>>> That said, I think CTR can be a valuable tool for "move fast on > > new > > >>>>>>>> experimentation" type projects -- I've participated in several > > >>> feature > > >>>>>>>> branches in HDFS where we operated on CTR on the branch. The > > >>> assumption > > >>>>>>> was > > >>>>>>>> that, prior to merging the branch into trunk, all of the patches > > >> (or > > >>> a > > >>>>>>>> consolidated mega-patch) would be thoroughly reviewed by several > > >>> other > > >>>>>>>> committers. I found this to work very well during early > > >> development, > > >>>>>>> since > > >>>>>>>> I could hack on things and even commit pieces of code that had > > >> known > > >>>>>>>> issues/TODOs. For trickier patches on the CTR branch, I still > > >> tended > > >>> to > > >>>>>>>> ping experienced contributors for their opinions and feedback > > >> before > > >>>>>>>> committing. > > >>>>>>>> > > >>>>>>>> Perhaps such a hybrid policy would work well in the context of > > >> HBase > > >>>>>>>> feature development as well? > > >>>>>>>> > > >>>>>>>> -Todd > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> On Wed, Jul 29, 2015 at 11:27 AM, Andrew Purtell < > > >>> apurtell@apache.org> > > >>>>>>>> wrote: > > >>>>>>>> > > >>>>>>>>> Just thought of branch merge votes. Sorry for that omission. I > > >>> think it > > >>>>>>>>> makes sense to keep those, but let's discuss that too in this > > >>> context. > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> On Wed, Jul 29, 2015 at 11:26 AM, Andrew Purtell < > > >>> apurtell@apache.org> > > >>>>>>>>> wrote: > > >>>>>>>>> > > >>>>>>>>>> As Greg Stein said on a thread over at general@incubator > > >>>>>>>> ("[DISCUSSION] > > >>>>>>>>>> Graduate Ignite from the Apache Incubator"): > > >>>>>>>>>> > > >>>>>>>>>> I always translate RTC as "we don't trust you, so somebody > else > > >>> must > > >>>>>>>>>> approve anything you do." To me, that is a lousy basis for > > >>> creating a > > >>>>>>>>>> community. Trust and peer respect should be the basis, which > > >>> implies > > >>>>>>>> CTR. > > >>>>>>>>>> > > >>>>>>>>>> I happen to agree with this sentiment and would like to > propose > > >>>>>>>> switching > > >>>>>>>>>> our consensus on commit procedure from RTC to CTR. > Concurrently, > > >> I > > >>>>>>>> would > > >>>>>>>>>> like to propose this consensus also include the following: > > >>>>>>>>>> - Checkins should pass precommit or the committer should > explain > > >> on > > >>>>>>> the > > >>>>>>>>>> JIRA why precommit failures are in their best judgement not > > >> related > > >>>>>>>>>> - The PMC should be willing to, ultimately, revoke > committership > > >>>>>>> should > > >>>>>>>>>> trust in any individual committer be discovered to be > misplaced. > > >> I > > >>>>>>>> would > > >>>>>>>>>> expect such a last resort to be exceedingly rare and likely > > never > > >>> to > > >>>>>>>>> happen > > >>>>>>>>>> because of course long before that we would be setting correct > > >>> public > > >>>>>>>>>> examples in the first place and respectfully counseling well > > >>>>>>>> intentioned > > >>>>>>>>>> committers who might stray. > > >>>>>>>>>> > > >>>>>>>>>> Depending on how the conversation proceeds here I would like > to > > >>>>>>> include > > >>>>>>>>>> dev@ in this conversation at the earliest opportunity as > well. > > >>>>>>>>>> > > >>>>>>>>>> Thoughts? > > >>>>>>>>>> > > >>>>>>>>>> -- > > >>>>>>>>>> Best regards, > > >>>>>>>>>> > > >>>>>>>>>> - Andy > > >>>>>>>>>> > > >>>>>>>>>> Problems worthy of attack prove their worth by hitting back. - > > >> Piet > > >>>>>>>> Hein > > >>>>>>>>>> (via Tom White) > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> -- > > >>>>>>>>> Best regards, > > >>>>>>>>> > > >>>>>>>>> - Andy > > >>>>>>>>> > > >>>>>>>>> Problems worthy of attack prove their worth by hitting back. - > > >> Piet > > >>>>>>> Hein > > >>>>>>>>> (via Tom White) > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> -- > > >>>>>>>> Todd Lipcon > > >>>>>>>> Software Engineer, Cloudera > > >>>>>> > > >>>>>> > > >>>>>> > > >>>>>> -- > > >>>>>> Sean > > >>>> > > >>>> > > >>>> > > >>>> -- > > >>>> Thanks, > > >>>> Michael Antonov > > >> > > > > > > -- > // Jonathan Hsieh (shay) > // HBase Tech Lead, Software Engineer, Cloudera > // jon@cloudera.com // @jmhsieh > -- Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) --047d7b3440aa58f0a2051cfccfda--