From dev-return-31190-archive-asf-public=cust-asf.ponee.io@geode.apache.org Tue Jun 4 23:06:45 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id 853D718064D for ; Wed, 5 Jun 2019 01:06:45 +0200 (CEST) Received: (qmail 74779 invoked by uid 500); 4 Jun 2019 23:06:44 -0000 Mailing-List: contact dev-help@geode.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@geode.apache.org Delivered-To: mailing list dev@geode.apache.org Received: (qmail 74765 invoked by uid 99); 4 Jun 2019 23:06:44 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 04 Jun 2019 23:06:44 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id E087A180C4B for ; Tue, 4 Jun 2019 23:06:43 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -0.699 X-Spam-Level: X-Spam-Status: No, score=-0.699 tagged_above=-999 required=6.31 tests=[RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id eaVj03yM3rTK for ; Tue, 4 Jun 2019 23:06:39 +0000 (UTC) Received: from mx0b-00296801.pphosted.com (mx0b-00296801.pphosted.com [148.163.153.148]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id A424A5F19A for ; Tue, 4 Jun 2019 23:06:39 +0000 (UTC) Received: from pps.filterd (m0114584.ppops.net [127.0.0.1]) by mx0b-00296801.pphosted.com (8.16.0.27/8.16.0.27) with SMTP id x54N2Hvt024447 for ; Tue, 4 Jun 2019 23:06:39 GMT Received: from mail-ot1-f71.google.com (mail-ot1-f71.google.com [209.85.210.71]) by mx0b-00296801.pphosted.com with ESMTP id 2suh4gb9er-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 04 Jun 2019 23:06:38 +0000 Received: by mail-ot1-f71.google.com with SMTP id a22so11353890otr.21 for ; Tue, 04 Jun 2019 16:06:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:content-transfer-encoding:mime-version :subject:date:references:to:in-reply-to:message-id; bh=lvq57yxmXe5XUccjybpYprLXIW3GteNz1TvaqHNVjN4=; b=r0nFQe1JuZIwRhEe3UJ6Ifk6k5Fhbg+JgpMqnj2rCAEtxXRdD3krTEfyTdLG22gq/f yFkXW8zfgyUs1OTkmClX/gW9cUuTGlZVzFxm3KSMttYENUBt88f60LktApCc5vdi8jPG MYuajixJ3wAgHAqgjnXohHnLQAOY6Qd8gl8cOmVfz8PDFIgqp+YIZR817B4QQsGtC1XU AcV0NbJH4dpfrpfPxpDmCo2pJUcSbeJ04G92w70CTswqbgvwIU+bo2IBHw68AuKWrFXw wvQemSIuS8s4iji10MW5kxl9XN1ok4XpsUANAmtR4fOhzOvwgwRTJYlJTa2KqaAgVKN4 q8Dg== X-Gm-Message-State: APjAAAXA3O26yAO1bm7cL3kLP22EU+W6AMjEEMaHLFuYeb2dwUAZuyCW OC1kfMILIZIJE8CworpxEd2xiM1+kVTiYvef8f8rsqVHaZ9tfiP0xMi4x88FLodlPBXCk7014We l8/HL5uQX7SlAwwiyT7CjuIGsY23X/QltEz3qq/g= X-Received: by 2002:a9d:27a4:: with SMTP id c33mr6933956otb.113.1559689597374; Tue, 04 Jun 2019 16:06:37 -0700 (PDT) X-Google-Smtp-Source: APXvYqyB6eVllVgwArgHNFnEf2eELycnobd9DbsTn1Fc64jQlhbczWeHK2kfTA7k6maUpZXhz6rvAw== X-Received: by 2002:a9d:27a4:: with SMTP id c33mr6933923otb.113.1559689596932; Tue, 04 Jun 2019 16:06:36 -0700 (PDT) Received: from [10.118.20.37] (50-203-225-134-static.hfc.comcastbusiness.net. [50.203.225.134]) by smtp.gmail.com with ESMTPSA id k83sm6831130oia.10.2019.06.04.16.06.35 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 04 Jun 2019 16:06:36 -0700 (PDT) From: Owen Nichols Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: Re: [DISCUSS] Disable merge for failing pull requests Date: Tue, 4 Jun 2019 16:06:28 -0700 References: <9a01f9ff-5aad-0d34-b822-ae9354ae251e@pivotal.io> <8e63f75b-4798-e7c4-13ce-1b94621a6f03@apache.org> To: dev@geode.apache.org In-Reply-To: Message-Id: X-Mailer: Apple Mail (2.3445.104.11) X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2019-06-04_15:,, signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040145 I=E2=80=99d like to follow up on this discussion from late last year. = Six months ago, Kirk wrote: > After we get it more consistently GREEN, I would be willing to change = my vote to +1. While we=E2=80=99re still not perfect, I have noticed that PR checks go = green much more consistently now than they did six months ago. Also, = Ryan wrote: >>>>>>> I think we'd need clear guidelines on what to do if your PR = fails due to something seemingly unrelated. If you still encounter a flaky failure occasionally, you can either = re-trigger all checks with an empty commit, or just ask on the dev list = and someone will be happy to help you re-trigger only your failed check. The above concerns were commonly cited as reasons for not moving ahead = with the proposal to enable GitHub policy to disable merge button until = checks have passed. Even with these addressed, there is still a = =E2=80=9Cpeople over process=E2=80=9D argument to be made for not = imposing an enforced process (see recent thread that rejected imposing a = requirement of >0 reviews before allowing merge). So, is there any interest at all in tightening GitHub controls? Or = maybe a better way to approach the question is: are we Very Happy with = our current source control practices? -Owen > On Dec 26, 2018, at 4:03 PM, Kirk Lund wrote: >=20 > I'm changing my vote to -1 for disallowing merge until precheck = passes. >=20 > The reason is that it's rare for me to see a 100% clean precheckin for = any > of my PRs. There seems to always be some failure unrelated to my PR. = For > example PR #3042 just hit GEODE-6008. If we make the change to disable = the > merge button, then my PR could potentially be blocked indefinitely. >=20 > After we get it more consistently GREEN, I would be willing to change = my > vote to +1. >=20 > On Fri, Dec 21, 2018 at 10:36 AM Kirk Lund wrote: >=20 >> I was responding to Udo's comment: >>=20 >> "Could one not configure the button that the owner of the PR cannot = merge >> the PR?" >>=20 >> I'm +1 for disallowing merge until precheck passes. >> I'm -1 for disallowing the owner of the PR to merge it. >>=20 >> On Fri, Dec 21, 2018 at 9:28 AM Helena Bales = wrote: >>=20 >>> Kirk, this change would not require you to get someone to merge it. = It >>> would just require that your PR pass CI before it can be merged. >>>=20 >>> On Thu, Dec 20, 2018 at 2:38 PM Kirk Lund wrote: >>>=20 >>>> I have enough trouble just getting other developers to review my = PR. I >>>> don't want to have to struggle to find someone to merge it for me, = too. >>>>=20 >>>> On Mon, Nov 19, 2018 at 4:09 PM Udo Kohlmeyer = wrote: >>>>=20 >>>>> I don't believe "name and shame" is a hammer we should wield, but = if >>> we >>>>> have use it... use it wisely >>>>>=20 >>>>> Could one not configure the button that the owner of the PR cannot >>> merge >>>>> the PR? >>>>>=20 >>>>> --Udo >>>>>=20 >>>>>=20 >>>>> On 11/19/18 16:03, Dan Smith wrote: >>>>>> Closing the loop on this thread - I don't feel like there was = enough >>>>>> agreement to go forwards with disabling the merge button, so I'm >>> going >>>> to >>>>>> drop this for now. >>>>>>=20 >>>>>> I would like to see everyone make sure that they only merge green >>> PRs. >>>>>> Maybe we can go with a name and shame approach? As in, we = shouldn't >>> see >>>>> any >>>>>> new PRs show up in this query: >>>>>>=20 >>>>>>=20 >>>>>=20 >>>>=20 >>> = https://github.com/apache/geode/pulls?utf8=3D%E2%9C%93&q=3Dis%3Apr+is%3Ame= rged+status%3Afailure >>>>>>=20 >>>>>> -Dan >>>>>>=20 >>>>>> On Tue, Nov 13, 2018 at 10:19 AM Ryan McMahon = >>>>> wrote: >>>>>>=20 >>>>>>> +1 I like this idea, but I recognize that it will be a challenge >>> when >>>>> there >>>>>>> is still some flakiness to the pipeline. I think we'd need = clear >>>>>>> guidelines on what to do if your PR fails due to something >>> seemingly >>>>>>> unrelated. For instance, we ran into GEODE-5943 (flaky >>>>> EvictionDUnitTest) >>>>>>> in our last PR, and saw that there was already an open ticket = for >>> the >>>>>>> flakiness (we even reverted our change and reproduced to be >>> sure). So >>>>> we >>>>>>> triggered another PR pipeline and it passed the second time. Is >>>>> rerunning >>>>>>> the pipeline again sufficient in this case? Or should we have >>> stopped >>>>> what >>>>>>> we were doing and took up GEODE-5943, assuming it wasn't = assigned >>> to >>>>>>> someone? If it was already assigned to someone, do we wait = until >>> that >>>>> bug >>>>>>> is fixed before we run through the PR pipeline again? >>>>>>>=20 >>>>>>> I think if anything this will help us treat bugs that are = causing a >>>> red >>>>>>> pipeline as critical, and I think that is a good thing. So I'm >>> still >>>> +1 >>>>>>> despite the flakiness. Just curious what people think about how = we >>>>> should >>>>>>> handle cases where there is a known failure and it is indeed >>> unrelated >>>>> to >>>>>>> our PR. >>>>>>>=20 >>>>>>> Ryan >>>>>>>=20 >>>>>>>=20 >>>>>>> On Mon, Nov 12, 2018 at 2:49 PM Jinmei Liao >>>> wrote: >>>>>>>=20 >>>>>>>> Just to clarify, that flaky EvictionDUnitTest is old flaky. The >>> PR to >>>>>>>> refactor the test passed all checks, even the repeatTest as = well. >>> I >>>>> had a >>>>>>>> closed PR that just touched the "un-refactored" >>> EvictionDUnitTest, it >>>>>>>> wouldn't even pass the repeatTest at all. >>>>>>>>=20 >>>>>>>> On Mon, Nov 12, 2018 at 2:04 PM Dan Smith >>> wrote: >>>>>>>>=20 >>>>>>>>> To be clear, I don't think we have an issue of untrustworthy >>>>> committers >>>>>>>>> pushing code they know is broken to develop. >>>>>>>>>=20 >>>>>>>>> The problem is that it is all to easy to look at a PR with = some >>>>>>> failures >>>>>>>>> and *assume* your code didn't cause the failures and merge it >>>> anyway. >>>>> I >>>>>>>>> think we should all be at least rerunning the tests and not >>> merging >>>>> the >>>>>>>> PR >>>>>>>>> until everything passes. If the merge button is greyed out, = that >>>> might >>>>>>>> help >>>>>>>>> communicate that standard to everyone. >>>>>>>>>=20 >>>>>>>>> Looking at the OpenJDK 8 metrics, it looks to me like most of = the >>>>>>> issues >>>>>>>>> are recently introduced (builds 81-84 and the = EvictionDUnitTest), >>>> not >>>>>>> old >>>>>>>>> flaky tests. So I think we were a little more disciplined = always >>>>>>>> listening >>>>>>>>> to what the checks are telling us we would have less noise in = the >>>> long >>>>>>>> run. >>>>>>>>>=20 >>>>>>>>>=20 >>>>>>>=20 >>>>>=20 >>>>=20 >>> = https://urldefense.proofpoint.com/v2/url?u=3Dhttps-3A__concourse.apachegeo= de-2Dci.info_teams_main_pipelines_apache-2Ddevelop-2Dmetrics_jobs_GeodeDis= tributedTestOpenJDK8Metrics_builds_23&d=3DDwIBaQ&c=3Dlnl9vOaLMzsy2niBC8-h_= K-7QJuNJEsFrzdndhuJ3Sw&r=3Ds8zALi1UpxiUlTfCkFIvTI7Yi4EtlpqAZ68hQ4JDyaI&m=3D= EBW_QlDSlKgshy50KztUi566idyTMguNUkj6wc1jLXo&s=3DtgtdFeHVZtk1hlNfH-VTlrV9-W= kUt_tWv_yx7MjSUdo&e=3D >>>>>>>>> -Dan >>>>>>>>>=20 >>>>>>>>> On Mon, Nov 12, 2018 at 11:21 AM Udo Kohlmeyer = >>>>> wrote: >>>>>>>>>=20 >>>>>>>>>> 0 >>>>>>>>>>=20 >>>>>>>>>> Patrick does make a point. The committers on the project have >>> been >>>>>>>> voted >>>>>>>>>> in as "responsible, trustworthy and best of breed" and rights >>> and >>>>>>>>>> privileges according to those beliefs have been bestowed. >>>>>>>>>>=20 >>>>>>>>>> We should live those words and trust our committers. In the >>> end.. >>>> If >>>>>>>>>> there is a "rotten" apple in the mix this should be = addressed, >>>> maybe >>>>>>>> not >>>>>>>>>> as public flogging, but with stern communications. >>>>>>>>>>=20 >>>>>>>>>> On the other side, I've also seen the model where the = submitter >>> of >>>> PR >>>>>>>> is >>>>>>>>>> not allowed to merge + commit their own PR's. That befalls to >>>> another >>>>>>>>>> committer to complete this task, avoiding the whole, "I'll = just >>>>>>> quickly >>>>>>>>>> commit without due diligence". >>>>>>>>>>=20 >>>>>>>>>> --Udo >>>>>>>>>>=20 >>>>>>>>>>=20 >>>>>>>>>> On 11/12/18 10:23, Patrick Rhomberg wrote: >>>>>>>>>>> -1 >>>>>>>>>>>=20 >>>>>>>>>>> I really don't think this needs to be codified. If people = are >>>>>>>> merging >>>>>>>>>> red >>>>>>>>>>> PRs, that is a failing as a responsible developer. = Defensive >>>>>>>>> programming >>>>>>>>>>> is all well and good, but this seems like it's a bit beyond = the >>>>>>> pale >>>>>>>> in >>>>>>>>>>> that regard. I foresee it making the correction of a red = main >>>>>>>> pipeline >>>>>>>>>>> must more difficult that it needs to be. And while it's = much >>>>>>> better >>>>>>>>> than >>>>>>>>>>> it had been, I am (anecdotally) still seeing plenty of >>> flakiness >>>> in >>>>>>>> my >>>>>>>>>> PRs, >>>>>>>>>>> either resulting from infra failures or test classes that = need >>> to >>>>>>> be >>>>>>>>>>> refactored or reevaluated. >>>>>>>>>>>=20 >>>>>>>>>>> If someone is merging truly broken code that has failed >>>> precheckin, >>>>>>>>> that >>>>>>>>>>> seems to me to be a discussion to have with that person. = >>> If >>>> it >>>>>>>>>>> persists, perhaps a public flogging would be in order. >>> But at >>>>>>>> the >>>>>>>>>> end >>>>>>>>>>> of the day, the onus is on us to be responsible developers, >>> and no >>>>>>>>> amount >>>>>>>>>>> of gatekeeping is going to be a substitute for that. >>>>>>>>>>>=20 >>>>>>>>>>> On Mon, Nov 12, 2018 at 9:38 AM, Galen O'Sullivan < >>>>>>>>> gosullivan@pivotal.io >>>>>>>>>>> wrote: >>>>>>>>>>>=20 >>>>>>>>>>>> I'm in favor of this change, but only if we have a way to >>> restart >>>>>>>>>> failing >>>>>>>>>>>> PR builds without being the PR submitter. Any committer >>> should be >>>>>>>> able >>>>>>>>>> to >>>>>>>>>>>> restart the build. The pipeline is still flaky enough and I >>> want >>>>>>> to >>>>>>>>>> avoid >>>>>>>>>>>> the situation where a new contributor is asked repeatedly = to >>> push >>>>>>>>> empty >>>>>>>>>>>> commits to trigger a PR build (in between actual PR review) >>> and >>>>>>>> their >>>>>>>>> PR >>>>>>>>>>>> gets delayed by days if not weeks. That's a real bad >>> experience >>>>>>> for >>>>>>>>> the >>>>>>>>>>>> people we want to be inviting in. >>>>>>>>>>>>=20 >>>>>>>>>>>> On Mon, Nov 12, 2018 at 9:23 AM Alexander Murmann < >>>>>>>>> amurmann@pivotal.io> >>>>>>>>>>>> wrote: >>>>>>>>>>>>=20 >>>>>>>>>>>>> What's the general consensus on flakiness of the pipeline = for >>>>>>> this >>>>>>>>>>>> purpose? >>>>>>>>>>>>> If there is consensus that it's still too flaky to disable >>> the >>>>>>>> merge >>>>>>>>>>>> button >>>>>>>>>>>>> on failure, we should probably consider doubling down on = that >>>>>>> issue >>>>>>>>>>>> again. >>>>>>>>>>>>> It's hard to tell from just looking at the dev pipeline >>> because >>>>>>> you >>>>>>>>>>>> cannot >>>>>>>>>>>>> see easily what failures were legitimate. >>>>>>>>>>>>>=20 >>>>>>>>>>>>> On Mon, Nov 12, 2018 at 8:47 AM Bruce Schuchardt < >>>>>>>>>> bschuchardt@pivotal.io >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>=20 >>>>>>>>>>>>>> I'm in favor of this. >>>>>>>>>>>>>>=20 >>>>>>>>>>>>>> Several times over the years we've made a big push to get >>>>>>>> precheckin >>>>>>>>>> to >>>>>>>>>>>>>> reliably only to see rapid degradation due to crappy code >>> being >>>>>>>>> pushed >>>>>>>>>>>>>> in the face of precheckin failures. We've just invested >>>> another >>>>>>>>> half >>>>>>>>>>>>>> year doing it again. Are we going to do things = differently >>>> now? >>>>>>>>>>>>>> Disabling the Merge button on test failure might be a = good >>>>>>> start. >>>>>>>>>>>>>> On 11/9/18 12:55 PM, Dan Smith wrote: >>>>>>>>>>>>>>=20 >>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>=20 >>>>>>>>>>>>>>> Kirks emails reminded me - I think we are at the point = now >>>> with >>>>>>>> our >>>>>>>>>>>> PR >>>>>>>>>>>>>>> checks where we should not be merging anything to = develop >>> that >>>>>>>>>>>> doesn't >>>>>>>>>>>>>> pass >>>>>>>>>>>>>>> all of the PR checks. >>>>>>>>>>>>>>>=20 >>>>>>>>>>>>>>> I propose we disable the merge button unless a PR is >>> passing >>>>>>> all >>>>>>>> of >>>>>>>>>>>> the >>>>>>>>>>>>>>> checks. If we are in agreement I'll follow up with infra = to >>>> see >>>>>>>> how >>>>>>>>>>>> to >>>>>>>>>>>>>> make >>>>>>>>>>>>>>> that happen. >>>>>>>>>>>>>>>=20 >>>>>>>>>>>>>>> This would not completely prevent pushing directly to >>> develop >>>>>>>> from >>>>>>>>>>>> the >>>>>>>>>>>>>>> command line, but since most developers seem to be using >>> the >>>>>>>> github >>>>>>>>>>>>> UI, I >>>>>>>>>>>>>>> hope that it will steer people towards getting the PRs >>> passing >>>>>>>>>>>> instead >>>>>>>>>>>>> of >>>>>>>>>>>>>>> using the command line. >>>>>>>>>>>>>>>=20 >>>>>>>>>>>>>>> Thoughts? >>>>>>>>>>>>>>> -Dan >>>>>>>>>>>>>>>=20 >>>>>>>>>>=20 >>>>>>>>=20 >>>>>>>> -- >>>>>>>> Cheers >>>>>>>>=20 >>>>>>>> Jinmei >>>>>>>>=20 >>>>>=20 >>>>>=20 >>>>=20 >>>=20 >>=20