From: Nabarun Nag
Date: Wed, 5 Sep 2018 10:20:44 -0700
Subject: Re: 2 minute gateway startup time due to GEODE-5591
To: dev@geode.apache.org

GEODE-5591 has been reverted in develop
ref: 901da27f227a8ce2b7d6b681619782a1accd9330

Regards
Nabarun Nag

On Wed, Sep 5, 2018 at 10:14 AM Ryan McMahon wrote:

> +1 for reverting in both places.
>
> I see that there is already an isGatewayReceiver flag in the AcceptorImpl
> constructor. It's not ideal, but could we use this flag to prevent the
> 2 minute retry logic from happening if this flag is true?
>
> Ryan
>
> On Wed, Sep 5, 2018 at 10:01 AM, Lynn Hughes-Godfrey
> <lhughesgodfrey@pivotal.io> wrote:
>
>> +1 for reverting in both places.
>>
>> On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith wrote:
>>
>>> +1 for reverting in both places. The current fix is not better, that's
>>> why we are reverting it on the release branch!
>>>
>>> -Dan
>>>
>>> On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett wrote:
>>>
>>>> I’m not ok with reverting in develop. Revert in 1.7 and modify in
>>>> develop. We shouldn’t go backwards in develop. The current fix is
>>>> better than the bug it fixes.
>>>>
>>>> On Sep 5, 2018, at 9:40 AM, Nabarun Nag wrote:
>>>>
>>>>> If everyone is okay with it, I will revert that change in develop
>>>>> and then cherry pick it to release/1.7.0 branch.
>>>>> Please do comment.
>>>>>
>>>>> Regards
>>>>> Nabarun Nag
>>>>>
>>>>> On Wed, Sep 5, 2018 at 9:30 AM Dan Smith wrote:
>>>>>
>>>>>> +1 to yank it and rework the fix.
>>>>>>
>>>>>> Gester's change helps, but it just means that you will sometimes
>>>>>> randomly have a 2 minute delay starting up a gateway receiver.
>>>>>> I don't think that is a great user experience either.
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt
>>>>>> <bschuchardt@pivotal.io> wrote:
>>>>>>
>>>>>>> Let's yank it
>>>>>>>
>>>>>>> On 9/4/18 5:04 PM, Sean Goller wrote:
>>>>>>>
>>>>>>>> If it's to get the release out, I'm fine with reverting. I don't
>>>>>>>> like it, but I'm not willing to die on that hill. :)
>>>>>>>>
>>>>>>>> -S.
>>>>>>>>
>>>>>>>> On Tue, Sep 4, 2018 at 4:38 PM Dan Smith wrote:
>>>>>>>>
>>>>>>>>> Splitting this into a separate thread.
>>>>>>>>>
>>>>>>>>> I see the issue. The two minute timeout is in the constructor
>>>>>>>>> for AcceptorImpl, where it retries to bind for 2 minutes.
>>>>>>>>>
>>>>>>>>> That behavior makes sense for CacheServer.start.
>>>>>>>>>
>>>>>>>>> But it doesn't make sense for the new logic in
>>>>>>>>> GatewayReceiver.start() from GEODE-5591. That code is trying to
>>>>>>>>> use CacheServer.start to scan for an available port, trying
>>>>>>>>> each port in a range. That free port finding logic really
>>>>>>>>> doesn't want to have two minutes of retries for each port. It
>>>>>>>>> seems like we need to rework the fix for GEODE-5591.
>>>>>>>>>
>>>>>>>>> Does it make sense to hold up the release to rework this fix,
>>>>>>>>> or should we just revert it? Have we switched Concourse over to
>>>>>>>>> using Alpine Linux, which I think was the original motivation
>>>>>>>>> for this fix?
>>>>>>>>>
>>>>>>>>> -Dan
>>>>>>>>>
>>>>>>>>> On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith wrote:
>>>>>>>>>
>>>>>>>>>> Why is it waiting at all in this case? Where is this 2 minute
>>>>>>>>>> timeout coming from?
>>>>>>>>>>
>>>>>>>>>> -Dan
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda
>>>>>>>>>> <sai.boorlagadda@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> So the issue is that it takes longer to start than previous
>>>>>>>>>>> releases? Also, is this wait time only when using Gfsh to
>>>>>>>>>>> create gateway-receiver?
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Currently we have a minor issue in the release branch as
>>>>>>>>>>>> pointed out by Barry O.
>>>>>>>>>>>> We will wait till a resolution is figured out for this
>>>>>>>>>>>> issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Steps:
>>>>>>>>>>>> 1. create locator
>>>>>>>>>>>> 2. start server --name=server1 --server-port=40404
>>>>>>>>>>>> 3. start server --name=server2 --server-port=40405
>>>>>>>>>>>> 4. create gateway-receiver --member=server1
>>>>>>>>>>>> 5. create gateway-receiver --member=server2 `This gets stuck
>>>>>>>>>>>>    for 2 minutes`
>>>>>>>>>>>>
>>>>>>>>>>>> Is the 2 minute wait time acceptable? Should we document it?
>>>>>>>>>>>> When we revert GEODE-5591, this issue does not happen.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Nabarun Nag