geode-dev mailing list archives

From Ryan McMahon <rmcma...@pivotal.io>
Subject Re: 2 minute gateway startup time due to GEODE-5591
Date Wed, 05 Sep 2018 17:14:45 GMT
+1 for reverting in both places.

I see that there is already an isGatewayReceiver flag in the AcceptorImpl
constructor.  It's not ideal, but could we use this flag to prevent the 2
minute retry logic from happening if this flag is true?
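
As a rough sketch of what I mean (this is not the actual AcceptorImpl code; the
class name, method names, and the retry constants are hypothetical, and the only
detail taken from this thread is that the bind retry lasts about 2 minutes):

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical illustration only; not the real AcceptorImpl. */
final class BindRetrySketch {
  // Assumed values: the thread only says the retry lasts "2 minutes".
  static final long DEFAULT_RETRY_WINDOW_MS = 120_000L;
  static final long RETRY_INTERVAL_MS = 1_000L;

  /** Gateway receivers get no retry window, so a busy port fails immediately. */
  static long retryWindowMillis(boolean isGatewayReceiver) {
    return isGatewayReceiver ? 0L : DEFAULT_RETRY_WINDOW_MS;
  }

  /** Binds to the port, retrying on BindException until the window expires. */
  static ServerSocket bind(int port, boolean isGatewayReceiver)
      throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + retryWindowMillis(isGatewayReceiver);
    while (true) {
      ServerSocket socket = new ServerSocket();
      try {
        socket.bind(new InetSocketAddress(port));
        return socket;
      } catch (BindException e) {
        socket.close();
        if (System.currentTimeMillis() >= deadline) {
          throw e; // caller (e.g. a port scan) can move on to the next port
        }
        Thread.sleep(RETRY_INTERVAL_MS);
      }
    }
  }
}
```

With something like this, a port scan calling bind(port, true) would see the
BindException right away and could try the next port in the range, while a
regular cache server would keep the existing retry behavior.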

Ryan

On Wed, Sep 5, 2018 at 10:01 AM, Lynn Hughes-Godfrey <
lhughesgodfrey@pivotal.io> wrote:

> +1 for reverting in both places.
>
> On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith <dsmith@pivotal.io> wrote:
>
> > +1 for reverting in both places. The current fix is not better, that's why
> > we are reverting it on the release branch!
> >
> > -Dan
> >
> > On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett <jbarrett@pivotal.io> wrote:
> >
> > > I’m not ok with reverting in develop. Revert in 1.7 and modify in develop.
> > > We shouldn’t go backwards in develop. The current fix is better than the
> > > bug it fixes.
> > >
> > > > On Sep 5, 2018, at 9:40 AM, Nabarun Nag <nnag@apache.org> wrote:
> > > >
> > > > If everyone is okay with it, I will revert that change in develop and then
> > > > cherry pick it to release/1.7.0 branch.
> > > > Please do comment.
> > > >
> > > > Regards
> > > > Nabarun Nag
> > > >
> > > >
> > > >> On Wed, Sep 5, 2018 at 9:30 AM Dan Smith <dsmith@pivotal.io> wrote:
> > > >>
> > > >> +1 to yank it and rework the fix.
> > > >>
> > > >> Gester's change helps, but it just means that you will sometimes randomly
> > > >> have a 2 minute delay starting up a gateway receiver. I don't think that is
> > > >> a great user experience either.
> > > >>
> > > >> -Dan
> > > >>
> > > >> On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <bschuchardt@pivotal.io> wrote:
> > > >>
> > > >>> Let's yank it
> > > >>>
> > > >>>
> > > >>>
> > > >>>> On 9/4/18 5:04 PM, Sean Goller wrote:
> > > >>>>
> > > >>>> If it's to get the release out, I'm fine with reverting. I don't like it,
> > > >>>> but I'm not willing to die on that hill. :)
> > > >>>>
> > > >>>> -S.
> > > >>>>
> > > >>>> On Tue, Sep 4, 2018 at 4:38 PM Dan Smith <dsmith@pivotal.io> wrote:
> > > >>>>
> > > >>>>> Splitting this into a separate thread.
> > > >>>>>
> > > >>>>> I see the issue. The two minute timeout is in the constructor for
> > > >>>>> AcceptorImpl, where it retries to bind for 2 minutes.
> > > >>>>>
> > > >>>>> That behavior makes sense for CacheServer.start.
> > > >>>>>
> > > >>>>> But it doesn't make sense for the new logic in GatewayReceiver.start()
> > > >>>>> from GEODE-5591. That code is trying to use CacheServer.start to scan
> > > >>>>> for an available port, trying each port in a range. That free port
> > > >>>>> finding logic really doesn't want to have two minutes of retries for
> > > >>>>> each port. It seems like we need to rework the fix for GEODE-5591.
> > > >>>>>
> > > >>>>> Does it make sense to hold up the release to rework this fix, or
> > > >>>>> should we just revert it? Have we switched concourse over to using
> > > >>>>> alpine linux, which I think was the original motivation for this fix?
> > > >>>>>
> > > >>>>> -Dan
> > > >>>>>
> > > >>>>> On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith <dsmith@pivotal.io> wrote:
> > > >>>>>
> > > >>>>>> Why is it waiting at all in this case? Where is this 2 minute timeout
> > > >>>>>> coming from?
> > > >>>>>>
> > > >>>>>> -Dan
> > > >>>>>>
> > > >>>>>> On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda <sai.boorlagadda@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>>> So the issue is that it takes longer to start than previous releases?
> > > >>>>>>> Also, is this wait time only when using Gfsh to create
> > > >>>>>>> gateway-receiver?
> > > >>>>>>>
> > > >>>>>>> On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag <nnag@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>>> Currently we have a minor issue in the release branch as pointed out by
> > > >>>>>>>> Barry O.
> > > >>>>>>>> We will wait till a resolution is figured out for this issue.
> > > >>>>>>>>
> > > >>>>>>>> Steps:
> > > >>>>>>>> 1. create locator
> > > >>>>>>>> 2. start server --name=server1 --server-port=40404
> > > >>>>>>>> 3. start server --name=server2 --server-port=40405
> > > >>>>>>>> 4. create gateway-receiver --member=server1
> > > >>>>>>>> 5. create gateway-receiver --member=server2 `This gets stuck for 2 minutes`
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>>> Is the 2 minute wait time acceptable? Should we document it? When we
> > > >>>>>>>> revert GEODE-5591, this issue does not happen.
> > > >>>>>>>>
> > > >>>>>>>> Regards
> > > >>>>>>>> Nabarun Nag
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>
> > > >>
> > >
> >
>
