geode-dev mailing list archives

From Lynn Hughes-Godfrey <lhughesgodf...@pivotal.io>
Subject Re: 2 minute gateway startup time due to GEODE-5591
Date Wed, 05 Sep 2018 17:01:02 GMT
+1 for reverting in both places.

On Wed, Sep 5, 2018 at 9:50 AM, Dan Smith <dsmith@pivotal.io> wrote:

> +1 for reverting in both places. The current fix is not better, that's why
> we are reverting it on the release branch!
>
> -Dan
>
> On Wed, Sep 5, 2018 at 9:47 AM, Jacob Barrett <jbarrett@pivotal.io> wrote:
>
> > I’m not ok with reverting in develop. Revert in 1.7 and modify in develop.
> > We shouldn’t go backwards in develop. The current fix is better than the
> > bug it fixes.
> >
> > > On Sep 5, 2018, at 9:40 AM, Nabarun Nag <nnag@apache.org> wrote:
> > >
> > > If everyone is okay with it, I will revert that change in develop and
> > > then cherry pick it to release/1.7.0 branch.
> > > Please do comment.
> > >
> > > Regards
> > > Nabarun Nag
> > >
> > >
> > >> On Wed, Sep 5, 2018 at 9:30 AM Dan Smith <dsmith@pivotal.io> wrote:
> > >>
> > >> +1 to yank it and rework the fix.
> > >>
> > >> Gester's change helps, but it just means that you will sometimes
> > >> randomly have a 2 minute delay starting up a gateway receiver. I don't
> > >> think that is a great user experience either.
> > >>
> > >> -Dan
> > >>
> > >> On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <bschuchardt@pivotal.io>
> > >> wrote:
> > >>
> > >>> Let's yank it
> > >>>
> > >>>
> > >>>
> > >>>> On 9/4/18 5:04 PM, Sean Goller wrote:
> > >>>>
> > >>>> If it's to get the release out, I'm fine with reverting. I don't like
> > >>>> it, but I'm not willing to die on that hill. :)
> > >>>>
> > >>>> -S.
> > >>>>
> > >>>> On Tue, Sep 4, 2018 at 4:38 PM Dan Smith <dsmith@pivotal.io> wrote:
> > >>>>
> > >>>>> Splitting this into a separate thread.
> > >>>>>
> > >>>>> I see the issue. The two minute timeout is in the constructor for
> > >>>>> AcceptorImpl, where it retries to bind for 2 minutes.
> > >>>>>
> > >>>>> That behavior makes sense for CacheServer.start.
> > >>>>>
> > >>>>> But it doesn't make sense for the new logic in GatewayReceiver.start()
> > >>>>> from GEODE-5591. That code is trying to use CacheServer.start to scan
> > >>>>> for an available port, trying each port in a range. That free port
> > >>>>> finding logic really doesn't want to have two minutes of retries for
> > >>>>> each port. It seems like we need to rework the fix for GEODE-5591.
> > >>>>>
> > >>>>> Does it make sense to hold up the release to rework this fix, or
> > >>>>> should we just revert it? Have we switched Concourse over to using
> > >>>>> Alpine Linux, which I think was the original motivation for this fix?
> > >>>>>
> > >>>>> -Dan
> > >>>>>
> > >>>>> On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith <dsmith@pivotal.io> wrote:
> > >>>>>
> > >>>>>> Why is it waiting at all in this case? Where is this 2 minute timeout
> > >>>>>> coming from?
> > >>>>>>
> > >>>>>> -Dan
> > >>>>>>
> > >>>>>> On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda
> > >>>>>> <sai.boorlagadda@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> So the issue is that it takes longer to start than previous releases?
> > >>>>>>> Also, is this wait time only when using Gfsh to create
> > >>>>>>> gateway-receiver?
> > >>>>>>>
> > >>>>>>> On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag <nnag@apache.org> wrote:
> > >>>>>>>
> > >>>>>>>> Currently we have a minor issue in the release branch as pointed out
> > >>>>>>>> by Barry O.
> > >>>>>>>> We will wait till a resolution is figured out for this issue.
> > >>>>>>>>
> > >>>>>>>> Steps:
> > >>>>>>>> 1. create locator
> > >>>>>>>> 2. start server --name=server1 --server-port=40404
> > >>>>>>>> 3. start server --name=server2 --server-port=40405
> > >>>>>>>> 4. create gateway-receiver --member=server1
> > >>>>>>>> 5. create gateway-receiver --member=server2 `This gets stuck for 2
> > >>>>>>>> minutes`
> > >>>>>>>>
> > >>>>>>>> Is the 2 minute wait time acceptable? Should we document it? When we
> > >>>>>>>> revert GEODE-5591, this issue does not happen.
> > >>>>>>>>
> > >>>>>>>> Regards
> > >>>>>>>> Nabarun Nag
> > >>>>>>>>
> > >>>>>>>>
> > >>>
> > >>
> >
>
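
For reference, the mismatch Dan describes in the quoted thread (a long bind
retry that suits CacheServer.start but not a scan across a port range) can be
sketched roughly as below. This is an illustrative Python sketch, not Geode
code: the names bind_with_retry and first_free_port, the retry interval, and
the loopback host are all assumptions made for the example.

```python
import socket
import time


def bind_with_retry(port, timeout_s, interval_s=0.1, host="127.0.0.1"):
    """Keep retrying a bind on ONE port until timeout_s elapses.

    Hypothetical stand-in for the retry-on-bind behavior described in the
    thread; reasonable when the caller insists on this specific port.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen()
            return sock
        except OSError:
            sock.close()
            if time.monotonic() >= deadline:
                raise  # give up: the port stayed busy for the whole window
            time.sleep(interval_s)


def first_free_port(start, end, host="127.0.0.1"):
    """Scan [start, end] and return a socket bound to the first free port.

    Each port gets a single attempt (timeout_s=0), so a busy port costs
    almost nothing instead of a full retry window.
    """
    for port in range(start, end + 1):
        try:
            return bind_with_retry(port, timeout_s=0, host=host)
        except OSError:
            continue  # port in use, move on to the next candidate
    raise OSError(f"no free port in {start}-{end}")
```

If the scan instead called bind_with_retry with a two-minute timeout, every
busy port at the front of the range would add up to two minutes before the
next port was tried, which matches the delay seen in step 5 of the
reproduction.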
