geode-dev mailing list archives

From Nabarun Nag <n...@apache.org>
Subject Re: 2 minute gateway startup time due to GEODE-5591
Date Wed, 05 Sep 2018 16:40:58 GMT
If everyone is okay with it, I will revert that change in develop and then
cherry pick it to release/1.7.0 branch.
Please do comment.

Regards
Nabarun Nag


On Wed, Sep 5, 2018 at 9:30 AM Dan Smith <dsmith@pivotal.io> wrote:

> +1 to yank it and rework the fix.
>
> Gester's change helps, but it just means that you will sometimes randomly
> have a 2 minute delay starting up a gateway receiver. I don't think that is
> a great user experience either.
>
> -Dan
>
> On Wed, Sep 5, 2018 at 8:20 AM, Bruce Schuchardt <bschuchardt@pivotal.io>
> wrote:
>
> > Let's yank it
> >
> >
> >
> > On 9/4/18 5:04 PM, Sean Goller wrote:
> >
> >> If it's to get the release out, I'm fine with reverting. I don't like it,
> >> but I'm not willing to die on that hill. :)
> >>
> >> -S.
> >>
> >> On Tue, Sep 4, 2018 at 4:38 PM Dan Smith <dsmith@pivotal.io> wrote:
> >>
> >>> Splitting this into a separate thread.
> >>>
> >>> I see the issue. The two minute timeout comes from the constructor for
> >>> AcceptorImpl, where it retries the bind for 2 minutes.
> >>>
> >>> That behavior makes sense for CacheServer.start.
> >>>
> >>> But it doesn't make sense for the new logic in GatewayReceiver.start() from
> >>> GEODE-5591. That code is trying to use CacheServer.start to scan for an
> >>> available port, trying each port in a range. That free port finding logic
> >>> really doesn't want to have two minutes of retries for each port. It seems
> >>> like we need to rework the fix for GEODE-5591.
> >>>
> >>> Does it make sense to hold up the release to rework this fix, or should we
> >>> just revert it? Have we switched concourse over to using alpine linux,
> >>> which I think was the original motivation for this fix?
> >>>
> >>> -Dan
> >>>
> >>> On Tue, Sep 4, 2018 at 4:25 PM, Dan Smith <dsmith@pivotal.io> wrote:
> >>>
> >>>> Why is it waiting at all in this case? Where is this 2 minute timeout
> >>>> coming from?
> >>>>
> >>>> -Dan
> >>>>
> >>>> On Tue, Sep 4, 2018 at 4:12 PM, Sai Boorlagadda <sai.boorlagadda@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> So the issue is that it takes longer to start than previous releases?
> >>>>> Also, is this wait time only when using Gfsh to create
> >>>>> gateway-receiver?
> >>>>>
> >>>>> On Tue, Sep 4, 2018 at 4:03 PM Nabarun Nag <nnag@apache.org> wrote:
> >>>>>
> >>>>>> Currently we have a minor issue in the release branch as pointed out by
> >>>>>> Barry O.
> >>>>>> We will wait till a resolution is figured out for this issue.
> >>>>>>
> >>>>>> Steps:
> >>>>>> 1. create locator
> >>>>>> 2. start server --name=server1 --server-port=40404
> >>>>>> 3. start server --name=server2 --server-port=40405
> >>>>>> 4. create gateway-receiver --member=server1
> >>>>>> 5. create gateway-receiver --member=server2 `This gets stuck for 2 minutes`
> >>>>>>
> >>>>>> Is the 2 minute wait time acceptable? Should we document it? When we revert
> >>>>>> GEODE-5591, this issue does not happen.
> >>>>>>
> >>>>>> Regards
> >>>>>> Nabarun Nag
> >>>>>>
> >>>>>>
> >
>
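
For illustration, the port-scan concern Dan raises in the quoted thread can be
sketched in plain Java. This is only a minimal sketch under stated assumptions,
not Geode's AcceptorImpl or GatewayReceiver code: the class PortScanSketch and
the method bindFirstFree are hypothetical names, and it uses
java.net.ServerSocket directly. The point it shows is a single bind attempt per
port, moving on to the next port as soon as a bind fails, instead of retrying
each candidate port for two minutes.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration; none of these names come from the Geode code base.
public class PortScanSketch {

    /**
     * Tries each port in [startPort, endPort] exactly once and returns the
     * first socket that binds. A failed bind is not retried; the scan simply
     * moves on to the next port.
     */
    static ServerSocket bindFirstFree(int startPort, int endPort) throws IOException {
        for (int port = startPort; port <= endPort; port++) {
            ServerSocket socket = new ServerSocket(); // created unbound
            try {
                socket.bind(new InetSocketAddress(port));
                return socket; // caller owns and closes the socket
            } catch (IOException bindFailed) {
                socket.close(); // port is taken; try the next one immediately
            }
        }
        throw new IOException("No free port in range " + startPort + "-" + endPort);
    }

    public static void main(String[] args) throws IOException {
        // Occupy one port so the scan has to skip over it.
        try (ServerSocket occupied = new ServerSocket(0)) {
            int busy = occupied.getLocalPort();
            int end = Math.min(busy + 20, 65535);
            try (ServerSocket found = bindFirstFree(busy, end)) {
                System.out.println("Skipped busy port " + busy
                        + ", bound to " + found.getLocalPort());
            }
        }
    }
}
```

With this shape, the cost of a busy port is one failed bind call, so scanning a
range stays fast even when the first several ports are in use. A long retry
loop still fits the case the thread calls out for CacheServer.start, where a
single specific port was requested.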
