ace-users mailing list archives

From "Robert M. Mather" <robert.mather....@gmail.com>
Subject Re: ACE server was unavailable, application was impacted strangely
Date Mon, 29 Jun 2015 16:56:27 GMT
On Mon, Jun 29, 2015 at 3:58 AM, Jan Willem Janssen <
janwillem.janssen@luminis.eu> wrote:

> Hi Robert,
>
> > On 29 Jun 2015, at 11:49, Robert M. Mather <robert.mather.rmm@gmail.com> wrote:
> >
> > Our hosting service where the ACE server runs was down for maintenance, so
> > our clients couldn't contact the ACE server for an extended period of time.
> > I've logged in to a few client sites (we have around 100) and I'm seeing
> > that the agents eventually blacklisted the server IP and stopped checking
> > there for updates, even though that's the only one we have. Once the server
> > came back online, they still didn't resume synchronizing with it. Is this
> > the correct behavior? Shouldn't the agent detect when the server is back
> > online and connect to it again?
> >
> > I now see the "agent.discovery.checking" option, which I guess we should
> > set to false in the future.
>
> IMO, this is a bug: it makes no sense to blacklist a server when there is only
> one the agent can talk to. Could you raise an issue for this on JIRA?
>

Sure, I'll file an issue. Until the bug is fixed, is there some way I can
prevent issues from occurring in the future if the ACE server becomes
unavailable again? Would setting "agent.discovery.checking=false" prevent
the blacklisting?

> (The idea of blacklisting is to create a crude form of failover: suppose you’ve
> multiple ACE servers up and running, a client could try each one of them in
> case one of them is not accessible.)
>
> > The troubling part is that we have a DS component running in our client
> > application that pings the server periodically, but the clients all stopped
> > pinging after the server outage. In every log I checked, the pings stopped
> > immediately after the ACE agent blacklisted the server IP. The ping is just
> > a task running under the standard Java ScheduledExecutorService that POSTs
> > to our server every few minutes using the Apache HttpClient. Is it possible
> > that the ACE agent could interfere with that somehow? The service running
> > the ping task doesn't log that it got stopped or failed in any way. Other
> > services on the client are working normally.
>
> How does your job obtain the server IP? Through the DiscoveryHandler of the
> agent itself? If so, then this might be the culprit as it no longer returns
> the IP of the server since it is blacklisted, and there are no alternative
> server IPs to return...
>

It's completely independent of the agent service, and I can't think of any
reason why this would happen without knowing more about the internals of
the agent.
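
One thing that may be worth ruling out on the client side, independent of
the agent: a periodic task scheduled with ScheduledExecutorService is
silently cancelled the first time a run throws. The exception is stored in
the returned ScheduledFuture and is not logged, which would match a ping
that stops without any trace. Whether that is what happened here is only a
guess. The sketch below uses an always-failing stand-in for the HTTP POST;
the class name GuardedPing and the helpers guarded() and demo() are
hypothetical names for illustration, not part of our application or of ACE.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedPing {

    /** Wraps a task so that nothing it throws can reach the scheduler. */
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (Throwable t) {
                // Log and carry on; the next scheduled run still happens.
                System.err.println("ping failed, retrying on next run: " + t);
            }
        };
    }

    /**
     * Schedules the same always-failing task twice, once bare and once
     * guarded, and returns {bareRuns, guardedRuns} after waitMs.
     */
    static int[] demo(long periodMs, long waitMs) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        AtomicInteger bareRuns = new AtomicInteger();
        AtomicInteger guardedRuns = new AtomicInteger();

        // Stand-ins for the HTTP POST: count the attempt, then fail.
        Runnable bare = () -> {
            bareRuns.incrementAndGet();
            throw new IllegalStateException("server unreachable");
        };
        Runnable inner = () -> {
            guardedRuns.incrementAndGet();
            throw new IllegalStateException("server unreachable");
        };

        scheduler.scheduleAtFixedRate(bare, 0, periodMs, TimeUnit.MILLISECONDS);
        scheduler.scheduleAtFixedRate(guarded(inner), 0, periodMs, TimeUnit.MILLISECONDS);

        try {
            Thread.sleep(waitMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        scheduler.shutdownNow();
        return new int[] { bareRuns.get(), guardedRuns.get() };
    }

    public static void main(String[] args) {
        int[] runs = demo(50, 500);
        System.out.println("bare task ran " + runs[0] + " time(s)");
        System.out.println("guarded task ran " + runs[1] + " time(s)");
    }
}
```

The bare task runs exactly once and is then dropped by the scheduler, while
the guarded one keeps firing on every period. Wrapping the real ping in
something like guarded() would at least put the failure in the log.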

>
> HtH,
>
> --
> Met vriendelijke groeten | Kind regards
>
> Jan Willem Janssen | Software Architect
> +31 631 765 814
>
> My world is revolving around INAETICS and Amdatu
>
> Luminis Technologies B.V.
> Churchillplein 1
> 7314 BZ   Apeldoorn
> +31 88 586 46 00
>
> http://www.luminis-technologies.com
> http://www.luminis.eu
>
> KvK (CoC) 09 16 28 93
> BTW (VAT) NL8169.78.566.B.01
>
>
