ace-users mailing list archives

From: "Robert M. Mather" <robert.mather....@gmail.com>
Subject: ACE server was unavailable, application was impacted strangely
Date: Mon, 29 Jun 2015 09:49:52 GMT

The hosting service that runs our ACE server was down for maintenance, so
our clients couldn't contact the ACE server for an extended period of time.
I've logged in to a few client sites (we have around 100) and I'm seeing
that the agents eventually blacklisted the server IP and stopped checking
it for updates, even though it's the only server we have. Once the server
came back online, they still didn't resume synchronizing with it. Is this
the correct behavior? Shouldn't the agent detect when the server is back
online and connect to it again?

I now see the "agent.discovery.checking" option, which I guess we should
set to false in the future.

The troubling part is that we have a DS component running in our client
application that pings the server periodically, but the clients all stopped
pinging after the server outage. In every log I checked, the pings stopped
immediately after the ACE agent blacklisted the server IP. The ping is just
a task running under the standard Java ScheduledExecutorService that POSTs
to our server every few minutes using the Apache HttpClient. Is it possible
that the ACE agent could interfere with that somehow? The service running
the ping task doesn't log that it got stopped or failed in any way. Other
services on the client are working normally.
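
One documented behavior of ScheduledExecutorService may be relevant here,
independent of ACE: if a task scheduled with scheduleAtFixedRate or
scheduleWithFixedDelay throws an exception, all subsequent executions are
suppressed, and nothing is logged unless the returned future is inspected.
The sketch below is not the actual ping code. The class and method names
are hypothetical, and the simulated failure stands in for whatever unchecked
exception an HTTP POST might raise while the server is unreachable. It only
illustrates the silent-stop behavior and one way to guard against it.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PingTaskSketch {

    /** Wraps a task so exceptions are logged instead of reaching the scheduler. */
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (Exception e) {
                // Swallow and log; the next scheduled run will still happen.
                System.err.println("ping failed, retrying on next tick: " + e);
            }
        };
    }

    /**
     * Schedules an always-failing task every periodMillis, waits runMillis,
     * then returns how many times the task was actually started.
     */
    static int countRuns(boolean guard, long periodMillis, long runMillis) {
        AtomicInteger runs = new AtomicInteger();
        // Hypothetical stand-in for the real POST: counts the attempt, then fails.
        Runnable failingPing = () -> {
            runs.incrementAndGet();
            throw new IllegalStateException("simulated: server unreachable");
        };
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        try {
            scheduler.scheduleAtFixedRate(
                    guard ? guarded(failingPing) : failingPing,
                    0, periodMillis, TimeUnit.MILLISECONDS);
            Thread.sleep(runMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded runs: " + countRuns(false, 20, 300));
        System.out.println("guarded runs:   " + countRuns(true, 20, 300));
    }
}
```

In the unguarded case the task is started once and never again, which would
match pings stopping with no log entry. Whether that is what happened on
these clients is an assumption; calling get() on the ScheduledFuture returned
by the scheduler would surface the original exception if so.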

After restarting a few of the client processes, they all reconnected to
ACE and started pinging normally again.
