zookeeper-user mailing list archives

From Marshall McMullen <marshall.mcmul...@gmail.com>
Subject One ensemble node shows massive number of 'Outstanding' requests
Date Tue, 17 Feb 2015 18:01:48 GMT
Greetings,

We recently hit an issue that I've never seen before, and I'm hoping to get
some clarity on what may cause it and whether it's a known issue. We had a
five-node ensemble and were unable to connect to one of the ZooKeeper
instances. Trying to connect with zkCli would time out. When I connected via
telnet and issued the srvr four-letter word, I was surprised to see that this
one server reported a massive number of 'Outstanding' requests. I had never
seen that value be anything other than 0 before. The ZK dev guide says:

"outstanding is the number of queued requests, this increases when the
server is under load and is receiving more sustained requests than it can
process, ie the request queue".

I looked at all the ZK servers in my ensemble:

for ip in 101 102 103 104 105; do echo srvr | nc 172.21.20.${ip} 2181 | grep Outstanding; done
Outstanding: 0
Outstanding: 0
Outstanding: 0
Outstanding: 0
Outstanding: 18876
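
For anyone who wants to watch for this condition, here is a minimal sketch
that wraps the same srvr check in reusable shell functions. The function
names (parse_outstanding, check_server, flag_backlog), the 5 second nc
timeout, and the threshold are all hypothetical choices for illustration,
not anything ZooKeeper ships, and the sketch assumes an nc that accepts -w.

```shell
#!/bin/sh
# Hypothetical helpers; names, timeout and threshold are illustrative assumptions.

# Read `srvr` output on stdin and print only the Outstanding count.
parse_outstanding() {
    awk -F': ' '$1 == "Outstanding" { print $2 }'
}

# Query one server and print "<host> <count>".
# Prints "<host> unreachable" when no Outstanding line comes back.
check_server() {
    host=$1
    port=${2:-2181}
    count=$(echo srvr | nc -w 5 "$host" "$port" 2>/dev/null | parse_outstanding)
    echo "$host ${count:-unreachable}"
}

# Read "<host> <count>" lines on stdin and keep only the hosts that are
# unreachable or whose count exceeds the threshold (default 0).
flag_backlog() {
    threshold=${1:-0}
    awk -v t="$threshold" '$2 == "unreachable" || $2 + 0 > t { print }'
}
```

It could be run against the same addresses as above, for example:
for ip in 101 102 103 104 105; do check_server 172.21.20.${ip}; done | flag_backlog 100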

I eventually killed ZK on the affected server, after which everything
corrected itself: Outstanding went to zero and I was able to connect again.

Is this something anyone's familiar with? I have logs if it would be
helpful.

Thanks!
