zookeeper-user mailing list archives

From Flavio Junqueira <fpjunque...@yahoo.com.INVALID>
Subject Re: One ensemble node shows massive number of 'Outstanding' requests
Date Tue, 17 Feb 2015 21:38:24 GMT
It doesn't ring a bell, but it might be worth having a look at the logs to see if there is
anything unusual. 

Just to clarify, was the number of outstanding requests growing or constant? I suppose the server
was still following (or leading) and operations were going through; otherwise it would have dropped
its connection to the leader, or given up leadership.
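
If it helps to answer the growing-versus-constant question, a rough sketch along these lines could sample the counter a few times on the affected server. It reuses the `echo srvr | nc` approach from the original report; the function names (`parse_outstanding`, `trend`, `sample_outstanding`) and the default sample count and interval are made up for illustration, and `trend` only compares the first and last sample, so treat it as a starting point rather than a proper diagnostic.

```shell
#!/bin/sh
# Sketch only: helper names and defaults are hypothetical, not part of ZooKeeper.

# Read `srvr` output on stdin and print the value of the "Outstanding:" line.
parse_outstanding() {
    awk -F': *' '/^Outstanding:/ { print $2 }'
}

# Given one or more samples (oldest first), report how the last compares
# with the first: growing, shrinking, or constant.
trend() {
    first=$1
    for last in "$@"; do :; done   # leaves $last set to the final argument
    if [ "$last" -gt "$first" ]; then
        echo growing
    elif [ "$last" -lt "$first" ]; then
        echo shrinking
    else
        echo constant
    fi
}

# Print COUNT samples of Outstanding from HOST:PORT, INTERVAL seconds apart.
# Assumes nc is installed and the server answers the srvr four letter word.
sample_outstanding() {
    host=$1
    port=${2:-2181}
    count=${3:-5}
    interval=${4:-2}
    i=0
    while [ "$i" -lt "$count" ]; do
        echo srvr | nc "$host" "$port" | parse_outstanding
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `trend $(sample_outstanding 172.21.20.105 2181 5 2)` would take five samples two seconds apart from the server that reported the large count and print one word describing the direction.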

-Flavio 

> On 17 Feb 2015, at 18:01, Marshall McMullen <marshall.mcmullen@gmail.com> wrote:
> 
> Greetings,
> 
> We saw an issue recently that I've never seen before and am hoping I can
> get some clarity on what may cause this and whether it's a known issue. We
> had a 5 node ensemble and were unable to connect to one of the ZooKeeper
> instances. When trying to connect with zkCli it would time out. When I
> connected via telnet and issued the srvr four letter word, I was surprised
> to see that this one server reported a massive number of 'Outstanding'
> requests. I'd never seen that really be anything other than 0 before. On
> the ZK dev guide it says:
> 
> "outstanding is the number of queued requests, this increases when the
> server is under load and is receiving more sustained requests than it can
> process, ie the request queue". I looked at all the ZK servers in my
> ensemble:
> 
> for ip in 101 102 103 104 105; do echo srvr | nc 172.21.20.${ip} 2181 |
> grep Outstanding; done
> Outstanding: 0
> Outstanding: 0
> Outstanding: 0
> Outstanding: 0
> Outstanding: 18876
> 
> I eventually killed ZK on the affected server and everything corrected
> itself and Outstanding went to zero and I was able to connect again.
> 
> Is this something anyone's familiar with? I have logs if it would be
> helpful.
> 
> Thanks!

