zookeeper-user mailing list archives

From Joe Ammann <...@pyx.ch>
Subject MNTR command fails after upgrade to ZK 3.6.1
Date Tue, 08 Sep 2020 09:18:10 GMT
Hi all

We have recently upgraded our DEV installation from ZK 3.5.6 to 3.6.1.
We're using the MNTR command regularly to poll availability and
statistics information from our ZK nodes.

After the upgrade, we see that intermittently (but quite often) the
polling client (a Java program) fails to read the full output of the
MNTR command: it gets a connection reset while reading the MNTR
response [1]. I assume this is at least partly because the number of
metrics reported by MNTR has grown massively in the 3.6 release
(>500 compared to ~25 before).
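
For illustration, a polling client of the kind described above might
look roughly like the sketch below. This is not our actual program: the
class and method names (MntrProbe, sendFourLetterWord, parseMntr), the
default host and port, and the 5 second timeout are all made up for the
example. It sends the command, reads until the server closes the
connection, and splits the tab-separated "key<TAB>value" lines.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ZooKeeper or of our real poller.
class MntrProbe {

    // Sends a four-letter command (e.g. "mntr") and reads until EOF.
    public static String sendFourLetterWord(String host, int port,
                                            String cmd, int timeoutMs)
            throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs); // bound each blocking read

            OutputStream out = socket.getOutputStream();
            out.write(cmd.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // The server closes the connection after the response,
            // so read() returning -1 marks the end of the output.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Splits "key<TAB>value" lines into an insertion-ordered map;
    // lines without a tab are ignored.
    public static Map<String, String> parseMntr(String response) {
        Map<String, String> metrics = new LinkedHashMap<>();
        for (String line : response.split("\n")) {
            int tab = line.indexOf('\t');
            if (tab > 0) {
                metrics.put(line.substring(0, tab),
                            line.substring(tab + 1).trim());
            }
        }
        return metrics;
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost"; // assumed
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        String raw = sendFourLetterWord(host, port, "mntr", 5000);
        parseMntr(raw).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The read loop in sendFourLetterWord is where a connection reset would
surface as a SocketException if the RST arrives before the whole
response has been read.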

In a network trace taken on the ZooKeeper node, I can see that the
server PUSHes out the full response and sends a FIN, while the client
seems to be a bit slow with its ACKs. Almost immediately after the FIN,
the ZK node sends out a TCP RST packet [2], which then leads to the
exception on the client side.

My question is: Why is the ZK server immediately sending out a RST? Am I
doing something wrong on the client side? When using MNTR from Netcat or
similar tools, I see the full response. However, other tools (like
zktop.py) also have intermittent issues with the enhanced MNTR command.

CU, Joe


[1]
java.net.SocketException: Connection reset
      at java.net.SocketInputStream.read(SocketInputStream.java:209)
      at java.net.SocketInputStream.read(SocketInputStream.java:141)
      at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
      at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
      at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
      at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
      at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
      at java.io.InputStreamReader.read(InputStreamReader.java:184)
      at java.io.Reader.read(Reader.java:140)
      at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2001)
      at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1980)
      at org.apache.commons.io.IOUtils.copy(IOUtils.java:1957)
      at org.apache.commons.io.IOUtils.copy(IOUtils.java:1907)

