zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Zk OOM in Critical Thread
Date Tue, 19 May 2015 16:38:54 GMT
Hi Austin, thanks for the report. Take a look at this thread if you haven't
already:
http://markmail.org/message/pmiifnqgjozmbhkm
There are a few candidate backports that we're looking at (fixed in 3.5,
and under consideration for backporting to 3.4). There are also some
suggestions in that thread on how to mitigate.

Would be great if you could help test out the release given you've seen
this issue.



On Tue, May 19, 2015 at 8:10 AM, Miller, Austin <
Austin.Miller@morganstanley.com> wrote:

> Hi all,
> We had an event in our prod cluster where an OOM caused a leader node to
> effectively become corrupted while the rest of the ensemble thought it was
> healthy.  This permanently degraded the ensemble to read-only service on
> existing sessions until a human intervened.
> Exceptions in Critical Threads
> ============
> As a tactical step, we've added an OOMHandler to bounce the node.
> However, we're cognizant of the fact that other exceptions in this space
> can cause this issue again.  There is also an interesting interaction with
> J8 which I will get to shortly.
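
For readers wondering what such a handler might look like, here is a minimal sketch. It is not the handler Austin describes and not ZooKeeper code: the class name OomBounceHandler and the injectable exit hook are hypothetical, and it assumes a supervisor (init script, monit, or similar) restarts the process once it exits. Note that the log in this report suggests ZooKeeper 3.4.6 installs its own default uncaught-exception handler, so whichever handler is installed last wins.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch: stop the JVM when an OutOfMemoryError kills a thread,
// so an external supervisor can restart ("bounce") the node.
final class OomBounceHandler implements Thread.UncaughtExceptionHandler {
    static final int EXIT_CODE = 7; // arbitrary value chosen for this sketch

    // The exit action is injected so the policy can be exercised in a test
    // without actually stopping the JVM.
    private final IntConsumer exit;

    OomBounceHandler(IntConsumer exit) {
        this.exit = exit;
    }

    // True if the throwable or any of its causes is an OutOfMemoryError.
    // The depth bound guards against pathological cause chains.
    static boolean isOutOfMemory(Throwable t) {
        int depth = 0;
        for (Throwable c = t; c != null && depth < 32; c = c.getCause(), depth++) {
            if (c instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable t) {
        if (isOutOfMemory(t)) {
            exit.accept(EXIT_CODE);
        }
    }

    // Installs the handler process-wide. halt() skips shutdown hooks, which
    // may themselves fail or hang when memory is exhausted.
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler(
                new OomBounceHandler(Runtime.getRuntime()::halt));
    }
}
```

As the original message points out, this only covers OOM; any other exception that kills a critical thread would pass through this handler without a bounce.
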
> The article at
> http://arstechnica.com/information-technology/2015/05/the-discovery-of-apache-zookeepers-poison-packet/
> (specifically bug #1) seems to describe this issue.  I haven't gone through
> the server code extensively in some time, but will again shortly.  Is this
> seen as an issue by the zookeeper dev community, and are there plans to
> respond?
> OS: linux 64 bit
> Zk: 3.4.6
> jre: 1.8.31
> 2015-05-10 19:11:49,882 - ERROR [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2281:NIOServerCnxnFactory$1@44] - Thread Thread[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2281,5,main] died
> java.lang.OutOfMemoryError: Compressed class space
>         at java.lang.ClassLoader.defineClass1(Native Method)
>         at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>         at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>         at java.net.URLClassLoader.defineClass(URLClassLoader.java:455)
>         at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:367)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.makeLeader(QuorumPeer.java:605)
>         at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:798)
> Zookeeper and J8
> ============
> While all this was occurring, the compressed class space (CCS) in J8
> filled up.  This space is 1G by default, so it is surprising for it to
> fill up.  Maybe it was somehow due to lots of connections occurring.  The
> full CCS caused the OOM, which in turn caused the error in the leader
> thread.  I can't imagine what the ZK server could be doing to legitimately
> fill this space without instrumentation being involved somehow.  Or maybe
> J8 has a bug.  Any ideas on this would be appreciated.
> Austin
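
One way to gather evidence on the class-space question is to sample the JVM's own memory pool statistics over time. The sketch below is illustrative only: ClassSpaceProbe is a hypothetical name, and the pool names "Compressed Class Space" and "Metaspace" are the ones HotSpot reports on Java 8, so other JVMs may not expose them. It reports on the JVM it runs in, so to observe a ZooKeeper server it would need to run inside that process, or the same MXBeans would need to be read over remote JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical probe: prints usage of the class-metadata memory pools and the
// number of loaded classes for the JVM it runs in.
final class ClassSpaceProbe {

    // Returns a one-line usage summary for the named pool, or a note that the
    // pool is not reported by this JVM.
    static String describe(String poolName) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().equals(poolName)) {
                MemoryUsage u = pool.getUsage();
                if (u == null) {
                    return poolName + ": pool is no longer valid";
                }
                // getMax() is -1 when the maximum is undefined.
                return poolName + ": used=" + u.getUsed()
                        + " committed=" + u.getCommitted()
                        + " max=" + u.getMax();
            }
        }
        return poolName + ": not reported by this JVM";
    }

    public static void main(String[] args) {
        System.out.println(describe("Compressed Class Space"));
        System.out.println(describe("Metaspace"));
        System.out.println("loaded classes: "
                + ManagementFactory.getClassLoadingMXBean().getLoadedClassCount());
    }
}
```

A loaded-class count that keeps climbing alongside class-space usage would point toward something generating or reloading classes, which is consistent with the instrumentation theory above but does not prove it.
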
