hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11277) RPCServer threads can wedge under high load
Date Fri, 30 May 2014 18:29:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014065#comment-14014065 ]

Andrew Purtell commented on HBASE-11277:

I'd say the runaway loop is RpcServer.Connection#readAndProcess(). It is constructed with
a top level while (true) loop so if ever we miss all of the coded exit conditions we will
iterate forever.

Line 1488 corresponds to a call to channelRead(). This comes after the comment "We have read a
length and we have read the preamble. It is either the connection header or it is a request."
Given the observed behavior, I think we are falling into the unconditional else clause, "More to
read still; go around again.", and going around, and around, and around.

> RPCServer threads can wedge under high load
> -------------------------------------------
>                 Key: HBASE-11277
>                 URL: https://issues.apache.org/jira/browse/HBASE-11277
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
> This is with 0.98.0 in an insecure setup with 7u55 and 7u60. Under high load, RPCServer
threads can wedge, fail to make progress, and consume 100% CPU time on a core indefinitely.

> Dumping threads, all threads are in BLOCKED or IN_NATIVE state. The IN_NATIVE threads
are mostly in EPollArrayWrapper.epollWait or FileDispatcherImpl.read0. The number of threads
found in FileDispatcherImpl.read0 corresponds to the number of runaway threads expected based
on looking at 'top' output. These look like:
> {noformat}
> Thread 64758: (state = IN_NATIVE)
>  - sun.nio.ch.FileDispatcherImpl.read0(java.io.FileDescriptor, long, int) @bci=0 (Compiled
frame; information may be imprecise)
>  - sun.nio.ch.SocketDispatcher.read(java.io.FileDescriptor, long, int) @bci=4, line=39
(Compiled frame)
>  - sun.nio.ch.IOUtil.readIntoNativeBuffer(java.io.FileDescriptor, java.nio.ByteBuffer,
long, sun.nio.ch.NativeDispatcher) @bci=114, line=223 (Compiled frame)
>  - sun.nio.ch.IOUtil.read(java.io.FileDescriptor, java.nio.ByteBuffer, long, sun.nio.ch.NativeDispatcher)
@bci=48, line=197 (Compiled frame)
>  - sun.nio.ch.SocketChannelImpl.read(java.nio.ByteBuffer) @bci=234, line=379 (Compiled
frame)
>  - org.apache.hadoop.hbase.ipc.RpcServer.channelRead(java.nio.channels.ReadableByteChannel,
java.nio.ByteBuffer) @bci=12, line=2224 (Compiled frame)
>  - org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess() @bci=509, line=1488
(Compiled frame)
>  - org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(java.nio.channels.SelectionKey)
@bci=23, line=790 (Compiled frame)
>  - org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop() @bci=97, line=581
(Compiled frame)
>  - org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run() @bci=1, line=556 (Interpreted
frame)
>  - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker)
@bci=95, line=1145 (Interpreted frame)
>  - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted
frame)
>  - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
> {noformat}
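
The comparison against 'top' described above can be sketched as a small scan over the dump text: collect the IDs of threads whose topmost frame is a given method, then count them. The class and method names (DumpScanSketch, threadsWithTopFrame) are hypothetical. The parser assumes only the "Thread N: (state = ...)" header and "- frame" line shapes visible in the dump above.

```java
import java.util.ArrayList;
import java.util.List;

public class DumpScanSketch {

    /** Returns the IDs of threads whose first (topmost) frame mentions the given method. */
    static List<String> threadsWithTopFrame(String dump, String method) {
        List<String> ids = new ArrayList<>();
        String pendingId = null; // header seen, top frame not yet examined
        for (String raw : dump.split("\n")) {
            // Drop a leading mail-quote marker ("> ") if present.
            String line = raw.replaceFirst("^>\\s*", "").trim();
            if (line.startsWith("Thread ") && line.contains(":")) {
                pendingId = line.substring("Thread ".length(), line.indexOf(':'));
            } else if (pendingId != null && line.startsWith("- ")) {
                if (line.contains(method)) {
                    ids.add(pendingId);
                }
                pendingId = null; // only the topmost frame is of interest
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        String dump =
            "Thread 1: (state = IN_NATIVE)\n"
          + " - sun.nio.ch.FileDispatcherImpl.read0(java.io.FileDescriptor, long, int) @bci=0\n"
          + " - sun.nio.ch.SocketDispatcher.read(java.io.FileDescriptor, long, int) @bci=4\n"
          + "Thread 2: (state = IN_NATIVE)\n"
          + " - sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int) @bci=0\n";
        System.out.println(threadsWithTopFrame(dump, "FileDispatcherImpl.read0"));
    }
}
```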

This message was sent by Atlassian JIRA
