hbase-user mailing list archives

From Jack Levin <magn...@gmail.com>
Subject Re: REST servers locked up on single RS malfunction.
Date Thu, 21 Apr 2011 07:47:30 GMT
Shouldn't the RS just shut down then?  As it is, it stays half alive and
none of the puts succeed.  Also, the OOME happened right after a
flush/compaction/split, so clearly the RS was busy, and it could just be
a matter of hitting the heap ceiling.

-Jack

On Thu, Apr 21, 2011 at 12:13 AM, Stack <stack@duboce.net> wrote:
> This looks like a bug.  Elsewhere in the RPC you can register a
> handler for OOME explicitly, and we have a callback up into the
> regionserver where we flag the server to abort or stop depending
> on the type of OOME we see.  In this case it looks like on OOME we just
> throw, and then all the executors fill up, so no more executors are
> available to process requests (this is my current assessment -- it
> could be a different one by morning).
>
> The root cause would look to be a big put.  Could that be the case?
>
> On the naming, that looks to be the default naming of executor threads
> done by the hosting ExecutorService.
>
> St.Ack
>
>
> On Wed, Apr 20, 2011 at 10:11 PM, Jack Levin <magnito@gmail.com> wrote:
>> Hello, with HBase 0.89 we see the following: all REST servers get
>> locked up trying to connect to one of our RS servers. The error in the
>> .out file on that region server looks like this:
>>
>> Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: Java
>> heap space
>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:619)
>>
>> Question is, how come the region server did not die after this but
>> just hogged the REST connections?  And what does pool-1-thread-3
>> actually do?
>>
>> -Jack
>>
>
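
To illustrate the two points discussed in the thread (the "pool-N-thread-M" naming and reporting an OOME through a callback rather than only throwing it), here is a minimal sketch in plain java.util.concurrent. It is not HBase code: the class OomeAwareReaderDemo, the AbortHook interface and the guarded() helper are hypothetical names invented for this example, and the OutOfMemoryError is simulated rather than caused by a real big put.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class OomeAwareReaderDemo {

    /** Hypothetical callback standing in for a server-level abort hook. */
    public interface AbortHook {
        void abort(String why, Throwable cause);
    }

    /** Wraps a task so an OutOfMemoryError is reported to the hook before it propagates. */
    public static Runnable guarded(Runnable task, AbortHook hook) {
        return () -> {
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                hook.abort("OOME in " + Thread.currentThread().getName(), e);
                throw e; // still kills this worker thread, but the server now knows
            }
        };
    }

    /** Returns the name the default thread factory gives the first worker of a new pool. */
    public static String defaultWorkerName() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            return pool.submit(whoAmI).get();
        } finally {
            pool.shutdown();
        }
    }

    /** Runs a task that throws a simulated OOME and reports whether the hook fired. */
    public static boolean hookFiresOnSimulatedOome() throws Exception {
        AtomicBoolean aborted = new AtomicBoolean(false);
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            // Assumption: stands in for a request too large for the heap.
            Runnable bigPut = () -> {
                throw new OutOfMemoryError("simulated: Java heap space");
            };
            pool.execute(guarded(bigPut, (why, cause) -> aborted.set(true)));
        } finally {
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return aborted.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(defaultWorkerName());
        System.out.println(hookFiresOnSimulatedOome());
    }
}
```

Because the simulated error is rethrown out of a task started with execute(), the JVM's default uncaught-exception handling prints an "Exception in thread ..." line to stderr with the worker's pool-N-thread-M name, much like the line in the .out file quoted above. In a real server the hook would trigger an abort or stop instead of setting a flag.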
