hbase-user mailing list archives

From Walter King <wal...@adroll.com>
Subject Re: IPC Queue Size
Date Fri, 08 Aug 2014 06:32:07 GMT
Yes, sorry, CallQueueTooBigException. But that value never returns to zero,
even when the number of requests goes to zero.  The exception shows up on
any regionserver that stays up long enough, so I have to restart them
periodically.  Also, at that 15:30 time I wasn't seeing the exception, but
it looks like that is one point at which a call didn't properly decrement
the callQueueSize: it was at zero before and has never hit zero again.
Today the minimum is even higher.
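
To make the suspected failure mode concrete, here is a minimal sketch of
byte-based call queue accounting. It is not the actual RpcServer code; the
class and method names are hypothetical, and the limit simply stands in for
"hbase.ipc.server.max.callqueue.size". It only illustrates how a decrement
that is skipped on an error path would leave the counter permanently above
zero, which is what the graph appears to show.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; names do not come from HBase.
final class CallQueueSizeSketch {
    // Stands in for hbase.ipc.server.max.callqueue.size (a byte budget).
    private final long maxQueueBytes;
    // Total bytes of calls that are queued or currently being handled.
    private final AtomicLong queuedBytes = new AtomicLong();

    CallQueueSizeSketch(long maxQueueBytes) {
        this.maxQueueBytes = maxQueueBytes;
    }

    // Admit a call if it fits in the byte budget; otherwise reject it.
    // A real server would answer a rejection with CallQueueTooBigException.
    boolean tryAdmit(long callBytes) {
        if (queuedBytes.get() + callBytes > maxQueueBytes) {
            return false;
        }
        queuedBytes.addAndGet(callBytes);
        return true;
    }

    // Safe variant: the finally block releases the bytes even if the
    // handler throws (for example because the client went away).
    void runCall(long callBytes, Runnable handler) {
        try {
            handler.run();
        } finally {
            queuedBytes.addAndGet(-callBytes);
        }
    }

    // Leaky variant: an exception in the handler skips the decrement,
    // so the counter never returns to zero afterwards.
    void runCallLeaky(long callBytes, Runnable handler) {
        handler.run();
        queuedBytes.addAndGet(-callBytes);
    }

    long queuedBytes() {
        return queuedBytes.get();
    }
}
```

If the real accounting has a path like the leaky variant, every failed call
would raise the floor of the counter a little, and a long-running
regionserver would eventually reject all large puts.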


On Thu, Aug 7, 2014 at 9:14 PM, Qiang Tian <tianq01@gmail.com> wrote:

> bq. "Eventually we ran into ipc queue size full messages being returned to
> clients trying large batch puts, as it approaches a gigabyte."
>
> Do you mean CallQueueTooBigException? It looks like the limit is not on the
> queue length but on the size of the data the client sends, configured by
> "hbase.ipc.server.max.callqueue.size".
>
> I guess when your client got the exception, it closed the connection,
> causing other RPCs on the shared connection to fail.
>
>
> 2014-08-06 22:27:57,253 WARN  [RpcServer.reader=9,port=60020] ipc.RpcServer
> (RpcServer.java:doRead(794)) - RpcServer.listener,port=60020: count of
> bytes read: 0
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> at org.apache.hadoop.hbase.ipc.RpcServer.channelRead(RpcServer.java:2229)
> at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1415)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:790)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:581)
> at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:556)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> 2014-08-06 22:27:57,257 WARN  [RpcServer.handler=18,port=60020]
> ipc.RpcServer (RpcServer.java:processResponse(1041)) -
> RpcServer.respondercallId: 84968 service: ClientService methodName: Multi
> size: 17.7 K connection: 10.248.130.152:49780: output error
> 2014-08-06 22:27:57,258 WARN  [RpcServer.handler=18,port=60020]
> ipc.RpcServer (CallRunner.java:run(135)) - RpcServer.handler=18,port=60020:
> caught a ClosedChannelException, this means that the server was processing
> a request but the client went away. The error message was: null
> 2014-08-06 22:27:57,260 WARN  [RpcServer.handler=61,port=60020]
> ipc.RpcServer (RpcServer.java:processResponse(1041)) -
> RpcServer.respondercallId: 83907 service: ClientService methodName: Multi
> size: 17.1 K connection: 10.248.1.56:53615: output error
> 2014-08-06 22:27:57,263 WARN  [RpcServer.handler=61,port=60020]
> ipc.RpcServer (CallRunner.java:run(135)) - RpcServer.handler=61,port=60020:
> caught a ClosedChannelException, this means that the server was processing
> a request but the client went away. The error message was: null
>
>
>
> On Fri, Aug 8, 2014 at 2:57 AM, Walter King <walter@adroll.com> wrote:
>
> > https://gist.github.com/walterking/4c5c6f5e5e4a4946a656#file-gistfile1-txt
> >
> > http://adroll-test-sandbox.s3.amazonaws.com/regionserver.stdout.log.gz
> >
> > These are logs from that particular server, and the debug dump from now
> > (no restart in between).  The times in the graph are Pacific, so it should
> > be around 2014-08-06 22:25:00.  I do see some exceptions around there.
> >
>
