hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: client timeout
Date Thu, 04 Dec 2014 23:41:48 GMT
Only on that one region server? Weird. Does this persist when you bounce it?

From: Ted Tuttle <ted@mentacapital.com>
To: lars hofhansl <larsh@apache.org>; "user@hbase.apache.org" <user@hbase.apache.org>
Cc: Development <Development@mentacapital.com>
Sent: Wednesday, December 3, 2014 1:21 PM
Subject: RE: client timeout

Still on v0.94.16

We are seeing loads of these:

2014-12-03 12:28:32,696 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call multi(org.apache.hadoop.hbase.client.MultiAction@55428f05),
rpc version=1, client version=29, methodsFingerPrint=-540141542 from <ip>:<port> after
131914 ms, since caller disconnected
        at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3944)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3854)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3835)
        at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3878)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4804)
        at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4777)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2194)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3754)
        at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)


From: lars hofhansl [mailto:larsh@apache.org]
Sent: Wednesday, December 03, 2014 11:31 AM
To: user@hbase.apache.org
Cc: Development
Subject: Re: client timeout

Bad disk or network?

Anything in the logs (HBase, HDFS, and System logs)?

HBase 0.94, still?
The easiest way is to just kill the region server; the others will pick up the regions.

-- Lars

________________________________
From: Ted Tuttle <ted@mentacapital.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Cc: Development <Development@mentacapital.com>
Sent: Wednesday, December 3, 2014 7:13 AM
Subject: client timeout

Hello-

We are seeing recurring timeouts in communications with one of our RSs.  The error we see in
our logs is:

Caused by: java.net.SocketTimeoutException: Call to <rs host>./<rs ip>:<port> failed on socket timeout exception: java.net.SocketTimeoutException: 120000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/<client ip>:<port> remote=<rs host>./<rs ip>:<port>]
        at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1043)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1016)
        at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:87)
        at com.sun.proxy.$Proxy9.multi(Unknown Source)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1537)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1535)
        at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:229)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1544)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1532)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

Any ideas on what could be wrong w/ this RS?  The RS is not unusually busy.

Thanks,
Ted
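
The two log excerpts in this thread can be lined up numerically: the client reports a 120000 ms socket timeout, and the region server reports aborting a multi call after 131914 ms because the caller had disconnected. The sketch below is not HBase code. It is a small, hypothetical helper (the class name TimeoutCompare and its methods are made up for illustration) that pulls both numbers out of such log lines and prints the difference. Reading this as "the server was still working on the call when the client gave up" is an interpretation of the logs, not something confirmed in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of HBase.
public class TimeoutCompare {
    // Server side: "... after 131914 ms, since caller disconnected"
    private static final Pattern SERVER_ELAPSED =
        Pattern.compile("after\\s+(\\d+)\\s+ms, since caller disconnected");
    // Client side: "120000 millis timeout while waiting for channel ..."
    private static final Pattern CLIENT_TIMEOUT =
        Pattern.compile("(\\d+) millis timeout while waiting");

    // Returns the elapsed time the region server logged, or -1 if absent.
    static long serverElapsedMs(String logText) {
        return firstGroupAsLong(SERVER_ELAPSED, logText);
    }

    // Returns the socket timeout the client logged, or -1 if absent.
    static long clientTimeoutMs(String logText) {
        return firstGroupAsLong(CLIENT_TIMEOUT, logText);
    }

    private static long firstGroupAsLong(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    public static void main(String[] args) {
        // Fragments copied from the two log excerpts quoted in this thread.
        String serverLog = "from <ip>:<port> after\n131914 ms, since caller disconnected";
        String clientLog = "java.net.SocketTimeoutException: 120000 millis timeout "
            + "while waiting for channel to be ready for read.";

        long elapsed = serverElapsedMs(serverLog);
        long timeout = clientTimeoutMs(clientLog);
        System.out.println("server elapsed ms: " + elapsed);
        System.out.println("client timeout ms: " + timeout);
        System.out.println("overrun ms:        " + (elapsed - timeout));
    }
}
```

If the overrun is consistently positive for calls against a single region server, that fits Lars's questions about a slow disk or network on that host and points at the HDFS and system logs as the next place to look.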



  