hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: HBaseClient.call() hang
Date Sat, 15 Dec 2012 01:31:02 GMT
Hey Bryan, 

Which version of HBase is this?

-- Lars

 From: Bryan Keller <bryanck@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org> 
Sent: Friday, December 14, 2012 2:59 PM
Subject: HBaseClient.call() hang
I have encountered a problem with HBaseClient.call() hanging. This occurs when one of my regionservers
goes down while performing a table scan.

What exacerbates this problem is that the scan I am performing uses filters, and the region
size of the table is large (4 GB). Because of this, it can take several minutes for a row to
be returned when calling scanner.next(). Apparently no keep-alive message is sent
back to the scanner while the region server is busy, so I had to increase the hbase.rpc.timeout
value to a large number (60 min); otherwise the next() call times out waiting for the regionserver
to send something back.

The result is that this HBaseClient.call() hang is made much worse, because it won't time
out for 60 minutes.

I have a couple of questions:

1. Any thoughts on why HBaseClient.call() is getting stuck? I noticed that call.wait()
does not use a timeout, so it will wait indefinitely until interrupted externally.

2. Is there a solution where I do not need to set hbase.rpc.timeout to a very large number?
My only thought would be to forgo using filters and do the filtering client side, which seems
pretty inefficient.
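
To illustrate the first question, here is a minimal sketch of the difference between an untimed wait() and a wait bounded by a deadline. It is not the HBaseClient code: PendingCall, complete() and waitForCompletion() are hypothetical names invented for this example, and only java.lang.Object.wait(long) and notifyAll() are real JDK calls.

```java
import java.util.concurrent.TimeUnit;

public class TimedCallWait {

    /** Hypothetical stand-in for a pending RPC; not the HBase Call class. */
    static final class PendingCall {
        private boolean done;

        /** Marks the call finished and wakes up any waiting thread. */
        synchronized void complete() {
            done = true;
            notifyAll();
        }

        /**
         * Waits until the call completes or the timeout elapses.
         * Returns true if completed, false on timeout or interruption.
         */
        synchronized boolean waitForCompletion(long timeoutMillis) {
            final long deadline =
                System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            while (!done) {
                long remainingMillis =
                    TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMillis <= 0) {
                    // wait(0) would block forever, so give up here instead.
                    return false;
                }
                try {
                    // Bounded wait; the loop re-checks the flag after spurious wakeups.
                    wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        PendingCall call = new PendingCall();
        System.out.println(call.waitForCompletion(100)); // prints "false": nobody completed it
        call.complete();
        System.out.println(call.waitForCompletion(100)); // prints "true": already done
    }
}
```

With an untimed wait(), the first call in main() would never return, which matches the behavior seen in the stack dump when the server never answers.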

Here is a stack dump of the thread that was hung:

Thread 10609: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Interpreted frame)
- java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
- org.apache.hadoop.hbase.ipc.HBaseClient.call(org.apache.hadoop.io.Writable, java.net.InetSocketAddress, java.lang.Class, org.apache.hadoop.hbase.security.User, int) @bci=51, line=904 (Interpreted
- org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=52, line=150 (Interpreted frame)
- $Proxy12.next(long, int) @bci=26 (Interpreted frame)
- org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=72, line=92 (Interpreted frame)
- org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=1, line=42 (Interpreted frame)
- org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(org.apache.hadoop.hbase.client.ServerCallable) @bci=36, line=1325 (Interpreted frame)
- org.apache.hadoop.hbase.client.HTable$ClientScanner.next() @bci=117, line=1299 (Compiled
- org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue() @bci=41, line=150 (Interpreted frame)
- org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue() @bci=4, line=142 (Interpreted
- org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue() @bci=4, line=458 (Interpreted frame)
- org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue() @bci=4, line=76 (Interpreted
- org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue() @bci=4, line=85 (Interpreted frame)
- org.apache.hadoop.mapreduce.Mapper.run(org.apache.hadoop.mapreduce.Mapper$Context) @bci=6, line=139 (Interpreted frame)
- org.apache.hadoop.mapred.MapTask.runNewMapper(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapreduce.split.JobSplit$TaskSplitIndex, org.apache.hadoop.mapred.TaskUmbilicalProtocol, org.apache.hadoop.mapred.Task$TaskReporter) @bci=201, line=645 (Interpreted frame)
- org.apache.hadoop.mapred.MapTask.run(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=100, line=325 (Interpreted frame)
- org.apache.hadoop.mapred.Child$4.run() @bci=29, line=268 (Interpreted frame)
- java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
- javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=396 (Interpreted frame)
- org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1332 (Interpreted frame)
- org.apache.hadoop.mapred.Child.main(java.lang.String[]) @bci=776, line=262 (Interpreted