hbase-user mailing list archives

From Bryan Keller <brya...@gmail.com>
Subject Re: HBaseClient.call() hang
Date Sat, 15 Dec 2012 05:29:07 GMT
Forgot to mention that. It's version 0.92.1 (Cloudera CDH4.1.1), running on CentOS 6 64-bit,
Java 1.6.0_31.

On Dec 14, 2012, at 5:31 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> Hey Bryan, 
> 
> 
> which version of HBase is this?
> 
> -- Lars
> 
> 
> 
> ________________________________
> From: Bryan Keller <bryanck@gmail.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org> 
> Sent: Friday, December 14, 2012 2:59 PM
> Subject: HBaseClient.call() hang
> 
> I have encountered a problem with HBaseClient.call() hanging. This occurs when one of my regionservers goes down while performing a table scan.
> 
> What exacerbates this problem is that the scan I am performing uses filters, and the region size of the table is large (4 GB). Because of this, it can take several minutes for a row to be returned when calling scanner.next(). Apparently there is no keep-alive message being sent back to the scanner while the region server is busy, so I had to increase the hbase.rpc.timeout value to a large number (60 min); otherwise the next() call will time out waiting for the regionserver to send something back.
> 
> The result is that this HBaseClient.call() hang is made much worse, because it won't time out for 60 minutes.
> 
> I have a couple of questions:
> 
> 1. Any thoughts on why the HBaseClient.call() is getting stuck? I noticed that call.wait() is not using any timeout, so it will wait indefinitely until interrupted externally.
> 
> 2. Is there a solution where I do not need to set hbase.rpc.timeout to a very large number? My only thought would be to forgo using filters and do the filtering client side, which seems pretty inefficient.
> 
> Here is a stack dump of the thread that was hung:
> 
> Thread 10609: (state = BLOCKED)
> - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
> - java.lang.Object.wait() @bci=2, line=485 (Interpreted frame)
> - org.apache.hadoop.hbase.ipc.HBaseClient.call(org.apache.hadoop.io.Writable, java.net.InetSocketAddress, java.lang.Class, org.apache.hadoop.hbase.security.User, int) @bci=51, line=904 (Interpreted frame)
> - org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=52, line=150 (Interpreted frame)
> - $Proxy12.next(long, int) @bci=26 (Interpreted frame)
> - org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=72, line=92 (Interpreted frame)
> - org.apache.hadoop.hbase.client.ScannerCallable.call() @bci=1, line=42 (Interpreted frame)
> - org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(org.apache.hadoop.hbase.client.ServerCallable) @bci=36, line=1325 (Interpreted frame)
> - org.apache.hadoop.hbase.client.HTable$ClientScanner.next() @bci=117, line=1299 (Compiled frame)
> - org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue() @bci=41, line=150 (Interpreted frame)
> - org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue() @bci=4, line=142 (Interpreted frame)
> - org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue() @bci=4, line=458 (Interpreted frame)
> - org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue() @bci=4, line=76 (Interpreted frame)
> - org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue() @bci=4, line=85 (Interpreted frame)
> - org.apache.hadoop.mapreduce.Mapper.run(org.apache.hadoop.mapreduce.Mapper$Context) @bci=6, line=139 (Interpreted frame)
> - org.apache.hadoop.mapred.MapTask.runNewMapper(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapreduce.split.JobSplit$TaskSplitIndex, org.apache.hadoop.mapred.TaskUmbilicalProtocol, org.apache.hadoop.mapred.Task$TaskReporter) @bci=201, line=645 (Interpreted frame)
> - org.apache.hadoop.mapred.MapTask.run(org.apache.hadoop.mapred.JobConf, org.apache.hadoop.mapred.TaskUmbilicalProtocol) @bci=100, line=325 (Interpreted frame)
> - org.apache.hadoop.mapred.Child$4.run() @bci=29, line=268 (Interpreted frame)
> - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Interpreted frame)
> - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=396 (Interpreted frame)
> - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1332 (Interpreted frame)
> - org.apache.hadoop.mapred.Child.main(java.lang.String[]) @bci=776, line=262 (Interpreted frame)
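
On question 1: the stack dump shows the thread parked in Object.wait() with no timeout underneath HBaseClient.call(), so nothing wakes it if the response never arrives. To illustrate the difference a bounded wait makes, here is a minimal plain-Java sketch. The names BoundedCallWait, PendingCall, complete() and await() are hypothetical and invented for this example; this is not HBase's internal Call class, and it makes no claim about how any HBase release actually handles this case.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedCallWait {

    // Hypothetical stand-in for an in-flight RPC; NOT HBase's actual Call class.
    static final class PendingCall {
        private boolean done;
        private Object value;

        // Would be invoked by a connection reader thread when a response arrives.
        synchronized void complete(Object v) {
            value = v;
            done = true;
            notifyAll();
        }

        // Unbounded form: blocks until complete() is called or the thread is interrupted.
        synchronized Object await() throws InterruptedException {
            while (!done) {
                wait();
            }
            return value;
        }

        // Bounded form: throws TimeoutException once timeoutMillis has elapsed.
        synchronized Object await(long timeoutMillis)
                throws InterruptedException, TimeoutException {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            while (!done) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw new TimeoutException("no response within " + timeoutMillis + " ms");
                }
                // Round up to at least 1 ms: wait(0) would mean "wait forever".
                wait(TimeUnit.NANOSECONDS.toMillis(remainingNanos) + 1);
            }
            return value;
        }
    }

    public static void main(String[] args) throws Exception {
        // Nobody ever completes this call, mimicking a server that died mid-request.
        PendingCall lost = new PendingCall();
        try {
            lost.await(200);
        } catch (TimeoutException e) {
            System.out.println("gave up: " + e.getMessage());
        }

        // A call that is answered from another thread returns normally.
        final PendingCall answered = new PendingCall();
        new Thread(() -> answered.complete("row")).start();
        System.out.println("got: " + answered.await(5000));
    }
}
```

The loop around wait() guards against spurious wakeups, and recomputing the remaining time on each pass keeps the total wait close to the requested timeout. A bound like this only limits how long the caller blocks; it does not replace the need for the server to report progress on long filtered scans.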

