hbase-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7711) rowlock release problem with thread interruptions in batchMutate
Date Tue, 29 Jan 2013 19:53:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565721#comment-13565721 ]

Jonathan Hsieh commented on HBASE-7711:
---------------------------------------

Here are some other places worth looking at:

{code}
diff --git a/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java b/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
index 4ccd4d2..c198069 100644
--- a/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -2212,6 +2212,7 @@ public class HRegion implements HeapSize { // , Writable{
       }
 
       this.updatesLock.readLock().lock();
+      // TODO where is the try-finally?
       locked = true;
 
       //
@@ -2334,6 +2335,7 @@ public class HRegion implements HeapSize { // , Writable{
         this.updatesLock.readLock().unlock();
       }
 
+      // TODO why isn't this in a finally?
       if (acquiredLocks != null) {
         for (Integer toRelease : acquiredLocks) {
           releaseRowLock(toRelease);

{code}
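
For illustration, here is a minimal sketch of the try/finally shape those TODOs are pointing at: the updatesLock read lock and any row locks acquired so far get released on every exit path, including an interrupt- or timeout-triggered one.  The class, method, and helper names below are hypothetical stand-ins, not the actual HRegion code.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified sketch of the guard pattern the TODOs suggest;
// not the real HRegion code.
public class LockGuardSketch {
  private final ReadWriteLock updatesLock = new ReentrantReadWriteLock();

  void doMiniBatchSketch(List<byte[]> rows) {
    List<Integer> acquiredLocks = new ArrayList<Integer>();
    boolean locked = false;
    try {
      for (byte[] row : rows) {
        acquiredLocks.add(obtainRowLock(row));  // may fail partway through
      }
      updatesLock.readLock().lock();
      locked = true;
      // ... apply the batch edits here ...
    } finally {
      if (locked) {
        updatesLock.readLock().unlock();
      }
      for (Integer lockId : acquiredLocks) {    // released on every exit path
        releaseRowLock(lockId);
      }
    }
  }

  // Stand-ins for HRegion's row lock bookkeeping; hypothetical for this sketch.
  private Integer obtainRowLock(byte[] row) { return System.identityHashCode(row); }
  private void releaseRowLock(Integer lockId) { /* no-op in this sketch */ }
}
{code}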
                
> rowlock release problem with thread interruptions in batchMutate
> ----------------------------------------------------------------
>
>                 Key: HBASE-7711
>                 URL: https://issues.apache.org/jira/browse/HBASE-7711
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jonathan Hsieh
>
> An earlier version of snapshots would thread-interrupt operations.  In longer-term testing we ran into an exception stack trace that indicated that a rowlock was taken and never released.
> {code}
> 2013-01-26 01:54:56,417 ERROR org.apache.hadoop.hbase.procedure.ProcedureMember: Propagating foreign exception to subprocedure pe-1
> org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via timer-java.util.Timer@1cea3151:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1359194035004, End:1359194095004, diff:60000, max:60000 ms
>         at org.apache.hadoop.hbase.errorhandling.ForeignException.deserialize(ForeignException.java:184)
>         at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.abort(ZKProcedureMemberRpcs.java:321)
>         at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.watchForAbortedProcedures(ZKProcedureMemberRpcs.java:150)
>         at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$200(ZKProcedureMemberRpcs.java:56)
>         at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:112)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:315)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: org.apache.hadoop.hbase.errorhandling.TimeoutException: Timeout elapsed! Source:Timeout caused Foreign Exception Start:1359194035004, End:1359194095004, diff:60000, max:60000 ms
>         at org.apache.hadoop.hbase.errorhandling.TimeoutExceptionInjector$1.run(TimeoutExceptionInjector.java:71)
>         at java.util.TimerThread.mainLoop(Timer.java:512)
>         at java.util.TimerThread.run(Timer.java:462)
> 2013-01-26 01:54:56,648 WARN org.apache.hadoop.hbase.regionserver.HRegion: Failed getting lock in batch put, row=0001558252
> java.io.IOException: Timed out on getting lock for row=0001558252
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalObtainRowLock(HRegion.java:3239)
>         at org.apache.hadoop.hbase.regionserver.HRegion.getLock(HRegion.java:3315)
>         at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2150)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2021)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3511)
>         at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
> ....
> .. every snapshot attempt that used this region for the next two days encountered this problem.
> {code}
> Snapshots will now bypass this problem with the fix in HBASE-7703.  However, we should make sure HBase regionserver operations are safe when interrupted.
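
As an illustration of what "safe when interrupted" could look like at a lock-acquisition site, here is a hedged sketch: the wait on a lock converts an interrupt into an IOException and restores the thread's interrupt flag, so the caller's cleanup (such as the finally block above) still runs and no lock is left dangling.  Class and method names here are hypothetical, not HRegion's actual internalObtainRowLock.

{code}
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of interrupt-safe lock acquisition; not HRegion code.
public class InterruptSafeRowLock {
  private final ReentrantLock rowLock = new ReentrantLock();

  void lockRow(long timeoutMs) throws IOException {
    try {
      // tryLock throws InterruptedException if the waiting thread is interrupted.
      if (!rowLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
        throw new IOException("Timed out on getting lock");
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();  // restore the interrupt flag
      InterruptedIOException iioe =
          new InterruptedIOException("Interrupted while waiting for row lock");
      iioe.initCause(ie);
      throw iioe;
    }
  }

  void unlockRow() {
    rowLock.unlock();
  }
}
{code}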

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
