hbase-dev mailing list archives

From ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
Subject Trunk hangs after a stop/start of RegionServer
Date Wed, 11 Mar 2015 11:07:09 GMT
Hi All

The latest trunk hangs after we stop and start the Region Server, with the
following error:

org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via stobdtserver3,16040,1426090566331:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/flush-table-proc/acquired/TestTable/stobdtserver3,16040,1426090566331
        at org.apache.hadoop.hbase.errorhandling.ForeignException.deserialize(ForeignException.java:171)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.abort(ZKProcedureMemberRpcs.java:329)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.watchForAbortedProcedures(ZKProcedureMemberRpcs.java:142)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.start(ZKProcedureMemberRpcs.java:352)
        at org.apache.hadoop.hbase.procedure.flush.RegionServerFlushTableProcedureManager.start(RegionServerFlushTableProcedureManager.java:102)
        at org.apache.hadoop.hbase.procedure.RegionServerProcedureManagerHost.start(RegionServerProcedureManagerHost.java:53)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:882)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: java.io.IOException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/flush-table-proc/acquired/TestTable/stobdtserver3,16040,1426090566331
        at org.apache.hadoop.hbase.procedure.Subprocedure.cancel(Subprocedure.java:273)
        at org.apache.hadoop.hbase.procedure.ProcedureMember.controllerConnectionFailure(ProcedureMember.java:225)
        at org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAcquired(ZKProcedureMemberRpcs.java:254)
        at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:166)
        at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)


We get the same error even when we try to flush. Because of this the system
hangs and we cannot perform any further operations, particularly after we
restart the region server.

I have a single-RS, single-master installation for internal testing. Any
hints on why this happens? It was not happening before the update that I
took 3 days ago.

Regards
Ram
