hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-11282) Load balancer may move a region which is participating in snapshot
Date Mon, 02 Jun 2014 16:11:03 GMT
Ted Yu created HBASE-11282:
------------------------------

             Summary: Load balancer may move a region which is participating in snapshot
                 Key: HBASE-11282
                 URL: https://issues.apache.org/jira/browse/HBASE-11282
             Project: HBase
          Issue Type: Bug
            Reporter: Ted Yu


The region was tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.
From the master log:
{code}
2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Found an existing plan for tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. destination server is h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812 accepted as a dest server = true
2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Using pre-existing plan for tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7.; plan=hri=tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7., src=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165, dest=h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
2014-03-10 23:48:09,035 INFO  [AM.ZK.Worker-pool2-t42] master.RegionStates: Transitioned {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=CLOSED, ts=1394495289035, server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165} to {289ebdee6adf0a3b9c2bbcbe2ff522e7 state=OFFLINE, ts=1394495289035, server=h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165}
2014-03-10 23:48:09,035 DEBUG [AM.ZK.Worker-pool2-t42] zookeeper.ZKAssign: master:60000-0x244aa9920190b04, quorum=h2-ubuntu12-sec-1394425849-hbase-8.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-1.cs1cloud.internal:2181,h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal:2181, baseZNode=/hbase Creating (or updating) unassigned node 289ebdee6adf0a3b9c2bbcbe2ff522e7 with OFFLINE state
2014-03-10 23:48:09,044 INFO  [AM.ZK.Worker-pool2-t42] master.AssignmentManager: Assigning tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. to h2-ubuntu12-sec-1394425849-hbase-4.cs1cloud.internal,60020,1394494963812
{code}
From hbase-hbase-regionserver-h2-ubuntu12-sec-1394425849-hbase-9.log:
{code}
2014-03-10 23:48:08,487 WARN  [member: 'h2-ubuntu12-sec-1394425849-hbase-9.cs1cloud.internal,60020,1394494962165' subprocedure-pool1-thread-1] snapshot.RegionServerSnapshotManager: Got Exception in SnapshotSubprocedurePool
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.NotServingRegionException: tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
  at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
  at java.util.concurrent.FutureTask.get(FutureTask.java:83)
  at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:325)
  at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
  at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
  at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
  at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.hbase.NotServingRegionException: tableone,,1394495094967.289ebdee6adf0a3b9c2bbcbe2ff522e7. is closing
  at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5699)
  at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5663)
  at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
  at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:65)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
{code}
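The load balancer moved the region while it was participating in a snapshot. The region was closing on the source server when FlushSnapshotSubprocedure tried to start a region operation on it, so the subprocedure failed with NotServingRegionException.

A mechanism that makes the load balancer aware of in-progress region operations is desirable, so that a snapshot doesn't fail in the scenario above.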
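
One possible shape for such a mechanism, as a minimal sketch only: before the balancer's move plans are executed, drop every plan whose table currently has a snapshot in progress. The class SnapshotAwarePlanFilter, the MovePlan type and the tablesInSnapshot set are all hypothetical names made up for illustration. They are not HBase classes or APIs, and this is not necessarily how the issue should be fixed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types exist in HBase.
final class SnapshotAwarePlanFilter {

    /** Illustrative stand-in for a balancer move plan (assumed shape). */
    static final class MovePlan {
        final String table;
        final String encodedRegionName;
        final String source;
        final String destination;

        MovePlan(String table, String encodedRegionName,
                 String source, String destination) {
            this.table = table;
            this.encodedRegionName = encodedRegionName;
            this.source = source;
            this.destination = destination;
        }
    }

    /**
     * Returns the plans that are safe to execute: any plan for a table
     * named in tablesInSnapshot is skipped, so its regions stay on their
     * current server. The input list is not modified.
     */
    static List<MovePlan> dropPlansForSnapshottingTables(
            List<MovePlan> plans, Set<String> tablesInSnapshot) {
        List<MovePlan> kept = new ArrayList<>();
        for (MovePlan plan : plans) {
            if (!tablesInSnapshot.contains(plan.table)) {
                kept.add(plan);
            }
        }
        return kept;
    }
}
```

In the scenario above, the plan that moves 289ebdee6adf0a3b9c2bbcbe2ff522e7 from hbase-9 to hbase-4 would be dropped while tableone is being snapshotted, and the balancer could retry it on a later run. A real fix would also have to close the window between checking the set and executing the plan.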



--
This message was sent by Atlassian JIRA
(v6.2#6252)
