hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4341) HRS#closeAllRegions should take care of HRS#onlineRegions's weak consistency
Date Wed, 07 Sep 2011 06:33:10 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098646#comment-13098646 ]

stack commented on HBASE-4341:
------------------------------

You are a good man.

> HRS#closeAllRegions should take care of HRS#onlineRegions's weak consistency
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4341
>                 URL: https://issues.apache.org/jira/browse/HBASE-4341
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.4
>            Reporter: Jieshan Bean
>            Assignee: Jieshan Bean
>             Fix For: 0.90.5
>
>
> This is why "https://builds.apache.org/job/hbase-0.90/282" failed. In that build, one test case timed out, which caused the whole test process to be killed.
> [logs]
> Here are the related logs (from org.apache.hadoop.hbase.mapreduce.TestTableMapReduce-output.txt):
> {noformat}
> 2011-08-31 10:09:01,089 INFO  [RegionServer:0;vesta.apache.org,52257,1314785332968.leaseChecker]
regionserver.Leases(124): RegionServer:0;vesta.apache.org,52257,1314785332968.leaseChecker
closing leases
> 2011-08-31 10:09:01,089 INFO  [RegionServer:0;vesta.apache.org,52257,1314785332968.leaseChecker]
regionserver.Leases(131): RegionServer:0;vesta.apache.org,52257,1314785332968.leaseChecker
closed leases
> 2011-08-31 10:09:01,403 INFO  [RegionServer:0;vesta.apache.org,52257,1314785332968] regionserver.HRegionServer(709):
Waiting on 1 regions to close
> 2011-08-31 10:09:01,403 DEBUG [RegionServer:0;vesta.apache.org,52257,1314785332968] regionserver.HRegionServer(713):
{74a7a8befdf9561dc1d90c4241afeac7=mrtest,uuu,1314785328546.74a7a8befdf9561dc1d90c4241afeac7.}
> 2011-08-31 10:09:01,697 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> 2011-08-31 10:09:02,697 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> 2011-08-31 10:09:03,008 INFO  [vesta.apache.org:50036.timeoutMonitor] hbase.Chore(79):
vesta.apache.org:50036.timeoutMonitor exiting
> 2011-08-31 10:09:03,697 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> 2011-08-31 10:09:04,697 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> 2011-08-31 10:09:05,698 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> 2011-08-31 10:09:06,698 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> 2011-08-31 10:09:07,698 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> {noformat}
> [Analysis]
> One region was opened while the RS was stopping.
> This is the method "HRS#closeAllRegions":
> {noformat}
>   protected void closeAllRegions(final boolean abort) {
>     closeUserRegions(abort);
>     // ------------------------- (a region added to onlineRegions while closeUserRegions was iterating may still be open here)
>     if (meta != null) closeRegion(meta.getRegionInfo(), abort, false);
>     if (root != null) closeRegion(root.getRegionInfo(), abort, false);
>   }
> {noformat}
> HRS#onlineRegions is a ConcurrentHashMap, whose iterators are weakly consistent. Walking the map may therefore miss entries that are added during the traversal. Once a region is missed, it is never closed, and the regionserver cannot stop normally. The following logs then appear:
> {noformat}
> 2011-08-31 10:09:01,403 INFO  [RegionServer:0;vesta.apache.org,52257,1314785332968] regionserver.HRegionServer(709):
Waiting on 1 regions to close
> 2011-08-31 10:09:01,403 DEBUG [RegionServer:0;vesta.apache.org,52257,1314785332968] regionserver.HRegionServer(713):
{74a7a8befdf9561dc1d90c4241afeac7=mrtest,uuu,1314785328546.74a7a8befdf9561dc1d90c4241afeac7.}
> 2011-08-31 10:09:01,697 INFO  [Master:0;vesta.apache.org:50036] master.ServerManager(465):
Waiting on regionserver(s) to go down vesta.apache.org,52257,1314785332968
> {noformat}
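
The weak-consistency problem described above can be sketched in isolation. This is a minimal illustration, not the patch attached to this issue: the class RegionCloseSketch, the map of strings standing in for HRS#onlineRegions, and the methods closePass and closeUntilEmpty are all hypothetical names. It shows that a single pass over a ConcurrentHashMap gives no guarantee about an entry added mid-traversal, and that re-checking the map after each pass is one possible way to avoid leaving such an entry behind.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RegionCloseSketch {
    // Hypothetical stand-in for HRS#onlineRegions (encoded name -> region).
    static final Map<String, String> onlineRegions = new ConcurrentHashMap<>();

    // One traversal: "closes" (removes) every entry the iterator returns.
    // The optional hook runs once mid-traversal to simulate a region
    // coming online while the server is shutting down.
    static int closePass(Runnable duringPass) {
        int closed = 0;
        for (String name : onlineRegions.keySet()) {
            onlineRegions.remove(name);
            closed++;
            if (duringPass != null) {
                duringPass.run();
                duringPass = null; // simulate only one late arrival
            }
        }
        return closed;
    }

    // Assumed mitigation: repeat the traversal until the map is empty,
    // so an entry the weakly consistent iterator skipped is picked up
    // by a later pass. Returns the number of passes that were needed.
    static int closeUntilEmpty(Runnable duringFirstPass) {
        int passes = 0;
        Runnable hook = duringFirstPass;
        while (!onlineRegions.isEmpty()) {
            closePass(hook);
            hook = null;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        onlineRegions.put("aaa", "region-1");
        onlineRegions.put("bbb", "region-2");
        // A third region is added while the first pass is running.
        int passes = closeUntilEmpty(() -> onlineRegions.put("late", "region-3"));
        System.out.println("passes=" + passes + ", remaining=" + onlineRegions.size());
    }
}
```

Whether the first pass sees the late entry is unspecified, which is exactly the hazard; the loop only guarantees that the map is empty when it returns. A real shutdown path would also need to stop new regions from opening, otherwise such a loop might never terminate.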

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
