hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-3381) Interrupt of a region open comes across as a successful open
Date Tue, 21 Dec 2010 23:50:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3381:
-------------------------

    Attachment: 3381.txt

Here is the patch.  I've not been able to repro the condition during the last few hours of
testing, so I would like to commit this (need a +1 -- Jon?).  While in here, I did some cleanup
of hbck messages and stopped it claiming an error when it sees an offlined split parent.  Also
added logging around the fixup of the case where the parent offlining edit got in but not the
daughter additions; needed debugging.

> Interrupt of a region open comes across as a successful open
> ------------------------------------------------------------
>
>                 Key: HBASE-3381
>                 URL: https://issues.apache.org/jira/browse/HBASE-3381
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.90.0
>
>         Attachments: 3381.txt
>
>
> Meta was offline when below happened:
> {code}
> 2010-12-21 19:45:23,023 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12d0a53c540000e Attempting to transition node 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
> 2010-12-21 19:45:23,046 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12d0a53c540000e Successfully transitioned node 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
> 2010-12-21 19:45:26,379 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Interrupting thread Thread[PostOpenDeployTasks:337038b50e467fbd6b031f278bbd9c22,5,main]
> 2010-12-21 19:45:26,379 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12d0a53c540000e Attempting to transition node 337038b50e467fbd6b031f278bbd9c22 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
> 2010-12-21 19:45:26,381 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception running postOpenDeployTasks; region=337038b50e467fbd6b031f278bbd9c22
> org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Interrupted
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
>     at org.apache.hadoop.hbase.catalog.MetaEditor.updateRegionLocation(MetaEditor.java:146)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.postOpenDeployTasks(HRegionServer.java:1331)
>     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler$PostOpenDeployTasksThread.run(OpenRegionHandler.java:195)
> ...
> {code}
> So, we timed out trying to open the region, but rather than closing the region because the edit failed, we missed seeing the InterruptedException.
> Here is suggested fix:
> {code}
> diff --git a/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java b/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> index 7bf680d..2b0078c 100644
> --- a/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> +++ b/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
> @@ -339,7 +339,7 @@ public class MetaReader {
>      get.addFamily(HConstants.CATALOG_FAMILY);
>      byte [] meta = getCatalogRegionNameForRegion(regionName);
>      Result r = catalogTracker.waitForMetaServerConnectionDefault().get(meta, get);
> -    if(r == null || r.isEmpty()) {
> +    if (r == null || r.isEmpty()) {
>        return null;
>      }
>      return metaRowToRegionPair(r);
> {code}
> Let me try it.
> W/o this, what we see is hbck showing that the region is on server X, but in .META. it shows as being on Y (its pre-balance server).
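
The failure mode described above (a timed-out, interrupted post-open step
being treated as a successful open) can be illustrated with a small
standalone sketch.  This is not HBase code and not the attached patch: the
class InterruptedOpenSketch, the PostOpenTask thread, the openRegion helper
and the CountDownLatch standing in for "meta is online" are all hypothetical
names chosen for illustration.  The idea it tries to show is that the caller
should decide success from an explicit flag set by the worker, rather than
from the mere fact that the worker thread has ended.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; none of these names come from HBase.
class InterruptedOpenSketch {

    /** Stand-in for post-open work (such as a catalog edit) that blocks while meta is offline. */
    static final class PostOpenTask extends Thread {
        private final CountDownLatch metaOnline;
        private final AtomicBoolean succeeded = new AtomicBoolean(false);

        PostOpenTask(CountDownLatch metaOnline) {
            this.metaOnline = metaOnline;
        }

        @Override
        public void run() {
            try {
                metaOnline.await();    // blocks until "meta" becomes available
                succeeded.set(true);   // reached only if the wait was not interrupted
            } catch (InterruptedException e) {
                // Leave succeeded == false and restore the interrupt status.
                Thread.currentThread().interrupt();
            }
        }

        boolean succeeded() {
            return succeeded.get();
        }
    }

    /** Returns true only if the post-open task completed its work within the timeout. */
    static boolean openRegion(CountDownLatch metaOnline, long timeoutMillis) {
        PostOpenTask task = new PostOpenTask(metaOnline);
        task.start();
        try {
            task.join(timeoutMillis);
            if (task.isAlive()) {
                task.interrupt();      // timed out: interrupt the worker
                task.join();           // wait for it to wind down
            }
        } catch (InterruptedException e) {
            task.interrupt();
            Thread.currentThread().interrupt();
            return false;
        }
        // The thread having finished is not enough; check the explicit flag.
        return task.succeeded();
    }

    public static void main(String[] args) {
        System.out.println("meta offline: opened=" + openRegion(new CountDownLatch(1), 100));
        System.out.println("meta online:  opened=" + openRegion(new CountDownLatch(0), 2000));
    }
}
```

With a check like this, the caller would have the information it needs to
close the region instead of reporting it as opened when the edit never went
through.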

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
