hbase-issues mailing list archives

From "Heng Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14820) Region becomes unavailable after a region split is rolled back
Date Tue, 17 Nov 2015 23:48:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009859#comment-15009859 ]

Heng Chen commented on HBASE-14820:
-----------------------------------

Oh, as the region server log shows, when the split was rolled back, the split state was OFFLINED_PARENT.

The rollback logic for the OFFLINED_PARENT state just adds the parent region back to the
online regions. But the parent region was already closed at the CLOSED_PARENT_REGION step.
The related code is below:

{code}
      case CLOSED_PARENT_REGION:
        try {
          // So, this returns a seqid but if we just closed and then reopened, we
          // should be ok. On close, we flushed using sequenceid obtained from
          // hosting regionserver so no need to propagate the sequenceid returned
          // out of initialize below up into regionserver as we normally do.
          // TODO: Verify.
          this.parent.initialize();
        } catch (IOException e) {
          LOG.error("Failed rollbacking CLOSED_PARENT_REGION of region " +
            parent.getRegionInfo().getRegionNameAsString(), e);
          throw new RuntimeException(e);
        }
        break;
      .......
      case OFFLINED_PARENT:
        if (services != null) services.addToOnlineRegions(this.parent);
        break;
{code}


IMO, before we add the parent region back to the online regions, we should call
{{this.parent.initialize()}}. Thoughts?
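
To make the suggestion concrete, here is a minimal, self-contained sketch of the scenario described above. It is not HBase code: {{Region}}, {{Services}}, {{splitUpToOfflinedParent}} and {{rollbackOfflinedParent}} are hypothetical stand-ins invented for illustration, and the sketch assumes that rollback handles only the OFFLINED_PARENT state, as this comment describes. The real SplitTransaction logic is more involved.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitRollbackSketch {

    /** Hypothetical stand-in for a region; NOT the real HRegion class. */
    static class Region {
        final String name;
        private boolean closed;

        Region(String name) { this.name = name; }

        void close() { closed = true; }          // models closing the parent during a split
        void initialize() { closed = false; }    // models reopening the region
        boolean isClosed() { return closed; }
    }

    /** Hypothetical stand-in for the region server's online-region registry. */
    static class Services {
        private final Map<String, Region> online = new HashMap<>();

        void addToOnlineRegions(Region r) { online.put(r.name, r); }
        void removeFromOnlineRegions(Region r) { online.remove(r.name); }
        boolean isListedOnline(String name) { return online.containsKey(name); }

        /** A region can serve requests only if it is listed online AND open. */
        boolean canServe(String name) {
            Region r = online.get(name);
            return r != null && !r.isClosed();
        }
    }

    /** Models a split that progressed to OFFLINED_PARENT before failing. */
    static Services splitUpToOfflinedParent(Region parent) {
        Services services = new Services();
        services.addToOnlineRegions(parent);
        parent.close();                            // step CLOSED_PARENT_REGION
        services.removeFromOnlineRegions(parent);  // step OFFLINED_PARENT
        return services;
    }

    /**
     * Models the OFFLINED_PARENT rollback branch.
     * initializeFirst=false mirrors the quoted code; true mirrors the proposal.
     */
    static void rollbackOfflinedParent(Region parent, Services services,
                                       boolean initializeFirst) {
        if (initializeFirst && parent.isClosed()) {
            parent.initialize();                   // proposed: reopen before re-adding
        }
        if (services != null) {
            services.addToOnlineRegions(parent);
        }
    }

    public static void main(String[] args) {
        for (boolean initializeFirst : new boolean[] {false, true}) {
            Region parent = new Region("parent");
            Services services = splitUpToOfflinedParent(parent);
            rollbackOfflinedParent(parent, services, initializeFirst);
            System.out.println("initializeFirst=" + initializeFirst
                + " listedOnline=" + services.isListedOnline("parent")
                + " canServe=" + services.canServe("parent"));
        }
    }
}
```

In this toy model, skipping the initialize step leaves the parent listed as online but still closed, which resembles the reported symptoms (the RS shows the region as open while requests fail). Whether the real fix is this simple would need to be checked against the actual rollback path.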

> Region becomes unavailable after a region split is rolled back
> --------------------------------------------------------------
>
>                 Key: HBASE-14820
>                 URL: https://issues.apache.org/jira/browse/HBASE-14820
>             Project: HBase
>          Issue Type: Bug
>          Components: master, regionserver
>    Affects Versions: 0.98.15
>            Reporter: Clara Xiong
>         Attachments: HBASE-14820-RegionServer.log, HBSE-14820-hmaster.log
>
>
> After the region server rolls back a timed-out region split attempt, the region becomes unavailable.
> Symptoms:
> The RS displays the region open in the web UI.
> The meta table still points to the RS
> Requests for the regions receive a NotServingRegionException. 
> hbck reports 0 inconsistencies. 
> Moving the region fails. 
> Restarting the region server fixes the problem.
> We have seen multiple occurrences, which require operator intervention.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
