hbase-issues mailing list archives

From "Esteban Gutierrez (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12131) [hbck] undeployRegions should handle gracefully network partitions and other exceptions to avoid the same region deployed multiple times
Date Wed, 31 Dec 2014 07:48:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262015#comment-14262015 ]

Esteban Gutierrez commented on HBASE-12131:
-------------------------------------------

Created HBASE-12793 to address the logging and handling of the IOException in HBaseFsckRepair.closeRegionSilentlyAndWait().

> [hbck] undeployRegions should handle gracefully network partitions and other exceptions to avoid the same region deployed multiple times
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-12131
>                 URL: https://issues.apache.org/jira/browse/HBASE-12131
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck
>    Affects Versions: 0.94.23
>            Reporter: Esteban Gutierrez
>            Assignee: Esteban Gutierrez
>            Priority: Critical
>
> If we get an IOException while hbck is undeploying regions (we currently ignore it),
> we should make sure the master does not re-assign that region before we know that the
> RS was marked as dead, and optionally let the user confirm that action. Otherwise we
> will end up in a split-brain situation, with clients talking to different RSs serving
> the same region.
> The offending part is here in HBaseFsck.undeployRegions():
> {code}
> private void undeployRegions(HbckInfo hi) throws IOException, InterruptedException {
>   for (OnlineEntry rse : hi.deployedEntries) {
>     LOG.debug("Undeploy region " + rse.hri + " from " + rse.hsa);
>     try {
>       HBaseFsckRepair.closeRegionSilentlyAndWait(admin, rse.hsa, rse.hri);
>       offline(rse.hri.getRegionName());
>     } catch (IOException ioe) {
>       LOG.warn("Got exception when attempting to offline region "
>           + Bytes.toString(rse.hri.getRegionName()), ioe);
>     }
>   }
> }
> {code}
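
For illustration only, here is a rough sketch of the behaviour the description asks for: treat a region as safe to offline only when every close attempt either succeeded or failed against a server already known to be dead. This is not the HBASE-12131 or HBASE-12793 patch. `UndeploySketch`, `RegionCloser`, `ServerLiveness` and `undeployRegion` are hypothetical names invented for the example, and it assumes the close call propagates the IOException instead of swallowing it.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are HBase classes.
class UndeploySketch {

  /** Stand-in for the region close call; assumed to throw on failure. */
  interface RegionCloser {
    void closeAndWait(String server, String region) throws IOException;
  }

  /** Stand-in for asking whether a region server has been marked dead. */
  interface ServerLiveness {
    boolean isMarkedDead(String server);
  }

  /**
   * Tries to close the region on every server that reports it as deployed.
   * Returns true only if each close succeeded or the failing server is
   * already marked dead, i.e. only when offlining cannot cause a double
   * deployment.
   */
  static boolean undeployRegion(String region, List<String> servers,
      RegionCloser closer, ServerLiveness liveness) {
    boolean safeToOffline = true;
    for (String server : servers) {
      try {
        closer.closeAndWait(server, region);
      } catch (IOException ioe) {
        if (liveness.isMarkedDead(server)) {
          continue; // a dead server cannot still be serving the region
        }
        // The server may still be serving the region: do not offline it.
        System.err.println("Could not confirm close of " + region
            + " on " + server + ": " + ioe.getMessage());
        safeToOffline = false;
      }
    }
    return safeToOffline;
  }

  public static void main(String[] args) {
    // Simulate a network partition between hbck and rs2.
    RegionCloser partitioned = (server, region) -> {
      if ("rs2".equals(server)) {
        throw new IOException("simulated network partition");
      }
    };
    List<String> servers = Arrays.asList("rs1", "rs2");
    System.out.println("rs2 not marked dead: "
        + undeployRegion("r1", servers, partitioned, s -> false));
    System.out.println("rs2 marked dead:     "
        + undeployRegion("r1", servers, partitioned, s -> "rs2".equals(s)));
  }
}
```

In this sketch the caller would invoke offline() only when undeployRegion() returns true, and could otherwise stop and ask the user to confirm, as the description suggests.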



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
