hbase-issues mailing list archives

From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17704) Regions stuck in FAILED_OPEN when HDFS blocks are missing
Date Wed, 01 Mar 2017 23:33:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891281#comment-15891281 ]

Gary Helmling commented on HBASE-17704:

So HBASE-16209 added a backoff policy for retries of region open, without which regions would
go into FAILED_OPEN quickly.  So maybe all that's needed is to bump the configuration for
maximum attempts ("hbase.assignment.maximum.attempts") up to Integer.MAX_VALUE?
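
For reference, a minimal hbase-site.xml sketch of that change (the property name is the one quoted above; 2147483647 is Integer.MAX_VALUE, making the retry count effectively unbounded):

    <property>
      <name>hbase.assignment.maximum.attempts</name>
      <!-- Integer.MAX_VALUE: keep retrying region open instead of giving up into FAILED_OPEN -->
      <value>2147483647</value>
    </property>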

> Regions stuck in FAILED_OPEN when HDFS blocks are missing
> ---------------------------------------------------------
>                 Key: HBASE-17704
>                 URL: https://issues.apache.org/jira/browse/HBASE-17704
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 1.1.8
>            Reporter: Mathias Herberts
> We recently experienced the loss of a whole rack (6 DNs + RS) in a 120-node cluster.
> This led to the regions that were hosted on the 6 unavailable RSs being reassigned
> to live RSs. When attempting to open some of the reassigned regions, some RSs encountered missing
> blocks and issued "No live nodes contain current block Block locations", putting the regions
> in state FAILED_OPEN.
> Once the disappeared DNs came back online, the regions were left in FAILED_OPEN, requiring
> a restart of all the affected RSs to resolve the problem.

This message was sent by Atlassian JIRA
