hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-20087) Periodically attempt redeploy of regions in FAILED_OPEN state
Date Mon, 26 Feb 2018 18:28:01 GMT
Andrew Purtell created HBASE-20087:
--------------------------------------

             Summary: Periodically attempt redeploy of regions in FAILED_OPEN state
                 Key: HBASE-20087
                 URL: https://issues.apache.org/jira/browse/HBASE-20087
             Project: HBase
          Issue Type: Improvement
          Components: master, Region Assignment
            Reporter: Andrew Purtell
            Assignee: Andrew Purtell
             Fix For: 2.0.0, 1.5.0


Because RSGroups can cause permanent RIT with regions in FAILED_OPEN state, we added logic
to the master portion of the RSGroups extension to enumerate RITs and retry assignment of
regions in FAILED_OPEN state.

However, this strategy can be applied generally to reduce the need for operator involvement
in cluster operations. Today an operator has to manually resolve FAILED_OPEN assignments, but
there is little risk in automatically retrying them after a while. If the reason the
assignment failed has not cleared, the assignment will just fail again. If the reason has
been resolved, the retry succeeds and operators don't have to do anything more for the
cluster to fully heal.
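
To illustrate the idea, here is a rough, language-neutral sketch in Python of one pass of
such a periodic retry. It is not HBase code: RegionInTransition, retry_failed_open, the
assign callback, and the min_wait_secs default of 300 seconds are all hypothetical names
and values chosen for this example, and the real master would go through its own
assignment machinery.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

FAILED_OPEN = "FAILED_OPEN"
OPEN = "OPEN"


@dataclass
class RegionInTransition:
    """Hypothetical stand-in for a master-side RIT record."""
    name: str
    state: str
    last_attempt: float  # time (seconds) of the most recent assignment attempt


def retry_failed_open(
    rits: Dict[str, RegionInTransition],
    assign: Callable[[str], bool],
    now: float,
    min_wait_secs: float = 300.0,  # assumed back-off; not a real HBase setting
) -> List[str]:
    """Run one pass of the periodic check and return the regions retried.

    Only regions in FAILED_OPEN are touched, and only after they have
    waited at least min_wait_secs since the last attempt.
    """
    retried = []
    for rit in rits.values():
        if rit.state != FAILED_OPEN:
            continue  # leave every other transition alone
        if now - rit.last_attempt < min_wait_secs:
            continue  # retried too recently; wait for a later pass
        rit.last_attempt = now
        # If the original cause has not cleared, assign() fails again and
        # the region simply stays in FAILED_OPEN until the next pass.
        rit.state = OPEN if assign(rit.name) else FAILED_OPEN
        retried.append(rit.name)
    return retried


if __name__ == "__main__":
    regions = {"r1": RegionInTransition("r1", FAILED_OPEN, 0.0)}
    fixed = set()  # regions whose failure cause has been resolved
    print(retry_failed_open(regions, lambda n: n in fixed, now=400.0))
    fixed.add("r1")
    print(retry_failed_open(regions, lambda n: n in fixed, now=800.0))
    print(regions["r1"].state)
```

The sketch retries blindly rather than diagnosing the failure, which matches the argument
above: a retry that fails again leaves the region where it was, so the only cost is the
repeated attempt.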

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
