hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-14012) Double Assignment and Dataloss when ServerCrashProcedure runs during Master failover
Date Thu, 02 Jul 2015 20:06:05 GMT
stack created HBASE-14012:
-----------------------------

             Summary: Double Assignment and Dataloss when ServerCrashProcedure runs during Master failover
                 Key: HBASE-14012
                 URL: https://issues.apache.org/jira/browse/HBASE-14012
             Project: HBase
          Issue Type: Bug
          Components: master, Region Assignment
    Affects Versions: 2.0.0, 1.2.0
            Reporter: stack
            Assignee: stack
            Priority: Critical


Context: an ITBLL (IntegrationTestBigLinkedList) run. The Master comes up and joins a running
cluster (all servers are up except the Master, with most regions already assigned out on the
cluster). The ProcedureStore holds two unfinished ServerCrashProcedures (RUNNABLE state). In
ServerCrashProcedure (SCP), we only check whether failover cleanup is done in the first step,
not in every step, which means a ServerCrashProcedure will run if, on reload, it is already
beyond the first step.
{code}
    // Is master fully online? If not, yield. No processing of servers unless master is up
    if (!services.getAssignmentManager().isFailoverCleanupDone()) {
      throwProcedureYieldException("Waiting on master failover to complete");
    }
{code}

There is no definitive logging, but it looks like the procedure resumes at the assign step.
The list of regions to assign was persisted before the master crashed and may no longer make
sense after the crash, i.e. here we double-assign. Still checking. Regardless, we shouldn't
run until the master is fully up.
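
To illustrate the idea of checking on every step rather than only the first, here is a minimal, self-contained toy model. It is a sketch, not a patch: ScpGuardSketch, Step, MasterView and YieldException are hypothetical names invented for this example, and the step names are simplified. Only the isFailoverCleanupDone() check and the yield message are taken from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are HBase classes.
final class ScpGuardSketch {
    // Simplified, hypothetical step names; the real procedure has its own states.
    enum Step { START, SPLIT_LOGS, ASSIGN, FINISH }

    // Stands in for a procedure yield ("retry me later").
    static final class YieldException extends RuntimeException {
        YieldException(String message) { super(message); }
    }

    // Hypothetical view of the master; only the failover flag matters here.
    interface MasterView {
        boolean isFailoverCleanupDone();
    }

    private Step step;
    private final List<Step> executed = new ArrayList<>();

    // resumeAt models a procedure reloaded from the store part-way through.
    ScpGuardSketch(Step resumeAt) { this.step = resumeAt; }

    List<Step> executed() { return executed; }

    // Runs one step; returns true while more steps remain.
    boolean executeOneStep(MasterView master) {
        // Guard evaluated on every step, not only when step == START.
        if (!master.isFailoverCleanupDone()) {
            throw new YieldException("Waiting on master failover to complete");
        }
        executed.add(step);
        switch (step) {
            case START:      step = Step.SPLIT_LOGS; return true;
            case SPLIT_LOGS: step = Step.ASSIGN;     return true;
            case ASSIGN:     step = Step.FINISH;     return true;
            default:         return false; // FINISH: nothing left to do
        }
    }

    public static void main(String[] args) {
        // Procedure reloaded at ASSIGN, as in the scenario described above.
        ScpGuardSketch scp = new ScpGuardSketch(Step.ASSIGN);
        try {
            scp.executeOneStep(() -> false); // master still failing over
        } catch (YieldException e) {
            System.out.println("yielded: " + e.getMessage());
        }
        scp.executeOneStep(() -> true); // master fully up, ASSIGN may run
        System.out.println("executed: " + scp.executed());
    }
}
```

With the guard at the top of every step, a procedure reloaded at the assign step yields until failover cleanup is done instead of assigning from a possibly stale region list.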



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
