hadoop-common-dev mailing list archives

From "Paul Rubio (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-10684) Extend HA support for more use cases
Date Wed, 11 Jun 2014 21:53:03 GMT
Paul Rubio created HADOOP-10684:

             Summary: Extend HA support for more use cases
                 Key: HADOOP-10684
                 URL: https://issues.apache.org/jira/browse/HADOOP-10684
             Project: Hadoop Common
          Issue Type: Improvement
          Components: ha
            Reporter: Paul Rubio
            Priority: Minor

We'd like the current HA framework to be more configurable from a behavior standpoint.  In particular:
- Add the ability for an HAServiceTarget to survive a configurable number of health check failures
(default of 0) before HealthMonitor (HM) reports service not responding or service unhealthy.
 For instance, base the HM on a state machine whose default implementation can be overridden
by a method or constructor argument.  The default would behave the same as today.
-- If a target fails a health check but does not exceed the maximum number of consecutive
check failures, it would be desirable for the target and/or controller to be alerted.
--- e.g. introduce a SERVICE_DYING state
-- Additionally, it would be desirable to have a mechanism, similar to fencing semantics,
for “reviving” a service that has transitioned to SERVICE_DYING.
--- e.g. attemptRevive(…)
- Add the ability to allow a service to completely fail (no failover or failback possible).
 There are scenarios where allowing a failover or failback could cause more damage.
-- E.g. a recovered master with stale data.  The master may have been manually recovered (human
- Add affinity to a particular HAServiceTarget.
-- In other words, allow the controller to prefer one target over another when deciding leadership.
-- If a higher-affinity but previously unhealthy target becomes healthy, it should be
allowed to become the leader.
-- Likewise, if two targets are racing for a ZooKeeper lock, the controller should "prefer"
the higher-affinity target.
-- It might make more sense to add a different implementation/subclass of the ZKFailoverController
(e.g. ZKAffinityFailoverController) than to modify the current behavior.
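
To make the first item more concrete, here is a rough, standalone sketch of the failure-tolerance idea. It does not use or modify any existing Hadoop class: HealthCheckTolerance, onCheckResult and maxConsecutiveFailures are hypothetical names, the state set is simplified, and SERVICE_DYING is the state proposed above rather than something that exists today.

```java
// Standalone sketch: none of these types are existing Hadoop classes.
class HealthCheckTolerance {
    /** Simplified, hypothetical states; SERVICE_DYING is the state proposed above. */
    public enum State { SERVICE_HEALTHY, SERVICE_DYING, SERVICE_UNHEALTHY }

    // Assumed configuration knob; 0 means the first failure is reported at once.
    private final int maxConsecutiveFailures;
    private int consecutiveFailures = 0;

    public HealthCheckTolerance(int maxConsecutiveFailures) {
        if (maxConsecutiveFailures < 0) {
            throw new IllegalArgumentException("maxConsecutiveFailures must be >= 0");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Feeds in one health check result and returns the state to report. */
    public State onCheckResult(boolean healthy) {
        if (healthy) {
            // Any successful check resets the failure streak.
            consecutiveFailures = 0;
            return State.SERVICE_HEALTHY;
        }
        consecutiveFailures++;
        // Within the tolerated streak the target is "dying"; beyond it, unhealthy.
        return consecutiveFailures > maxConsecutiveFailures
                ? State.SERVICE_UNHEALTHY
                : State.SERVICE_DYING;
    }

    public static void main(String[] args) {
        HealthCheckTolerance tolerant = new HealthCheckTolerance(2);
        boolean[] results = {false, false, true, false, false, false};
        for (boolean healthy : results) {
            System.out.println(healthy + " -> " + tolerant.onCheckResult(healthy));
        }
    }
}
```

With a tolerance of 0 the first failed check is reported as unhealthy right away, which is meant to mirror the "default would behave the same as today" requirement. A hook such as attemptRevive(…) could be triggered whenever SERVICE_DYING is returned.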

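The affinity item could be sketched along the following lines. Again this is only an illustration under assumptions: AffinityElection, Candidate and preferredLeader are hypothetical names, affinity is assumed to be a plain integer where larger means more preferred, and the ZooKeeper lock handling that a real controller would need is left out entirely.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Standalone sketch: none of these types are existing Hadoop classes.
class AffinityElection {
    /** Hypothetical stand-in for a target together with its health and affinity. */
    public static final class Candidate {
        public final String name;
        public final int affinity;    // assumption: larger value = more preferred
        public final boolean healthy;

        public Candidate(String name, int affinity, boolean healthy) {
            this.name = name;
            this.affinity = affinity;
            this.healthy = healthy;
        }
    }

    /** Returns the healthy candidate with the highest affinity, if any is healthy. */
    public static Optional<Candidate> preferredLeader(List<Candidate> candidates) {
        return candidates.stream()
                .filter(c -> c.healthy)
                .max(Comparator.comparingInt((Candidate c) -> c.affinity));
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate("target-a", 10, false),
                new Candidate("target-b", 5, true));
        // target-a has the higher affinity but is unhealthy, so target-b is chosen.
        System.out.println(preferredLeader(candidates).map(c -> c.name).orElse("none"));
    }
}
```

Re-running the selection whenever a target's health changes would let a higher-affinity target reclaim leadership once it becomes healthy again, as described in the list above.
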
Please comment with thoughts/ideas/etc...

This message was sent by Atlassian JIRA
