hbase-dev mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-21147) (1.4) Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes
Date Wed, 05 Sep 2018 00:51:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-21147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Elser resolved HBASE-21147.
      Resolution: Fixed
    Hadoop Flags: Reviewed

> (1.4) Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes
> ------------------------------------------------------------------------------------------
>                 Key: HBASE-21147
>                 URL: https://issues.apache.org/jira/browse/HBASE-21147
>             Project: HBase
>          Issue Type: Improvement
>          Components: canary, Zookeeper
>    Affects Versions: 1.0.0, 3.0.0, 2.0.0
>            Reporter: David Manning
>            Assignee: David Manning
>            Priority: Minor
>             Fix For: 1.4.8
>         Attachments: HBASE-21126.branch-1.001.patch, HBASE-21126.master.001.patch, HBASE-21126.master.002.patch,
HBASE-21126.master.003.patch, zookeeperCanaryLocalTestValidation.txt
>   Original Estimate: 48h
>  Remaining Estimate: 48h
> When running org.apache.hadoop.hbase.tool.Canary with args -zookeeper -treatFailureAsError,
the Canary will try to get a znode from each ZooKeeper server in the ensemble. If any server
is unavailable or unresponsive, the canary will exit with a failure code.
> If we use the Canary to gauge server health, and alert accordingly, this can be too strict.
For example, in a 5-node ZooKeeper cluster, having one node down is safe and expected in rolling
> This is a request to allow the Canary to take another parameter:
> {code:java}
> -permittedZookeeperFailures <N>{code}
> If N=1, in the 5-node ZooKeeper ensemble example, then the Canary will still pass if
4 ZooKeeper nodes are reachable, but fail if 3 or fewer are reachable.
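
The pass/fail rule described above can be sketched in a few lines of Java. This is only an illustration of the threshold arithmetic, not the Canary's actual code: the class name ZkFailureThresholdSketch and the method passes are hypothetical, and the sketch assumes the Canary has already counted how many ensemble members answered.

```java
public class ZkFailureThresholdSketch {

    /**
     * Hypothetical helper: returns true when the number of unreachable
     * ZooKeeper servers does not exceed the permitted number of failures.
     */
    static boolean passes(int ensembleSize, int reachable, int permittedFailures) {
        if (ensembleSize < 0 || reachable < 0 || reachable > ensembleSize || permittedFailures < 0) {
            throw new IllegalArgumentException("invalid counts");
        }
        int failures = ensembleSize - reachable;
        return failures <= permittedFailures;
    }

    public static void main(String[] args) {
        int ensembleSize = 5;      // the 5-node ensemble from the example
        int permittedFailures = 1; // N=1
        for (int reachable = ensembleSize; reachable >= 3; reachable--) {
            System.out.println(reachable + " reachable -> "
                + (passes(ensembleSize, reachable, permittedFailures) ? "pass" : "fail"));
        }
    }
}
```

With N=0 the same check reduces to the strict behaviour described earlier, where any single unreachable server is a failure.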
> (This is my first Jira posting... sorry if I messed anything up.)

This message was sent by Atlassian JIRA
