hbase-issues mailing list archives

From "Jeremy Carroll (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4393) Implement a canary monitoring program
Date Mon, 04 Jun 2012 22:33:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288967#comment-13288967
] 

Jeremy Carroll commented on HBASE-4393:
---------------------------------------

Just wanted to add a few operational comments. We have a version of this Canary script
hooked up to our current HBase cluster for monitoring. It works well for determining whether your
cluster is responding to RPCs in a healthy amount of time. It does not work well for measuring
overall request latency, because the row at getStartKey becomes cached. Requesting the same
key over and over is basically cache warming, so after a few iterations it returns in <1ms
every time.
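
To illustrate the cache-warming effect, here is a minimal toy model. It is not HBase code: CacheWarmingDemo, its LRU cache, and the key names are all made up for illustration, and a counter of "slow reads" stands in for reads that would have to go to disk. It only shows that probing one fixed key pays the slow path once, while probing distinct keys pays it every time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: an LRU cache in front of a "slow" store. Not HBase's block cache. */
final class CacheWarmingDemo {
    private final Map<String, String> cache;
    private int slowReads = 0;

    CacheWarmingDemo(final int capacity) {
        // An access-ordered LinkedHashMap evicts the least recently used entry.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true when the read was served from the cache. */
    boolean read(String key) {
        if (cache.get(key) != null) {
            return true;
        }
        slowReads++; // stands in for a read that has to go to disk
        cache.put(key, "value-of-" + key);
        return false;
    }

    int slowReads() {
        return slowReads;
    }

    /** Probes the same (hypothetical) key every time, like a canary reading a region's start key. */
    static int slowReadsForFixedKey(int probes) {
        CacheWarmingDemo store = new CacheWarmingDemo(100);
        for (int i = 0; i < probes; i++) {
            store.read("region-start-key");
        }
        return store.slowReads();
    }

    /** Probes a different (hypothetical) key every time, so no probe can hit the cache. */
    static int slowReadsForDistinctKeys(int probes) {
        CacheWarmingDemo store = new CacheWarmingDemo(100);
        for (int i = 0; i < probes; i++) {
            store.read("row-" + i);
        }
        return store.slowReads();
    }

    public static void main(String[] args) {
        System.out.println("fixed key, slow reads: " + slowReadsForFixedKey(1000));
        System.out.println("distinct keys, slow reads: " + slowReadsForDistinctKeys(1000));
    }
}
```

With a fixed key only the first probe misses, so every later timing reflects the cache rather than the storage path.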

We played around with the idea of issuing a random request within the RegionServer to get uncached
latency responses. In that scenario we are basically testing our disk latency. IMHO the intention
of the Canary is not to test disk response but the overall response / health of the HBase
RegionServer. We took the approach of using the fsLatency histogram metrics (99th and 99.9th percentiles)
in a separate check, in addition to the Canary, for overall health status.
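
As a rough sketch of how the two checks could be combined, the example below treats the Canary result as an RPC-responsiveness check and the filesystem latency percentiles as a separate disk check. Everything here is an assumption for illustration: the HealthCheck class, the thresholds, and computing percentiles from raw samples with the nearest-rank method. In practice the percentile values would come from the RegionServer's own metrics rather than being recomputed.

```java
import java.util.Arrays;

/** Hypothetical combined check; names and thresholds are illustrative, not HBase APIs. */
final class HealthCheck {

    /**
     * Nearest-rank percentile, with the percentile given in per-mille
     * (990 = 99th, 999 = 99.9th) so the rank is computed in integer arithmetic.
     */
    static long percentile(long[] samplesMillis, int perMille) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (perMille < 1 || perMille > 1000) {
            throw new IllegalArgumentException("perMille must be in [1, 1000]");
        }
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        // rank = ceil(perMille / 1000 * n), at least 1
        int rank = (int) (((long) perMille * sorted.length + 999) / 1000);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Healthy only if the canary RPC was fast enough and both tail latencies are within limits. */
    static boolean healthy(long canaryMillis, long canaryTimeoutMillis,
                           long[] fsLatenciesMillis, long p99LimitMillis, long p999LimitMillis) {
        boolean rpcOk = canaryMillis <= canaryTimeoutMillis;
        boolean diskOk = percentile(fsLatenciesMillis, 990) <= p99LimitMillis
                && percentile(fsLatenciesMillis, 999) <= p999LimitMillis;
        return rpcOk && diskOk;
    }

    public static void main(String[] args) {
        long[] samples = new long[1000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i + 1; // synthetic latencies of 1..1000 ms
        }
        System.out.println("p99 = " + percentile(samples, 990) + " ms");
        System.out.println("p99.9 = " + percentile(samples, 999) + " ms");
        System.out.println("healthy = " + healthy(5, 100, samples, 995, 1000));
    }
}
```

Keeping the two signals separate means a slow disk and an unresponsive RegionServer can be alerted on independently.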
                
> Implement a canary monitoring program
> -------------------------------------
>
>                 Key: HBASE-4393
>                 URL: https://issues.apache.org/jira/browse/HBASE-4393
>             Project: HBase
>          Issue Type: New Feature
>          Components: monitoring
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Matteo Bertozzi
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: Canary-v0.java, HBASE-4393-v0.patch, HBaseCanary.java
>
>
> This JIRA is to implement a standalone program that can be used to do "canary monitoring"
> of a running HBase cluster. This program would gather a list of the regions in the cluster,
> then iterate over them doing lightweight operations (e.g. short scans) to provide metrics about
> latency as well as alert on availability issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
