hbase-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-2629) Piggyback basic "alarm" framework on RS heartbeats
Date Sun, 30 May 2010 02:09:36 GMT
Piggyback basic "alarm" framework on RS heartbeats

                 Key: HBASE-2629
                 URL: https://issues.apache.org/jira/browse/HBASE-2629
             Project: HBase
          Issue Type: New Feature
          Components: master, regionserver
            Reporter: Todd Lipcon

There are a number of system conditions that can cause HBase to perform badly or become
unstable. For example, significant swapping activity or an overloaded ZooKeeper (ZK) ensemble
will result in all kinds of problems.

It would be nice to put a very lightweight "alarm" framework in place, so that when a region
server (RS) notices something is amiss, it can raise an alarm flag for some period of time.
These alarms could be exposed via JMX to external monitoring tools, and also displayed on the
master web UI.

Some example alarms:
- "ZK read took >1000ms"
- "Long garbage collection pause detected"
- "Writes blocked on region for longer than 5 seconds"
etc.
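
A minimal sketch of what "raise an alarm flag for some period of time" could look like on the
RS side. The class name AlarmRegistry and its methods are hypothetical and are not an existing
HBase API; the sketch assumes alarms are identified by a plain string and that the caller
passes in the current time, so expiry can be checked without a background thread.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a time-limited alarm registry; not part of HBase. */
public class AlarmRegistry {
    // Alarm name -> time (in milliseconds) at which the alarm expires.
    private final Map<String, Long> expiryByName = new ConcurrentHashMap<>();

    /**
     * Raise an alarm so it stays active until nowMillis + durationMillis.
     * Raising an already-active alarm keeps whichever expiry is later.
     */
    public void raise(String name, long nowMillis, long durationMillis) {
        expiryByName.merge(name, nowMillis + durationMillis, Long::max);
    }

    /** Return the names of alarms still active at nowMillis, dropping expired ones. */
    public Set<String> active(long nowMillis) {
        expiryByName.values().removeIf(expiry -> expiry <= nowMillis);
        return new TreeSet<>(expiryByName.keySet());
    }
}
```

Under these assumptions, the set returned by active(...) is the small payload an RS might attach
to each heartbeat, and the same set could back a JMX attribute or the master web UI listing.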
