hbase-commits mailing list archives

From apurt...@apache.org
Subject git commit: HBASE-12362 Interim documentation of important master and regionserver metrics
Date Wed, 05 Nov 2014 18:11:10 GMT
Repository: hbase
Updated Branches:
  refs/heads/master e1b82fe91 -> d64ade4fd


HBASE-12362 Interim documentation of important master and regionserver metrics


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/d64ade4f
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/d64ade4f
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/d64ade4f

Branch: refs/heads/master
Commit: d64ade4fde2bb99a290ca4176d63a8ac444231c9
Parents: e1b82fe
Author: Andrew Purtell <apurtell@apache.org>
Authored: Wed Nov 5 10:09:28 2014 -0800
Committer: Andrew Purtell <apurtell@apache.org>
Committed: Wed Nov 5 10:09:28 2014 -0800

----------------------------------------------------------------------
 src/main/docbkx/ops_mgt.xml | 159 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 153 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/d64ade4f/src/main/docbkx/ops_mgt.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/ops_mgt.xml b/src/main/docbkx/ops_mgt.xml
index cd6562f..81be6c7 100644
--- a/src/main/docbkx/ops_mgt.xml
+++ b/src/main/docbkx/ops_mgt.xml
@@ -1122,17 +1122,164 @@ $ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --
         </listitem>
       </itemizedlist>
     </section>
+    <section xml:id="master_metrics">
+      <title>Most Important Master Metrics</title>
+      <para>Note: Counts are usually over the last metrics reporting interval.</para>
+      <variablelist>
+        <varlistentry>
+          <term>hbase.master.numRegionServers</term>
+          <listitem><para>Number of live regionservers</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.master.numDeadRegionServers</term>
+          <listitem><para>Number of dead regionservers</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.master.ritCount </term>
+          <listitem><para>The number of regions in transition</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.master.ritCountOverThreshold</term>
+          <listitem><para>The number of regions that have been in transition longer than
+            a threshold time (default: 60 seconds)</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.master.ritOldestAge</term>
+          <listitem><para>The age of the longest region in transition, in milliseconds
+            </para></listitem>
+        </varlistentry>
+      </variablelist>
+    </section>
     <section xml:id="rs_metrics">
       <title>Most Important RegionServer Metrics</title>
-      <para>Previously, this section contained a list of the most important RegionServer metrics.
-        However, the list was extremely out of date. In some cases, the name of a given metric has
-        changed. In other cases, the metric seems to no longer be exposed. An effort is underway to
-        create automatic documentation for each metric based upon information pulled from its
-        implementation.</para>
+      <para>Note: Counts are usually over the last metrics reporting interval.</para>
+      <variablelist>
+        <varlistentry>
+          <term>hbase.regionserver.regionCount</term>
+          <listitem><para>The number of regions hosted by the regionserver</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.storeFileCount</term>
+          <listitem><para>The number of store files on disk currently managed by the
+            regionserver</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.storeFileSize</term>
+          <listitem><para>Aggregate size of the store files on disk</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.hlogFileCount</term>
+          <listitem><para>The number of write ahead logs not yet archived</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.totalRequestCount</term>
+          <listitem><para>The total number of requests received</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.readRequestCount</term>
+          <listitem><para>The number of read requests received</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.writeRequestCount</term>
+          <listitem><para>The number of write requests received</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.numOpenConnections</term>
+          <listitem><para>The number of open connections at the RPC layer</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.numActiveHandler</term>
+          <listitem><para>The number of RPC handlers actively servicing requests</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.numCallsInGeneralQueue</term>
+          <listitem><para>The number of currently enqueued user requests</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.numCallsInReplicationQueue</term>
+          <listitem><para>The number of currently enqueued operations received from
+            replication</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.numCallsInPriorityQueue</term>
+          <listitem><para>The number of currently enqueued priority (internal housekeeping)
+            requests</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.flushQueueLength</term>
+          <listitem><para>Current depth of the memstore flush queue. If increasing, we are falling
+            behind with clearing memstores out to HDFS.</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.updatesBlockedTime</term>
+          <listitem><para>Number of milliseconds updates have been blocked so the memstore can be
+            flushed</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.compactionQueueLength</term>
+          <listitem><para>Current depth of the compaction request queue. If increasing, we are
+            falling behind with storefile compaction.</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.blockCacheHitCount</term>
+          <listitem><para>The number of block cache hits</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.blockCacheMissCount</term>
+          <listitem><para>The number of block cache misses</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.blockCacheExpressHitPercent </term>
+          <listitem><para>The percent of the time that requests with the cache turned on hit the
+            cache</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.percentFilesLocal</term>
+          <listitem><para>Percent of store file data that can be read from the local DataNode,
+            0-100</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.&lt;op&gt;_&lt;measure&gt;</term>
+          <listitem><para>Operation latencies, where &lt;op&gt; is one of Append, Delete, Mutate,
+            Get, Replay, Increment; and where &lt;measure&gt; is one of min, max, mean, median,
+            75th_percentile, 95th_percentile, 99th_percentile</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.slow&lt;op&gt;Count </term>
+          <listitem><para>The number of operations we thought were slow, where &lt;op&gt; is one
+            of the list above</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.GcTimeMillis</term>
+          <listitem><para>Time spent in garbage collection, in milliseconds</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.GcTimeMillisParNew</term>
+          <listitem><para>Time spent in garbage collection of the young generation, in
+            milliseconds</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.GcTimeMillisConcurrentMarkSweep</term>
+          <listitem><para>Time spent in garbage collection of the old generation, in
+            milliseconds</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.authenticationSuccesses</term>
+          <listitem><para>Number of client connections where authentication succeeded</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.authenticationFailures</term>
+          <listitem><para>Number of client connection authentication failures</para></listitem>
+        </varlistentry>
+        <varlistentry>
+          <term>hbase.regionserver.mutationsWithoutWALCount </term>
+          <listitem><para>Count of writes submitted with a flag indicating they should bypass the
+            write ahead log</para></listitem>
+        </varlistentry>
+      </variablelist>
     </section>
   </section>      
 
-
   <section
     xml:id="ops.monitoring">
     <title>HBase Monitoring</title>
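
Editorial note (not part of the commit): the metric descriptions in the patch suggest two simple checks an operator might run on collected values. The following is a minimal sketch only. It assumes the metrics have already been gathered into plain Python dictionaries and lists keyed by the metric names documented in the patch; how they are collected is not shown, and the helper names block_cache_hit_ratio and is_falling_behind are hypothetical, not part of HBase.

```python
from typing import Mapping, Optional, Sequence

# Metric names as documented in the patch.
HIT_COUNT = "hbase.regionserver.blockCacheHitCount"
MISS_COUNT = "hbase.regionserver.blockCacheMissCount"


def block_cache_hit_ratio(metrics: Mapping[str, float]) -> Optional[float]:
    """Return hits / (hits + misses) for one snapshot, or None if no lookups.

    Assumes both counts cover the same metrics reporting interval.
    """
    hits = metrics.get(HIT_COUNT, 0)
    misses = metrics.get(MISS_COUNT, 0)
    total = hits + misses
    return hits / total if total else None


def is_falling_behind(samples: Sequence[float]) -> bool:
    """Return True if a queue-depth series rises at every consecutive sample.

    Intended for series such as flushQueueLength or compactionQueueLength,
    where the patch notes that an increasing depth means the regionserver
    is falling behind. Strict monotonic increase is a simplifying assumption.
    """
    return len(samples) >= 2 and all(
        later > earlier for earlier, later in zip(samples, samples[1:])
    )
```

For instance, passing the last few sampled values of hbase.regionserver.flushQueueLength to is_falling_behind gives a crude trend signal; a production alert would more likely smooth the series or use a threshold rather than require a strict increase.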

