hbase-dev mailing list archives

From Paul Smith <psm...@aconex.com>
Subject Re: Should HTable.put() return a Future?
Date Wed, 07 Apr 2010 01:10:32 GMT
> 
> Parfait can poll JMX counters, or counters can be invoked direct.  I'm working on a MetricContext
> that exports all HBase and Hadoop JMX counters into Parfait.  The goal is to be able to have
> PCP visualize data more effectively for HBase/Hadoop clusters. To give an example of what
> sort of visualization I'd love to have for HBase & Hadoop see a simple working pic of
> 3d visualisation  at [4] below, that's basic, but imagine a 3D vis of all the HBase region
> servers showing visualizations of Hbase specific metrics, played back in real time, or retrospectively
> at any pace you want.
> 

BTW, we also export all the JVM metrics here: GC activity (rates and time spent, for both
major and minor GCs), class compilations, and memory segment sizes (heap, perm gen, code area,
etc.).
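
As a rough illustration of where those JVM numbers come from, here is a minimal sketch that reads them from the standard java.lang.management MXBeans. It does not use Parfait's own API; the class name JvmMetricSnapshot and the metric key names are made up for this example.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: takes one snapshot of the JVM-level counters.
final class JvmMetricSnapshot {

    static Map<String, Long> collect() {
        Map<String, Long> metrics = new LinkedHashMap<>();

        // One bean per collector (typically a young and an old generation collector).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            metrics.put("gc." + gc.getName() + ".count", gc.getCollectionCount());
            metrics.put("gc." + gc.getName() + ".time_ms", gc.getCollectionTime());
        }

        // One bean per memory segment (heap spaces, code cache, and so on).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // null if the pool is no longer valid
                metrics.put("mem." + pool.getName() + ".used_bytes", usage.getUsed());
            }
        }

        // JIT compilation time; the bean is absent on JVMs without a compiler.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            metrics.put("jit.total_time_ms", jit.getTotalCompilationTime());
        }
        return metrics;
    }

    public static void main(String[] args) {
        collect().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Polling a snapshot like this at a fixed interval and taking the difference between successive values gives the GC rates mentioned above.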

If HBase metrics such as compactions and splits were exported into PCP, one could see their
impact at the hardware level (CPU, virtual memory, disk) alongside JVM-level data (heap sizes
and GC), all correlated with HBase activity.

Parfait can also collect metrics on a per-thread basis (via ThreadLocal), which allows collection
for individual requests.  For example, right now in production we can see this sort of data
in our log files for every request (a Controller/Servlet):

[2010-04-07 11:06:28,569 INFO ][EventMetricCollector][http-2001-Processor85 g7pfur4y][59.167.192.26][228349]
Top	ViewCorrespondenceControl	ViewCorrespondenceControl		Elapsed time: own 3113ms, total 3117ms
Total CPU: own 10ms, total 30ms	User CPU: own 10ms, total 20ms	System CPU: own 10ms, total
10ms	Blocked count: own 0, total 0	Blocked time: own 0ms, total 0ms	Wait count: own 0, total
0	Wait time: own 0ms, total 0ms	Database execution time: own 3050ms, total 3050ms	Database
execution count: own 12, total 12	total Database CPU time: own 0, total 0	Error Pages: own
0, total 1
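
To make the "own" versus "total" figures in that log line concrete, here is a minimal sketch of per-thread collection with a ThreadLocal stack of frames. This is not Parfait's implementation or API; RequestMetrics, enter, add and exit are hypothetical names, and the sketch assumes that "own" means what a step recorded itself and "total" also includes its nested steps.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical per-thread collector; each request thread gets its own stack.
final class RequestMetrics {

    private static final class Frame {
        final String name;
        final Map<String, Long> own = new HashMap<>();   // recorded by this step only
        final Map<String, Long> total = new HashMap<>(); // this step plus nested steps
        Frame(String name) { this.name = name; }
    }

    private static final ThreadLocal<Deque<Frame>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Start a new step (for example a controller, or a call nested inside it).
    static void enter(String name) {
        STACK.get().push(new Frame(name));
    }

    // Record a value against the innermost active step on this thread.
    static void add(String metric, long delta) {
        Frame current = STACK.get().peek();
        if (current == null) return; // nothing is being collected on this thread
        current.own.merge(metric, delta, Long::sum);
        current.total.merge(metric, delta, Long::sum);
    }

    // Finish the innermost step, roll its totals into the parent, return a log line.
    static String exit() {
        Deque<Frame> stack = STACK.get();
        Frame done = stack.pop();
        Frame parent = stack.peek();
        if (parent != null) {
            done.total.forEach((k, v) -> parent.total.merge(k, v, Long::sum));
        }
        StringBuilder line = new StringBuilder(done.name);
        new TreeMap<>(done.total).forEach((k, v) ->
                line.append('\t').append(k)
                    .append(": own ").append(done.own.getOrDefault(k, 0L))
                    .append(", total ").append(v));
        return line.toString();
    }

    public static void main(String[] args) {
        enter("ViewCorrespondenceControl");
        add("Database execution count", 2);
        enter("NestedLookup"); // hypothetical nested step
        add("Database execution count", 10);
        System.out.println(exit());
        System.out.println(exit());
    }
}
```

Because the stack lives in a ThreadLocal, concurrent requests on different threads never see each other's counters, and no locking is needed on the hot path.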

I'd hope that through a similar mechanism I could instrument the HBase Scan costs of a particular
user activity and see how many rows were scanned and how many cell values were picked out
for a single request.  This allows us to zero in quickly on which activity (controller action)
or which users are consuming the most of a given resource, find out why, and fix it.  We
also import our PCP data into a data warehouse for longer-term capacity planning.
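
A minimal sketch of what that scan accounting could look like, kept independent of the HBase client API: a plain Iterator stands in for the scanner, and a Predicate stands in for whatever selects the values of interest. ScanCost and its methods are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Predicate;

// Hypothetical per-thread tally of scan work done on behalf of one request.
final class ScanCost {

    // [0] = rows read over, [1] = values actually picked out.
    private static final ThreadLocal<long[]> TALLY =
            ThreadLocal.withInitial(() -> new long[2]);

    static long rowsRead()     { return TALLY.get()[0]; }
    static long valuesPicked() { return TALLY.get()[1]; }

    // Call at the start of each request so counts do not leak between requests.
    static void reset() { TALLY.remove(); }

    // Walk the rows, counting every row touched and every row that matched.
    static <T> void scan(Iterator<T> rows, Predicate<T> wanted) {
        long[] tally = TALLY.get();
        while (rows.hasNext()) {
            T row = rows.next();
            tally[0]++;
            if (wanted.test(row)) {
                tally[1]++;
            }
        }
    }

    public static void main(String[] args) {
        reset();
        scan(Arrays.asList(1, 2, 3, 4, 5).iterator(), n -> n % 2 == 0);
        System.out.println("rows read: " + rowsRead() + ", values picked: " + valuesPicked());
    }
}
```

A large gap between rows read and values picked for a given controller action is the kind of signal that would point at an inefficient scan.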

anyway, some more ideas to kick around and discuss.

Paul

