hbase-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10436) hbase 0.96+ jmx does not have regionserver info any more.
Date Wed, 29 Jan 2014 20:02:10 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885730#comment-13885730 ]

Jonathan Hsieh commented on HBASE-10436:
----------------------------------------

bq. Why would you want some of the metrics to be accessible through the hadoop metrics system
and not others?  People are using the hadoop sinks to get this information. I'm -1 on not
being consistent.

In an offline conversation, you mentioned that the hadoop/hbase metrics should only have numerical
values (aka, only numerical values are metrics).  In HBase, this is an undocumented invariant
-- it is not in the jira discussion around the HBASE-4050 new metrics umbrella; nor in HBASE-6411,
the patch that actually changed the behavior; nor in the dev list discussion around 10/jul/2012 [1]
when this went in.

I buy the general argument about metrics -- the old master's regionservers data is presented
by jmx but isn't usable by other metrics consumers like ganglia as implemented in 0.94.  
This is because they were organized with non-numerical information.  That part of the newer
metrics regime makes sense.

By not including non-numerical information like the list of live servers and dead servers
in the metrics, the approach I'm taking is consistent with the rule above.  Since your contention
was that this data is not purely numerical (and so is not metrics), I chose an approach that
restores this info by bypassing the metrics system and plugging it directly into the jmx
management interface.  This seems consistent to me.

These lists are completely valid to have in a management interface like JMX.  (If you look,
you'll see java classpaths and command line args lists in other jmx beans.)  Moreover, this
info was presented in the past by the JMX management interface, so its removal is a functional regression.
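
To illustrate the point about non-numerical data in other jmx beans, here is a minimal sketch that reads the JVM's own Runtime bean from the local platform MBean server. It uses only standard JDK classes; the class name JmxListAttributes and the helper read() are made up for this example and have nothing to do with HBase code.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxListAttributes {

    /** Reads one attribute from a bean registered on the local platform MBean server. */
    public static Object read(String beanName, String attribute) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            return server.getAttribute(new ObjectName(beanName), attribute);
        } catch (JMException e) {
            // Wrap the checked JMX exceptions so callers stay simple.
            throw new IllegalStateException("cannot read " + attribute + " from " + beanName, e);
        }
    }

    public static void main(String[] args) {
        String runtimeBean = "java.lang:type=Runtime";
        // InputArguments is exposed through JMX as a String[] (a list, not a number).
        String[] jvmArgs = (String[]) read(runtimeBean, "InputArguments");
        // ClassPath is a plain String.
        String classPath = (String) read(runtimeBean, "ClassPath");
        System.out.println("JVM args:  " + Arrays.toString(jvmArgs));
        System.out.println("Classpath: " + classPath);
    }
}
```

Neither attribute is a metric in the "numerical values only" sense, yet both are ordinary parts of a JMX management interface.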
 

To further solidify the JMX mechanism, I can do another version of the next patch that adds
unit tests to codify JMX interface names and make it more obvious when compatibility is being
broken.
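
A rough sketch of what such a name check could look like, using only standard JDK JMX calls. The class JmxNameCheck and the method missingBeans() are hypothetical names for this example, not part of any patch; a real test would presumably run against a JVM hosting the master and pass in the HBase bean names.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class JmxNameCheck {

    /** Returns the expected bean names that are not registered on the given server. */
    public static List<String> missingBeans(MBeanServer server, List<String> expected) {
        List<String> missing = new ArrayList<>();
        for (String name : expected) {
            try {
                if (!server.isRegistered(new ObjectName(name))) {
                    missing.add(name);
                }
            } catch (MalformedObjectNameException e) {
                // A name that cannot even be parsed is reported as missing.
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: in a real test this list would hold the HBase bean names
        // that are considered part of the compatibility contract.
        List<String> expected = Arrays.asList(
                "java.lang:type=Runtime",
                "example:type=DoesNotExist");
        System.out.println("Missing beans: "
                + missingBeans(ManagementFactory.getPlatformMBeanServer(), expected));
    }
}
```

A unit test would then assert that the returned list is empty, so that renaming or dropping a bean fails the build instead of silently breaking monitoring.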

bq. A ton of work was done to make the metrics regular and all exposed through the new hadoop
metrics2 system.

And we love you for it. :)

bq.  The hadoop metrics system will expose all of this through jmx and will also expose it
to the sinks plugins.  In my opinion if this functionality was really wanted (and I'm not
convinced that it is) it should be as a single source that isn't registered at all unless
a configuration is set. And we should deprecate exposing anything but master metrics through
the master.

Metrics are exposed via JMX, but other things can be exposed via JMX as well. These lists
are in the "other things" category.

I can definitely tell you that "this functionality was really wanted" -- I have a support
customer who upgraded to 0.96.1.1 and is asking for this feature back. :) 

I'm not sure I'm parsing the last two sentences correctly -- are you saying if I added a conf
setting to optionally make the new bean with restored data not show up you'd be ok?

bq. By your definitions the log messages emitted from log4j can't be changed if a monitoring
system is tailing the logs. I think your criteria are way too loose. Monitoring a service
is less important than using a service and so the metrics and things used by monitoring
should have a lower bar.

Logs are meant to be consumed directly by humans, and there is an understanding that they
will change between versions.  Logs, like the web ui, are not interfaces that are meant
to be consumed by machines.  JMX, like a rest interface or code api, is a management interface
and meant to be consumed by machines.  Thus we should ideally maintain compatibility and
feature parity.

bq. But then the unused beans would either need to be removed or worse would emit wrong metrics.
Hence anyone monitoring the beans will need to change their monitoring code. It's way too
onerous to expect that everything that exposes a metric must always expose that metric.

The unused bean shouldn't be present.  I haven't dug into the hadoop metrics part in depth
yet but I would think that if you chose different implementations of hlogs or memstores certain
beans would show up and others would not.  To change this today we'd need to restart the server.
(e.g. we have a WAL metric bean, but maybe we'd have a multiwal bean -- which would potentially
need a RS restart to activate).

bq. Every RegionServer already exposes all of this information.

How do I get the list of regionservers?  That list should at the least be exposed so that
a program that points to the master's jmx can go to the regionservers, no?  It's sort of the
point of the management interface to provide this info.

Is the contention that the rs list also contains metrics and other numerical things?  Would
removing the numbers be sufficient? (just having the list of regionservers and regions)

[1] http://mail-archives.apache.org/mod_mbox/hbase-dev/201207.mbox/thread

> hbase 0.96+ jmx does not have regionserver info any more.
> ---------------------------------------------------------
>
>                 Key: HBASE-10436
>                 URL: https://issues.apache.org/jira/browse/HBASE-10436
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0, 0.96.0, 0.99.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>            Priority: Critical
>         Attachments: hbase-10436.patch, hbase-10436.v2.patch
>
>
> HBase 0.96's refactored jmx beans do not contain the master's list of dead region servers
and live regionservers with load info.  HBase 0.94 did (though in a single monolithic blob).
 
> This JMX interface should be considered as much of an API as the normal wire or java
api.  Dropping values from this was done without deprecation and the removal of this information
is a functional regression.
> We should provide the information in the 0.96+ JMX.  HBase 0.94 had a monolithic JMX
blob ("hadoop:service=Master,name=Master") that contained a lot of information, including
the regionserver list and the cached regionserver load for each region found on the master
webpage.  0.96+ refactored this into several jmx beans which could be selectively retrieved.
 These include:
> * hadoop:service=HBase,name=Master,sub=AssignmentManager
> * hadoop:service=HBase,name=Master,sub=Balancer
> * hadoop:service=HBase,name=Master,sub=Server
> * hadoop:service=HBase,name=Master,sub=FileSystem
> Specifically, the (Hadoop:service=HBase,name=Master,sub=Server) listing that used to contain
regionservers and deadregionservers in jmx was replaced with numRegionServers and numDeadRegionservers,
which only contain counts.
> I propose just adding another mbean called "RegionServers" under the bean: "hadoop:service=HBase,name=Master,sub=RegionServers"
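
As a rough illustration of the proposal above (not the attached patch), the sketch below registers a bean under the proposed name directly on the platform MBean server, bypassing any metrics system, and reads a list attribute back. The interface RegionServersMXBean, its attribute names, and the sample server name are all assumptions made up for this example; only the ObjectName string comes from the issue description.

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class RegionServersBeanSketch {

    // Bean name as proposed in the issue description.
    static final String BEAN_NAME = "hadoop:service=HBase,name=Master,sub=RegionServers";

    /** Hypothetical management interface: lists of server names, no numeric metrics. */
    public interface RegionServersMXBean {
        String[] getLiveRegionServers();
        String[] getDeadRegionServers();
    }

    /** Trivial implementation backed by fixed arrays (a real master would supply live data). */
    public static class RegionServers implements RegionServersMXBean {
        private final String[] live;
        private final String[] dead;

        public RegionServers(String[] live, String[] dead) {
            this.live = live.clone();
            this.dead = dead.clone();
        }

        @Override public String[] getLiveRegionServers() { return live.clone(); }
        @Override public String[] getDeadRegionServers() { return dead.clone(); }
    }

    /** Registers the bean, reads one list attribute through JMX, then unregisters it. */
    public static String[] registerAndRead(String[] live, String[] dead, String attribute) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            ObjectName name = new ObjectName(BEAN_NAME);
            server.registerMBean(new RegionServers(live, dead), name);
            try {
                return (String[]) server.getAttribute(name, attribute);
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException("JMX round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up server name, for illustration only.
        String[] live = {"rs1.example.com,60020,1391025730000"};
        String[] read = registerAndRead(live, new String[0], "LiveRegionServers");
        System.out.println(Arrays.toString(read));
    }
}
```

A monitoring program pointed at the master's jmx could read such an attribute to discover which regionservers to visit next.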



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
