hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11747) ClusterStatus is too bulky
Date Wed, 01 Jul 2015 23:42:05 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611189#comment-14611189 ]

stack commented on HBASE-11747:

bq. JMX isn't a transport for messages, is it?

No. JMX is generally for management. HBase uses it to publish server attributes and metrics.
HBase also puts up a JMX bean server so you can query the beans over the network. That mechanism
uses Java's crazy RMI, which is mostly unusable by systems other than Java and, even then, has
a ping-pong random-port mechanism that requires open port ranges. The nice thing about Jolokia
(https://jolokia.org/) is that it REST/JSON-ifies our JMX, making it more palatable to more systems.
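
To illustrate what "REST/JSON-ified JMX" looks like from a client, here is a minimal sketch that builds a Jolokia GET "read" URL and, optionally, fetches it. The base URL (http://localhost:8778/jolokia) and the class and method names are assumptions for illustration only; they are not HBase defaults. The MBean used is the JVM's standard java.lang:type=Memory rather than any HBase bean.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of HBase or Jolokia.
final class JolokiaReadSketch {

    // Builds a Jolokia GET read URL of the form <base>/read/<mbean>/<attribute>.
    // MBean names containing '/' would need Jolokia's '!' escaping, which this
    // sketch does not attempt.
    static URI readUri(String baseUrl, String mbean, String attribute) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + "/read/" + mbean + "/" + attribute);
    }

    // Plain HTTP GET; the response body is a JSON document describing the attribute.
    static String fetch(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed agent address; pass any argument to actually perform the request.
        URI uri = readUri("http://localhost:8778/jolokia",
                "java.lang:type=Memory", "HeapMemoryUsage");
        System.out.println(uri);
        if (args.length > 0) {
            System.out.println(fetch(uri));
        }
    }
}
```

No RMI and no random port ranges are involved: any system that can issue an HTTP GET and parse JSON can read the bean.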

bq. Could you describe JMX approach?

ClusterStatus is made of various attributes, including a ServerLoad for every node in the cluster.
ServerLoad is not actually server load. Rather, it is a dumping ground for all and sundry,
including server attributes, configuration, and metrics. Redoing ServerLoad so it is just
load vitals would be nice to have, so we don't flood the master once a second as all servers
report in with fat messages on their heartbeats. Server metrics are also published out
of our metrics system. They are published variously: as text in a servlet and as JMX
beans available on each server (JMX is on a period IIRC; the servlet is polled). That we dump
out our metrics on a period via JMX and then go and collect them all again to put
on a heartbeat is silly. It would be nice to refactor. Slimming ServerLoad would
help here, given that we make one for each server and insert it into ClusterStatus.
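
As a rough illustration of the point that metrics already published as JMX beans need not be re-collected onto a heartbeat, the sketch below publishes a counter as a standard MBean on the platform MBean server and reads it back by name. The bean interface, class names, and the object name "example.sketch:type=RequestMetrics" are all hypothetical; they are not HBase's actual metrics beans.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch, not HBase code.
final class MetricsBeanSketch {

    // Standard MBean management interface: it must be public and named <Class>MBean.
    public interface RequestMetricsMBean {
        long getRequestCount();
    }

    // The metric source; a server would update this as it handles requests.
    public static final class RequestMetrics implements RequestMetricsMBean {
        private final AtomicLong requests = new AtomicLong();

        void recordRequest() {
            requests.incrementAndGet();
        }

        @Override
        public long getRequestCount() {
            return requests.get();
        }
    }

    // Registers the bean, records some requests, then reads the attribute
    // through the MBean server, the way a JMX client or agent would.
    static long publishAndRead(String name, int requestsToRecord) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            ObjectName objectName = new ObjectName(name);
            RequestMetrics metrics = new RequestMetrics();
            server.registerMBean(metrics, objectName);
            try {
                for (int i = 0; i < requestsToRecord; i++) {
                    metrics.recordRequest();
                }
                return (Long) server.getAttribute(objectName, "RequestCount");
            } finally {
                server.unregisterMBean(objectName);
            }
        } catch (JMException e) {
            throw new IllegalStateException("JMX operation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead("example.sketch:type=RequestMetrics", 3));
    }
}
```

Anything that wants the metric asks the bean for it; nothing here has to ride along on a heartbeat message.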

That, at a high level, was what I was thinking. It is a separate issue, I'd say: a background
consideration when addressing this one.

bq. Question - how would this new RPC overlap with metrics functionality?

Was thinking they'd be distinct. If you want metrics, use our metrics system; we are publishing
our metrics per server anyways.

> ClusterStatus is too bulky 
> ---------------------------
>                 Key: HBASE-11747
>                 URL: https://issues.apache.org/jira/browse/HBASE-11747
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Virag Kothari
>         Attachments: exceptiontrace
> Following exception on 0.98 with 1M regions on cluster with 160 region servers
> {code}
> Caused by: java.io.IOException: Call to regionserverhost:port failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
> 	at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1482)
> 	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1454)
> 	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
> 	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
> 	at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.getClusterStatus(MasterProtos.java:42555)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.getClusterStatus(HConnectionManager.java:2132)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2166)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin$16.call(HBaseAdmin.java:2162)
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
> 	... 43 more
> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
> 	at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> {code}

This message was sent by Atlassian JIRA
