hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12911) Client-side metrics
Date Tue, 08 Sep 2015 17:09:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735178#comment-14735178 ]

Nick Dimiduk commented on HBASE-12911:
--------------------------------------

bq. Hard to tell what connection is hosted where and where it is connected. Is that fixable?

I was considering logging the connection creation stack trace as a tag on the bean, similar
to [~chenheng]'s investigation on HBASE-14361. Dunno if that's helpful to commit though.
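Roughly the kind of thing I mean (a sketch only; the class name and attribute below are made up, not what's in the patch):

{code:java}
/**
 * Hypothetical sketch, not the patch: capture the stack trace at connection-creation
 * time so it can be exposed as a read-only attribute/tag on the connection's bean.
 */
public class ConnectionCreationTrace {
  private final String creationStackTrace;

  public ConnectionCreationTrace() {
    StringBuilder sb = new StringBuilder();
    for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
      sb.append("  at ").append(frame).append('\n');
    }
    this.creationStackTrace = sb.toString();
  }

  /** A getter like this would let jconsole (or a collector) see where the connection was created. */
  public String getCreationStackTrace() {
    return creationStackTrace;
  }
}
{code}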

Okay, we'll keep connection-level aggregation.

bq. If slow query though, how you find it? 

I was imagining one could set alerting based on the aggregate 95pct latency. I.e., if the 95pct
of RPCs to any individual server trends drastically higher than the aggregate, I'd want to
know about it. [~phobos182], [~toffer], [~clayb], [~eclark] is this crazy thinking?
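To make that concrete, the check I have in mind is something like the following (a sketch assuming dropwizard-style histograms; the metric handles and the 3x threshold are invented, not the patch's names):

{code:java}
import com.codahale.metrics.Histogram;

/** Illustrative only: flag a server whose p95 RPC latency runs well above the aggregate. */
public class SlowServerCheck {
  static boolean serverLooksSlow(Histogram perServerRpcLatency, Histogram aggregateRpcLatency) {
    double serverP95 = perServerRpcLatency.getSnapshot().get95thPercentile();
    double overallP95 = aggregateRpcLatency.getSnapshot().get95thPercentile();
    // Alert when one server's 95th percentile is, say, 3x the aggregate's.
    return serverP95 > 3 * overallP95;
  }
}
{code}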

Do we like the individual connection objects exposed separately? I was thinking of applications
like the (I think it was [~malaskat]'s) multi-cluster client-side failover patch, where you'd
be embedding multiple connection instances in an application and want to see their behaviors
separately. Hence Connection objects are reported by their objectId (I assume this is stable in
Java?). Maybe this is an uncommon case and supporting it makes this feature harder to consume
for everyone else? Those objectIds change all the time, so parsing them, for instance for
an OpenTSDB tcollector, may be annoying.

Then again, we don't provide a /jmx over http from the clients, so there's not an easy way
for tcollector to grab these as they are, unless it supports raw jmx too.
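For reference, a collector that does speak raw JMX could poll the client JVM with something like this (the domain and key pattern below are guesses, not the ObjectNames the patch actually registers):

{code:java}
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Sketch: enumerate per-connection beans in-process, since there is no /jmx servlet on the client. */
public class DumpClientBeans {
  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    // Hypothetical ObjectName pattern; one bean per Connection instance.
    Set<ObjectName> beans = server.queryNames(new ObjectName("hbase.client:type=Connection,*"), null);
    for (ObjectName bean : beans) {
      System.out.println(bean);
      for (MBeanAttributeInfo attr : server.getMBeanInfo(bean).getAttributes()) {
        System.out.println("  " + attr.getName() + " = " + server.getAttribute(bean, attr.getName()));
      }
    }
  }
}
{code}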

> Client-side metrics
> -------------------
>
>                 Key: HBASE-12911
>                 URL: https://issues.apache.org/jira/browse/HBASE-12911
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client, Operability, Performance
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch,
0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, am.jpg,
client metrics RS-Master.jpg, client metrics client.jpg, conn_agg.jpg, connection attributes.jpg,
ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the HBase client. Folks who care to add some kind
of metrics collection end up wrapping Table method invocations with {{System.currentTimeMillis()}}.
For a crude example of this, have a look at what I did in {{PerformanceEvaluation}} for exposing
request latencies up to {{IntegrationTestRegionReplicaPerf}}. The client is quite complex;
there's a lot going on under the hood that is impossible to see right now without a profiler.
The client is a crucial part of the performance of this distributed system, so we should have
deeper visibility into how it functions.
> I'm not sure that wiring into the Hadoop metrics system is the right choice because the
client is often embedded as a library in a user's application. We should have integration
with our metrics tools so that, e.g., a client embedded in a coprocessor can report metrics
through the usual RS channels, or a client used in an MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out of the
box we'd include a hadoop-metrics implementation and one other, possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?
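
A purely illustrative reading of the interface-based proposal above (the interface, method, and implementation below are invented for this sketch, not anything the patch defines):

{code:java}
import com.codahale.metrics.MetricRegistry;

/** Invented for illustration: a pluggable reporter along the lines of the description. */
interface ClientMetricsReporter {
  /** Record one RPC's round-trip latency, in milliseconds, against a named server. */
  void updateRpcLatency(String serverName, long latencyMs);
}

/** One out-of-the-box implementation could forward to dropwizard/metrics. */
class DropwizardClientMetricsReporter implements ClientMetricsReporter {
  private final MetricRegistry registry = new MetricRegistry();

  @Override
  public void updateRpcLatency(String serverName, long latencyMs) {
    registry.histogram("rpc.latency." + serverName).update(latencyMs);
  }
}
{code}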



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
