hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-14869) Better request latency histograms
Date Thu, 26 Nov 2015 05:46:11 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-14869:
----------------------------------
    Attachment: 14869-v1-0.98.txt

Here's a patch that I have tested a bit; it reports the right values.
Changes:
* Renamed "bands" to "ranges".
* Does not report a range for which it hasn't seen a value. That allows us to use this
for operations that take a very short (a few ms) or a very long (many minutes) time, without
reporting time ranges that make no sense for the operation.

Question: How should I name these metrics? Presumably they'd be processed mostly by software,
and I have to give them some name.
I chose: "metricname"_start-end, e.g. "Get_0-1", "Get_1-3", "Get_10-30", etc., and "Get_>600000"
for the open-ended last range.
Any better ideas?
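
To make the naming scheme and the "skip empty ranges" behavior concrete, here is a minimal sketch in Java. It is not the code from the attached patch: the class name LatencyRanges, its methods, and the choice to treat each upper bound as inclusive are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical sketch of per-range request counters; not the patch itself. */
public class LatencyRanges {
    private final String name;            // metric name prefix, e.g. "Get"
    private final long[] bounds;          // ascending upper bounds in ms (assumed inclusive)
    private final AtomicLongArray counts; // one extra slot for values above the last bound

    public LatencyRanges(String name, long... bounds) {
        this.name = name;
        this.bounds = bounds.clone();
        this.counts = new AtomicLongArray(bounds.length + 1);
    }

    /** Counts one request in the first range whose upper bound is >= latencyMs. */
    public void add(long latencyMs) {
        int i = 0;
        while (i < bounds.length && latencyMs > bounds[i]) {
            i++;
        }
        counts.incrementAndGet(i);
    }

    /** Returns "name_start-end" -> count, omitting ranges that have seen no value. */
    public Map<String, Long> snapshot() {
        Map<String, Long> out = new LinkedHashMap<>();
        long start = 0;
        for (int i = 0; i <= bounds.length; i++) {
            long c = counts.get(i);
            if (c > 0) {
                String key = i < bounds.length
                    ? name + "_" + start + "-" + bounds[i]
                    : name + "_>" + start; // open-ended last range
                out.put(key, c);
            }
            if (i < bounds.length) {
                start = bounds[i];
            }
        }
        return out;
    }
}
```

With bounds 1, 3, 10, 30, recording latencies of 0, 2, 3, 25 and 1000 ms would yield only the keys Get_0-1, Get_1-3, Get_10-30 and Get_>30; the empty 3-10 range is not reported.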


> Better request latency histograms
> ---------------------------------
>
>                 Key: HBASE-14869
>                 URL: https://issues.apache.org/jira/browse/HBASE-14869
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>         Attachments: 14869-test-0.98.txt, 14869-v1-0.98.txt
>
>
> I just discussed this with a colleague.
> The get, put, etc., histograms that each region server keeps are somewhat useless (depending
> on what you want to achieve, of course), as they are aggregated and calculated by each region
> server.
> It would be better to record the number of requests in certain latency bands in addition
> to what we do now.
> For example, the number of gets that took 0-5ms, 6-10ms, 10-20ms, 20-50ms, 50-100ms,
> 100-1000ms, >1000ms, etc. (just as an example; this should be configurable).
> That way we can do further calculations after the fact, and answer questions like: How
> often did we miss our SLA? What percentage of requests missed an SLA? Etc.
> Comments?
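
As an illustration of the "calculations after the fact" mentioned in the description, the following Java sketch derives the fraction of requests that missed an SLA from per-range counts. The class SlaMiss, its method, and the inclusive-upper-bound convention are hypothetical and not part of HBase. The result is exact only when the SLA threshold coincides with a range boundary; otherwise requests in the range that straddles the threshold are not counted as misses.

```java
/** Hypothetical helper; assumes range i covers (previous bound, upperBoundsMs[i]]. */
public class SlaMiss {
    /**
     * @param upperBoundsMs ascending upper bounds of the ranges, in ms
     * @param counts        request counts per range, with one extra trailing
     *                      element for requests above the last bound
     * @param slaMs         SLA threshold in ms
     * @return fraction of requests in ranges that lie entirely above slaMs
     */
    public static double missFraction(long[] upperBoundsMs, long[] counts, long slaMs) {
        long total = 0;
        long missed = 0;
        for (int i = 0; i < counts.length; i++) {
            total += counts[i];
            // The lower bound of range i is the upper bound of the previous range.
            long lower = (i == 0) ? 0 : upperBoundsMs[i - 1];
            if (lower >= slaMs) {
                missed += counts[i];
            }
        }
        return total == 0 ? 0.0 : (double) missed / total;
    }
}
```

Because the inputs are plain counts, they can be summed across region servers before calling the method, which is not possible with percentiles that each server has already computed.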



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
