hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring
Date Wed, 02 Mar 2016 11:41:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175475#comment-15175475 ]

Yu Li commented on HBASE-15160:
-------------------------------

Thanks for checking this [~enis]

bq. Yu Li did you see Elliott's review comment above. Using histograms that we use elsewhere is the correct way to go.
Yes, I already made that change in the latest patch. Please allow me to quote my earlier description of that patch:
{quote}
Update the patch to avoid missing spikes. Changes include:
1. Instead of getting and resetting a single latency point, we now collect the latencies of all operations that happened during a collection interval and insert them into the histogram in one go.
2. Since the histogram is updated at a different time than the latencies are recorded, we introduce another gauge to show the average latency of the collection interval.
3. With these two metrics, we can get a quick overview from the latency gauge and check the histogram for any spike within the interval.
{quote}
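
The approach described in the quote above can be sketched roughly as follows. This is only an illustration and not the code from the attached patches: the class LatencySampler, its method names, and the fixed-bucket stand-in histogram are all hypothetical, and latencies are assumed to be in milliseconds. I/O threads record every latency, and a periodic collector drains them all into the histogram and refreshes the average gauge for that interval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch; not the HBASE-15160 patch and not an HBase API. */
class LatencySampler {
    // Latencies recorded by I/O threads since the last collection.
    private final ConcurrentLinkedQueue<Long> pending = new ConcurrentLinkedQueue<>();

    // Stand-in histogram: one counter per upper bound (ms), plus an overflow bucket.
    private final long[] boundsMs = {1, 10, 100, 1000};
    private final long[] bucketCounts = new long[boundsMs.length + 1];

    // Gauge: average latency (ms) of the most recent collection interval.
    private volatile double lastIntervalAvgMs = 0.0;

    /** Called on the I/O path for every operation, so no sample is dropped. */
    void record(long latencyMs) {
        pending.add(latencyMs);
    }

    /** Called once per collection interval; returns how many samples were drained. */
    synchronized int collect() {
        List<Long> drained = new ArrayList<>();
        Long v;
        while ((v = pending.poll()) != null) {
            drained.add(v);
        }
        long sum = 0;
        for (long latency : drained) {
            sum += latency;
            bucketCounts[bucketFor(latency)]++; // every sample reaches the histogram
        }
        lastIntervalAvgMs = drained.isEmpty() ? 0.0 : (double) sum / drained.size();
        return drained.size();
    }

    // Index of the first bucket whose upper bound covers the latency.
    private int bucketFor(long latencyMs) {
        for (int i = 0; i < boundsMs.length; i++) {
            if (latencyMs <= boundsMs[i]) {
                return i;
            }
        }
        return boundsMs.length; // overflow bucket
    }

    double getLastIntervalAvgMs() {
        return lastIntervalAvgMs;
    }

    synchronized long getBucketCount(int index) {
        return bucketCounts[index];
    }
}
```

With samples of 2, 4 and 600 ms in one interval, the gauge alone would show an average of 202 ms, while the histogram still shows two fast operations and one slow one, which is the spike the gauge would otherwise hide.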

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-15160
>                 URL: https://issues.apache.org/jira/browse/HBASE-15160
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0, 1.1.2
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, HBASE-15160_v3.patch
>
>
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, fsPreadLatency
> and fsWriteLatency, was removed. There was some discussion about putting it back in a new
> JIRA, but that never happened. In our experience, these metrics are useful for judging
> whether the problem lies in HDFS when a slow request occurs, so we propose to put them back
> in this JIRA and add metrics for monitoring as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
