hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring
Date Fri, 02 Jun 2017 18:39:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035209#comment-16035209 ]

Enis Soztutar commented on HBASE-15160:

Good finding. The histogram is supposed to adjust itself every time {{snapshotAndReset()}}
is called, which happens once per collection interval (every 10 secs, for example). Maybe this
mechanism is not working as well as it should. Anyway, we can pursue this further in HBASE-18151.
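
To illustrate the per-interval reset pattern described above, here is a toy sketch. It is not HBase's histogram: the class name, the fields, and the min/max/count bookkeeping are all hypothetical, and it does not model any bin-range adjustment. It only shows a collector whose {{snapshotAndReset()}} hands back the values seen during the interval and starts the next interval from scratch.

```java
public class IntervalHistogramSketch {
    /** Immutable view of one collection interval (hypothetical type). */
    public static final class Snapshot {
        public final long count;
        public final long min;
        public final long max;

        Snapshot(long count, long min, long max) {
            this.count = count;
            this.min = min;
            this.max = max;
        }
    }

    private long count = 0;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Records one observed value in the current interval. */
    public synchronized void update(long value) {
        count++;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    /** Returns what was seen since the last call, then clears all state. */
    public synchronized Snapshot snapshotAndReset() {
        Snapshot snapshot = new Snapshot(count, min, max);
        count = 0;
        min = Long.MAX_VALUE;
        max = Long.MIN_VALUE;
        return snapshot;
    }

    public static void main(String[] args) {
        IntervalHistogramSketch histogram = new IntervalHistogramSketch();
        histogram.update(5);
        histogram.update(12);
        Snapshot first = histogram.snapshotAndReset();
        System.out.println("count=" + first.count + " min=" + first.min + " max=" + first.max);
        // The next interval starts empty, so its count is 0 until new values arrive.
        System.out.println("count after reset=" + histogram.snapshotAndReset().count);
    }
}
```

In a real metrics system the collection thread would call {{snapshotAndReset()}} once per interval; how the actual histogram sizes its bins from the previous interval is the part under discussion in HBASE-18151 and is not reproduced here.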

Using millis is fine for now, until HBASE-18151 is fixed. We typically use nanos in newer code
for finer granularity, but in this specific instance it is better to have millis
metrics than no metrics at all. Let me commit the v7 patch with a small renaming of latencyNanos
to latencyMillis.
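
As a rough sketch of the nanos-versus-millis trade-off mentioned above (not the code in the v7 patch): the helper below times an operation with the monotonic nanosecond clock and reports the result in milliseconds. The class name, the method names, and the {{LongConsumer}} sink are assumptions made for illustration; only the name latencyMillis comes from this comment.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

public class LatencyMillisSketch {
    /** Converts an elapsed nanosecond interval to whole milliseconds (truncating). */
    public static long toMillis(long elapsedNanos) {
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }

    /** Times op with System.nanoTime() and passes the latency in millis to sink. */
    public static void timeAndReport(Runnable op, LongConsumer sink) {
        long startNanos = System.nanoTime();
        op.run();
        long latencyMillis = toMillis(System.nanoTime() - startNanos);
        sink.accept(latencyMillis);
    }

    public static void main(String[] args) {
        // The empty lambda stands in for a filesystem read; the sink stands in
        // for a metrics update call.
        timeAndReport(() -> { },
            latencyMillis -> System.out.println("latencyMillis=" + latencyMillis));
    }
}
```

Note the cost of the coarser unit: any operation that finishes in under one millisecond is reported as 0, which is the granularity loss being accepted here.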

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -----------------------------------------------------------------------------
>                 Key: HBASE-15160
>                 URL: https://issues.apache.org/jira/browse/HBASE-15160
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0, 1.1.2
>            Reporter: Yu Li
>            Assignee: Yu Li
>            Priority: Critical
>         Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, HBASE-15160_v3.patch, hbase-15160_v4.patch,
hbase-15160_v5.patch, hbase-15160_v6.patch, hbase-15160_v7.patch
> In HBASE-11586, all HDFS op latency sampling code, including fsReadLatency, fsPreadLatency
and fsWriteLatency, was removed. There was some discussion about putting it back in
a new JIRA, but that never happened. In our experience, these metrics are useful for judging
whether an issue lies in HDFS when slow requests occur, so we propose to put them back in this
JIRA and to add metrics for monitoring as well.

This message was sent by Atlassian JIRA
