hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring
Date Fri, 26 May 2017 00:24:04 GMT

     [ https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-15160:
    Attachment: hbase-15160_v6.patch

[~carp84] how about this patch?
I've removed the extra counters and now pass a boolean down from the
getMetaBlock() function so that metrics are not updated for meta blocks.
We cannot move the timing and metric updates up the stack because the
callers of readBlock() do not know whether the returned block was read from disk or came
from cache. Is it easy enough for you to replicate the YCSB tests? I've done some basic testing
and did not find a meaningful perf regression.
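
For readers following along, the idea can be sketched roughly as below. This is only an illustration, not the code in the attached patch: the class ReadLatencySketch, the ReadMetrics holder, the HashMap-backed cache, and readFromDisk() are hypothetical stand-ins, and the real readBlock() and getMetaBlock() in HBase take different parameters.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class ReadLatencySketch {
    /** Hypothetical metrics holder; not an HBase class. */
    static final class ReadMetrics {
        final AtomicLong fsReadCount = new AtomicLong();
        final AtomicLong fsReadNanos = new AtomicLong();

        void update(long nanos) {
            fsReadCount.incrementAndGet();
            fsReadNanos.addAndGet(nanos);
        }
    }

    // Hypothetical stand-in for the block cache, keyed by block offset.
    private final Map<Long, byte[]> cache = new HashMap<>();
    final ReadMetrics metrics = new ReadMetrics();

    /**
     * Only this method knows whether the block came from cache or from disk,
     * so the timing has to live here rather than in the callers.
     */
    byte[] readBlock(long offset, boolean updateMetrics) {
        byte[] cached = cache.get(offset);
        if (cached != null) {
            return cached; // cache hit: no file system read, nothing to time
        }
        long start = System.nanoTime();
        byte[] block = readFromDisk(offset);
        long elapsed = System.nanoTime() - start;
        if (updateMetrics) {
            metrics.update(elapsed); // record latency for real disk reads only
        }
        cache.put(offset, block);
        return block;
    }

    /** Meta block reads pass false so they do not touch the latency metrics. */
    byte[] getMetaBlock(long offset) {
        return readBlock(offset, false);
    }

    // Hypothetical placeholder for the actual HDFS read.
    private byte[] readFromDisk(long offset) {
        return new byte[] {(byte) offset};
    }
}
```

In this sketch a caller only ever sees the returned byte[], which is why it could not decide on its own whether a latency sample should be recorded.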

BTW, these metrics would have saved us days' worth of debugging in a recent case, so let's
get this patch in one way or another.

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -----------------------------------------------------------------------------
>                 Key: HBASE-15160
>                 URL: https://issues.apache.org/jira/browse/HBASE-15160
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0, 1.1.2
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, HBASE-15160_v3.patch, hbase-15160_v4.patch, hbase-15160_v5.patch, hbase-15160_v6.patch
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, fsPreadLatency
> and fsWriteLatency, was removed. There was some discussion about putting it back in
> a new JIRA, but that never happened. In our experience, these metrics are useful for judging
> whether an issue lies in HDFS when a slow request occurs, so we propose to put them back in this
> JIRA and to add metrics for monitoring as well.

This message was sent by Atlassian JIRA
