hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15160) Put back HFile's HDFS op latency sampling code and add metrics for monitoring
Date Tue, 30 May 2017 20:11:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030045#comment-16030045 ]

Enis Soztutar commented on HBASE-15160:

bq. Previously the concern about readAtOffset completely made sense, but HBASE-17917 has removed
the stream lock so no more stream read when pread is true, which makes it possible to move
the updating of the metrics up to the caller (smile).
Agreed that with the stream lock gone, the caller can always know whether a read was a pread
or not. However, why do you think that updating the metrics should be pulled
up the stack? Since there are no other synchronization points, the two placements are equal in
terms of cost. The reason I wanted the update pushed down the stack is that in some cases (for
example, a checksum failure) we do two reads transparently to the caller. Metrics pulled up the
stack will be slightly incorrect when things like this happen. Also, I want to backport this to
branch-1, so keeping the metrics update here should give us better portability of future patches.
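
To illustrate the checksum-failure case, here is a minimal sketch. It is not the HBase code: the class, the LatencyMetric type, and the simulated failure are all hypothetical, and only the name readAtOffset is borrowed from the discussion. It compares a metric updated inside the low-level read with one updated by the caller when a read is retried transparently.

```java
import java.util.ArrayList;
import java.util.List;

public class ReadLatencySketch {
    // Hypothetical stand-in for a latency histogram; not an HBase class.
    static final class LatencyMetric {
        private final List<Long> samplesNanos = new ArrayList<>();
        void update(long nanos) { samplesNanos.add(nanos); }
        int count() { return samplesNanos.size(); }
    }

    static final LatencyMetric lowLevel = new LatencyMetric();    // updated next to the actual read
    static final LatencyMetric callerLevel = new LatencyMetric(); // updated up the stack
    static int attempts = 0;

    // Hypothetical low-level read: every call models one filesystem read.
    static boolean readAtOffset(boolean pread) {
        long start = System.nanoTime();
        attempts++;
        boolean checksumOk = attempts > 1; // assumption: first attempt fails its checksum
        lowLevel.update(System.nanoTime() - start);
        return checksumOk;
    }

    // Hypothetical block read that retries once, transparently to its caller.
    static void readBlock(boolean pread) {
        if (!readAtOffset(pread)) {
            readAtOffset(pread); // second read after the checksum failure
        }
    }

    // Caller that times the whole operation as if it were a single read.
    static void callerRead(boolean pread) {
        long start = System.nanoTime();
        readBlock(pread);
        callerLevel.update(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        callerRead(true);
        System.out.println("low-level samples: " + lowLevel.count());       // 2
        System.out.println("caller-level samples: " + callerLevel.count()); // 1
    }
}
```

In this sketch the low-level metric records both reads, while the caller-level metric records one sample that folds the two reads together, which is the slight inaccuracy mentioned above.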

> Put back HFile's HDFS op latency sampling code and add metrics for monitoring
> -----------------------------------------------------------------------------
>                 Key: HBASE-15160
>                 URL: https://issues.apache.org/jira/browse/HBASE-15160
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0, 1.1.2
>            Reporter: Yu Li
>            Assignee: Yu Li
>            Priority: Critical
>         Attachments: HBASE-15160.patch, HBASE-15160_v2.patch, HBASE-15160_v3.patch, hbase-15160_v4.patch,
hbase-15160_v5.patch, hbase-15160_v6.patch
> In HBASE-11586 all HDFS op latency sampling code, including fsReadLatency, fsPreadLatency
and fsWriteLatency, was removed. There was some discussion about putting it back in
a new JIRA, but that never happened. In our experience, these metrics are useful for judging
whether an issue lies in HDFS when slow requests occur, so we propose to put them back in this
JIRA, and to add metrics for monitoring as well.

This message was sent by Atlassian JIRA