hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-11789) Maintain Short-Circuit Read Statistics
Date Tue, 13 Jun 2017 23:36:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048517#comment-16048517
] 

Arpit Agarwal edited comment on HDFS-11789 at 6/13/17 11:35 PM:
----------------------------------------------------------------

Thanks for updating the patch [~hanishakoneru]. A few comments:
# Typos in {{SHORT_CIRCUIT_READ_LATECNY_METRIC_REGISTERD_NAME}}.
# The test-only method {{getShortCircuitReadRollingAverages}} should be tagged with {{\@VisibleForTesting}}.
Also it should be package-private if possible.
# I think we can eliminate the METRICS_SAMPLING_PERCENTAGE_KEY config key. We can enable the
metric when the sampling percentage is > 0.
# Let's avoid static initialization of {{BlockReaderLocal#metrics}}. One issue is that adding
the overhead of a MutableRollingAverages object (and its rates roller thread) for all clients,
most of which may will never enable SCR statistics. I think your original approach of initializing
it within a lock on construction was fine, the overhead of that lock compared to the overhead
of cloning a file descriptor via system calls should be minimal.

Still reviewing the tests.


was (Author: arpitagarwal):
Thanks for updating the patch [~hanishakoneru]. A few comments:
# Typos in {{SHORT_CIRCUIT_READ_LATECNY_METRIC_REGISTERD_NAME}}.
# The test-only method {{getShortCircuitReadRollingAverages}} should be tagged with {{\@VisibleForTesting}}.
# I think we can eliminate the METRICS_SAMPLING_PERCENTAGE_KEY config key. We can enable the
metric when the sampling percentage is > 0.
# Let's avoid static initialization of {{BlockReaderLocal#metrics}}. One issue is that adding
the overhead of a MutableRollingAverages object (and its rates roller thread) for all clients,
most of which may will never enable SCR statistics. I think your original approach of initializing
it within a lock on construction was fine, the overhead of that lock compared to the overhead
of cloning a file descriptor via system calls should be minimal.

Still reviewing the tests.

> Maintain Short-Circuit Read Statistics
> --------------------------------------
>
>                 Key: HDFS-11789
>                 URL: https://issues.apache.org/jira/browse/HDFS-11789
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Hanisha Koneru
>            Assignee: Hanisha Koneru
>         Attachments: HDFS-11789.001.patch, HDFS-11789.002.patch
>
>
> If a disk or controller hardware is faulty then short-circuit read requests can stall
indefinitely while reading from the file descriptor. Currently there is no way to detect when
short-circuit read requests are slow or blocked. 
> This Jira proposes that each BlockReaderLocal maintain read statistics while it is active
by measuring the time taken for a pre-determined fraction of read requests. These per-reader
stats can be aggregated into global stats when the reader is closed. The aggregate statistics
can be exposed via JMX.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message