hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6682) Add a metric to expose the timestamp of the oldest under-replicated block
Date Fri, 31 Jul 2015 21:30:05 GMT

    https://issues.apache.org/jira/browse/HDFS-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649876#comment-14649876

Allen Wittenauer commented on HDFS-6682:

We have no insight into how long a given under-replicated block may have been sitting in the
queue, so there is no way to really answer that question. We know the queue gets backed up during
cascading DataNode failure events (thanks to a very slow NM memory checker + a fast-acting bad
job + the Linux OOM killer!), so I was always under the impression that the whole queue was
simply very busy, rather than old entries never clearing. A rate metric might be useful, at
least to tell us whether the queue is stuck and/or to project how long it will remain behind.
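The projection idea above can be sketched with simple arithmetic: sample the queue size twice, compute the drain rate, and divide the backlog by it. This is a minimal illustration, not HDFS code; the class and method names are hypothetical.

```java
// Hypothetical sketch: projecting how long a replication queue will stay
// behind from two queue-size samples. Names are illustrative, not HDFS APIs.
public class QueueProjection {
    /**
     * Estimated seconds until the queue drains, or -1 if the queue is
     * flat or growing (i.e., it looks "stuck").
     */
    public static long secondsToDrain(long sizeThen, long sizeNow, long intervalSeconds) {
        long drained = sizeThen - sizeNow;       // blocks cleared during the interval
        if (drained <= 0) {
            return -1;                           // not draining: stuck or falling behind
        }
        double ratePerSecond = (double) drained / intervalSeconds;
        return (long) Math.ceil(sizeNow / ratePerSecond);
    }

    public static void main(String[] args) {
        // 10,000 blocks dropping to 8,000 over 60s -> backlog clears in ~240s
        System.out.println(secondsToDrain(10_000, 8_000, 60));
        // Queue grew instead: report -1 rather than a projection
        System.out.println(secondsToDrain(8_000, 9_000, 60));
    }
}
```

A rate-only metric answers "is it moving?"; combining it with the current queue size gives the "how long until we catch up?" projection Allen describes.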

> Add a metric to expose the timestamp of the oldest under-replicated block
> -------------------------------------------------------------------------
>                 Key: HDFS-6682
>                 URL: https://issues.apache.org/jira/browse/HDFS-6682
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Akira AJISAKA
>            Assignee: Akira AJISAKA
>              Labels: metrics
>         Attachments: HDFS-6682.002.patch, HDFS-6682.003.patch, HDFS-6682.004.patch, HDFS-6682.005.patch, HDFS-6682.006.patch, HDFS-6682.patch
> In the following case, data in HDFS is lost and a client needs to put the same file again.
> # A Client puts a file to HDFS
> # A DataNode crashes before replicating a block of the file to other DataNodes
> I propose a metric to expose the timestamp of the oldest under-replicated/corrupt block.
> That way, a client can know which file to retain for the retry.
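The proposed metric could be tracked as in the sketch below: record when each block enters the under-replicated queue and report the age of the oldest entry still queued. This is a hypothetical illustration under stated assumptions (monotonic enqueue timestamps, illustrative class and method names), not the actual HDFS-6682 patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposed metric: track enqueue times for
// under-replicated blocks and expose the age of the oldest one.
// Class/method names are illustrative, not the HDFS-6682 implementation.
public class UnderReplicatedAges {
    // Insertion order matches enqueue order (timestamps are assumed
    // monotonic), so the first entry is always the oldest.
    private final Map<Long, Long> enqueueTimeByBlockId = new LinkedHashMap<>();

    public synchronized void onBlockUnderReplicated(long blockId, long nowMs) {
        enqueueTimeByBlockId.putIfAbsent(blockId, nowMs);
    }

    public synchronized void onBlockReplicated(long blockId) {
        enqueueTimeByBlockId.remove(blockId);
    }

    /** Age in ms of the oldest queued block, or 0 if the queue is empty. */
    public synchronized long oldestBlockAgeMs(long nowMs) {
        for (long enqueuedAt : enqueueTimeByBlockId.values()) {
            return nowMs - enqueuedAt;   // first value = oldest enqueue time
        }
        return 0;
    }

    public static void main(String[] args) {
        UnderReplicatedAges ages = new UnderReplicatedAges();
        ages.onBlockUnderReplicated(1L, 1_000);
        ages.onBlockUnderReplicated(2L, 5_000);
        ages.onBlockReplicated(1L);      // oldest cleared; block 2 is now oldest
        System.out.println(ages.oldestBlockAgeMs(9_000)); // 4000
    }
}
```

With such a counter in place, a client that put a file at time T can compare T against (now - oldestBlockAgeMs) to decide whether its blocks could still be sitting unreplicated.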

This message was sent by Atlassian JIRA
