hadoop-hdfs-issues mailing list archives

From "Andy Isaacson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2554) Add separate metrics for missing blocks with desired replication level 1
Date Mon, 20 Aug 2012 21:09:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438209#comment-13438209 ]

Andy Isaacson commented on HDFS-2554:
-------------------------------------

bq. Yea, it's a bummer to do that much computation with the lock held. Have you looked at
alternatives like keeping the stats in the CorruptReplicasMap?

I'll take another look at updating the metrics on the fly.

bq. Seems more logical if the interface is getMissingBlocks (all blocks), and getMissingBlocksWithRepl1
(the repl=1 count) and people who want the delta subtract (ditto with the metrics names, "MissingBlocks"
and "MissingBlocksRepl1" instead of "R1" and "R2N")

Maybe I'm missing something, but this seems to be the same change you suggested in a comment
above dated 07/Aug/12.  As I said in my response there, it seems much more natural to me to
provide values A and B which add to give C than to provide A and C which, when subtracted, give B.
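
A minimal sketch of the "A and B add to give C" shape, for illustration only. The class and
method names below are hypothetical and are not taken from the attached patch; the sketch also
ignores locking, which the real NameNode code would need to handle.

```java
/**
 * Hypothetical sketch, not the HDFS-2554 patch: two disjoint counters
 * (A and B) from which the total (C) is derived by addition.
 */
final class MissingBlockCounts {
    private long repl1;     // A: missing blocks with desired replication 1
    private long replMulti; // B: missing blocks with desired replication >= 2

    /** Record a block that has just gone missing. */
    void blockWentMissing(int desiredReplication) {
        adjust(desiredReplication, 1);
    }

    /** Record that a previously missing block was recovered or deleted. */
    void blockNoLongerMissing(int desiredReplication) {
        adjust(desiredReplication, -1);
    }

    // Route the update to exactly one of the two counters.
    private void adjust(int desiredReplication, long delta) {
        if (desiredReplication < 1) {
            throw new IllegalArgumentException("replication must be >= 1");
        }
        if (desiredReplication == 1) {
            repl1 += delta;
        } else {
            replMulti += delta;
        }
    }

    long getMissingRepl1() { return repl1; }

    long getMissingReplMulti() { return replMulti; }

    /** C = A + B; callers never need to subtract. */
    long getMissingTotal() { return repl1 + replMulti; }
}
```

Because every missing block lands in exactly one counter, each value can be alerted on directly
and the total is always available by addition.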

Relatedly, the administrative actions recommended for dealing with missing/corrupt blocks are
linked to the replication count.  "Dear admin, you have unreplicated files with missing blocks, might
want to delete them" and "Dear admin, you have replicated files with missing blocks, please
bring some DNs back online to allow file recovery".
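
To make that mapping concrete, here is a rough sketch of how a monitoring tool might turn the
two counts into separate recommendations. Everything here (class name, method name, message
wording) is an assumption for illustration; nothing like this exists in the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps the two missing-block counts to admin advice. */
final class MissingBlockAdvice {
    private MissingBlockAdvice() {}

    /** Returns one message per non-zero count, or an empty list if nothing is missing. */
    static List<String> adviceFor(long missingRepl1, long missingReplMulti) {
        List<String> advice = new ArrayList<>();
        if (missingRepl1 > 0) {
            // No other replica ever existed, so the data is likely gone for good.
            advice.add(missingRepl1
                + " missing block(s) in unreplicated files: consider deleting those files");
        }
        if (missingReplMulti > 0) {
            // Replicas may still exist on DataNodes that are currently offline.
            advice.add(missingReplMulti
                + " missing block(s) in replicated files: bring DataNodes back online");
        }
        return advice;
    }
}
```

With a single combined count, or with a total and one sub-count, the tool would have to
subtract before it could decide which message applies.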

bq. Nit: either pull out the metrics comment change to HDFS-3815 or update the javadoc comment
in this change to match

Yep, thanks, I'll update the javadoc too.

bq. Needs a test

Inbound.
                
> Add separate metrics for missing blocks with desired replication level 1
> ------------------------------------------------------------------------
>
>                 Key: HDFS-2554
>                 URL: https://issues.apache.org/jira/browse/HDFS-2554
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Andy Isaacson
>            Priority: Minor
>         Attachments: hdfs-2554-1.txt, hdfs-2554.txt
>
>
> Some users use replication level set to 1 for datasets which are unimportant and can
be lost with no worry (e.g. the output of terasort tests). But other data on the cluster is
important and should not be lost. It would be useful to separate the metric for missing blocks
by the desired replication level of those blocks, so that one could ignore missing blocks
at repl 1 while still alerting on missing blocks with higher desired replication.


        
