hadoop-hdfs-issues mailing list archives

From "Andy Isaacson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2554) Add separate metrics for missing blocks with desired replication level 1
Date Fri, 27 Jul 2012 22:38:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424196#comment-13424196
] 

Andy Isaacson commented on HDFS-2554:
-------------------------------------

I'm planning to implement 4 new metrics.  To clarify their meanings, I'll introduce some terminology:
*r* is the desired replication count, *n* is the actual number of good replicas, and
*c* is the number of corrupt replicas.  Each metric counts the blocks known
to be in the corresponding state.

# {{MissingBlocksR1}}: r=1, n=0, c=0
# {{CorruptBlocksR1}}: r=1, n=0, c=1
# {{CorruptBlocksRN}}: r>1, n=0, c>0
# {{MissingBlocksRN}}: r>1, n=0, c=0

Taken together, these four values should cover the same blocks as the existing {{MissingBlocks}}
metric, unless I'm missing something.
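
As a rough illustration of the four states listed above, here is a minimal Python sketch. It is not HDFS code: the names {{classify_block}} and {{tally}} are hypothetical, and treating r=1, n=0, c>1 as {{CorruptBlocksR1}} is an assumption (the list above only gives c=1 for that case).

```python
from collections import Counter


def classify_block(r, n, c):
    """Return the proposed metric name for one block, or None.

    r: desired replication count
    n: number of good replicas
    c: number of corrupt replicas
    """
    if r < 1 or n > 0:
        # A block with at least one good replica is in none of the four states.
        return None
    suffix = "R1" if r == 1 else "RN"
    # Assumption: any c > 0 counts as corrupt, including r == 1 with c > 1.
    kind = "CorruptBlocks" if c > 0 else "MissingBlocks"
    return kind + suffix


def tally(blocks):
    """Count blocks per proposed metric from (r, n, c) tuples."""
    counts = Counter()
    for r, n, c in blocks:
        name = classify_block(r, n, c)
        if name is not None:
            counts[name] += 1
    return counts


# Made-up sample data: one block per state, plus one healthy block.
blocks = [(1, 0, 0), (1, 0, 1), (3, 0, 2), (3, 0, 0), (3, 2, 1)]
counts = tally(blocks)
no_good_replica_total = sum(counts.values())
```

In this sketch the four counters are disjoint, so their sum is simply the number of blocks with no good replica, which is what the comment expects the existing {{MissingBlocks}} metric to report.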

Each of these classes of blocks calls for a different recommended administrator response.

Suggestions for better names would be gratefully accepted.

On a related note, the existing {{CorruptBlocks}} metric is misnamed; it actually counts
corrupt replicas, which is also interesting but a separate issue.
                
> Add separate metrics for missing blocks with desired replication level 1
> ------------------------------------------------------------------------
>
>                 Key: HDFS-2554
>                 URL: https://issues.apache.org/jira/browse/HDFS-2554
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Andy Isaacson
>            Priority: Minor
>
> Some users set the replication level to 1 for datasets which are unimportant and can
> be lost with no worry (e.g., the output of terasort tests). But other data on the cluster is
> important and should not be lost. It would be useful to separate the metric for missing blocks
> by the desired replication level of those blocks, so that one could ignore missing blocks
> at repl 1 while still alerting on missing blocks with higher desired replication.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
