hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete
Date Wed, 04 Aug 2010 01:12:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895117#action_12895117 ]

Konstantin Shvachko commented on HDFS-1111:
-------------------------------------------

The patch looks good. A couple of nits.

# I don't think we should have a new configuration variable for the number of corrupt blocks to return. We should just use the constant you introduced. I like the name {{MAX_CORRUPT_FILE_BLOCKS_RETURNED}}. The value is set to 100, which is fine with me. Please speak up if anybody has other values in mind. So the variable {{maxListCorruptFilesBlocksReturned}} will not be necessary.
# I consulted with Rob about the name {{listCorruptFilesBlocks()}}. The plural "Files" in the name doesn't sound right, as it is not clear whether we return files or blocks. It would be good to change it to {{listCorruptFileBlocks()}} throughout the code.
# In {{NamenodeFsck.listCorruptFilesBlocks()}} the printout is not correct. It will read one of the following:
{code}
"The filesystem under path '/tmp' has 57 is CORRUPT files"
"The filesystem under path '/tmp' has no is CORRUPT files"
{code}
This should be rephrased. Also, the last message can be confusing if you already returned some files before. We should probably distinguish and say "has no CORRUPT files" if {{startBlockAfter == null}}, and "has no more CORRUPT files" otherwise.
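
To make the suggested wording concrete, here is a minimal sketch of how the message could be chosen. It is not the code from the patch: the class {{CorruptFilesSummary}}, the method {{summarize()}}, and treating {{startBlockAfter}} as a String are assumptions made only for illustration.

```java
// Hypothetical helper for illustration; not the actual NamenodeFsck code.
final class CorruptFilesSummary {

    // Builds the summary line. startBlockAfter is assumed to be null on the
    // first call and non-null when the client continues an earlier listing.
    static String summarize(String path, int corruptFiles, String startBlockAfter) {
        String prefix = "The filesystem under path '" + path + "' has ";
        if (corruptFiles == 0) {
            // Distinguish "nothing corrupt at all" from "nothing further to report".
            return prefix + (startBlockAfter == null
                    ? "no CORRUPT files"
                    : "no more CORRUPT files");
        }
        return prefix + corruptFiles + " CORRUPT files";
    }

    public static void main(String[] args) {
        System.out.println(summarize("/tmp", 57, null));
        System.out.println(summarize("/tmp", 0, null));
        System.out.println(summarize("/tmp", 0, "blk_1")); // "blk_1" is a made-up block id
    }
}
```

Keeping the wording in one place like this would also remove the stray "is" from both messages.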

> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>
>                 Key: HDFS-1111
>                 URL: https://issues.apache.org/jira/browse/HDFS-1111
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Rodrigo Schmidt
>            Assignee: Rodrigo Schmidt
>         Attachments: HADFS-1111.0.patch, HDFS-1111-y20.1.patch
>
>
> The list of corrupt files returned by the namenode doesn't say anything if the number of corrupted files is larger than the call output limit (which means the list is not complete). There should be a way to hint incompleteness to clients.
> A simple hack would be to add an extra entry to the array returned with the value null. Clients could interpret this as a sign that there are other corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more confident when the list is complete and less confident when the list is known to be incomplete.
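
For reference, the null-entry hack from the issue description could look roughly like the sketch below. This is an illustration under assumptions, not the namenode implementation: the class {{CorruptFileListing}} and the methods {{listWithSentinel()}} and {{isTruncated()}} are hypothetical, and file names stand in for whatever entries the real call returns.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the "extra null entry" hint; not HDFS code.
final class CorruptFileListing {

    // Returns at most `limit` entries. If more corrupt files exist than fit,
    // one extra trailing null is appended as a hint that the list is incomplete.
    static String[] listWithSentinel(List<String> allCorrupt, int limit) {
        if (allCorrupt.size() <= limit) {
            return allCorrupt.toArray(new String[0]);
        }
        String[] result = new String[limit + 1];
        for (int i = 0; i < limit; i++) {
            result[i] = allCorrupt.get(i);
        }
        // result[limit] is left as null: the incompleteness marker.
        return result;
    }

    // Client-side check: a trailing null means there are other corrupt files.
    static boolean isTruncated(String[] listing) {
        return listing.length > 0 && listing[listing.length - 1] == null;
    }

    public static void main(String[] args) {
        String[] listing = listWithSentinel(Arrays.asList("/a", "/b", "/c"), 2);
        System.out.println(Arrays.toString(listing) + " truncated=" + isTruncated(listing));
    }
}
```

The drawback of this approach is that every client has to know about the convention and skip the null entry when iterating.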

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

