hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete
Date Fri, 25 Jun 2010 22:18:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882741#action_12882741 ]

Konstantin Shvachko commented on HDFS-1111:

I was addressing the example you gave. 
> We ran fsck -list-corruptfiles /path/to/important/dir but it returned an empty list.
> The problem was that we filter the directory after we get the list reported from the
namenode and the list is limited.

What do you print if the list is empty, but incomplete? Looks like it is going to be only
the message 
{{ATTENTION: List of corrupted files returned from namenode was INCOMPLETE.}}
and no list. I think this is confusing.
And the only way to get it done is to pass the directory path to {{getCorruptFiles("/path/to/important/dir")}}.
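
To illustrate the point, here is a minimal sketch, not the actual HDFS code. The class {{CorruptFileListing}}, its method names, and the {{LIMIT}} value are all hypothetical. It assumes the namenode caps the returned list at a fixed number of entries, and compares filtering by directory after the cap with filtering before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in HDFS.
class CorruptFileListing {
    // Assumed stand-in for the namenode's cap on returned entries.
    static final int LIMIT = 2;

    // Behavior described in the example: the list is truncated with no path argument.
    static List<String> limitedList(List<String> allCorrupt) {
        return new ArrayList<>(allCorrupt.subList(0, Math.min(LIMIT, allCorrupt.size())));
    }

    // Client-side filtering applied after truncation; can come back empty
    // even though corrupt files exist under dir.
    static List<String> filterAfterLimit(List<String> allCorrupt, String dir) {
        List<String> out = new ArrayList<>();
        for (String f : limitedList(allCorrupt)) {
            if (f.startsWith(dir + "/")) {
                out.add(f);
            }
        }
        return out;
    }

    // Path passed down: filter first, then apply the cap.
    static List<String> filterBeforeLimit(List<String> allCorrupt, String dir) {
        List<String> out = new ArrayList<>();
        for (String f : allCorrupt) {
            if (out.size() >= LIMIT) {
                break;
            }
            if (f.startsWith(dir + "/")) {
                out.add(f);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> corrupt = Arrays.asList("/tmp/a", "/tmp/b", "/path/to/important/dir/c");
        System.out.println(filterAfterLimit(corrupt, "/path/to/important/dir"));
        System.out.println(filterBeforeLimit(corrupt, "/path/to/important/dir"));
    }
}
```

With the sample data in {{main}}, the first call finds nothing because the only matching file falls past the cap, while the second call still reports it.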

I browsed through the issues dedicated to the -list-corruptfiles option for fsck. 
- First of all, I think it should have been done in one issue with a proper design of the
new feature and all UI / API issues thought through in advance, rather than doing it gradually.
I sure don't know whether it could have been done that way, but it seems more convenient to
discuss everything in one place rather than jumping all over.
- Second of all, as a result of that (I believe) an unnecessary {{ClientProtocol}}
method was introduced: {{getCorruptFiles()}}, which is also being modified here unnecessarily. 

{{ClientProtocol}} changes are not necessary because fsck works over HTTP rather than via
RPC. {{NamenodeFsck}}, a part of {{FsckServlet}}, calls name-node methods directly rather
than through RPC. Therefore, {{ClientProtocol}} has nothing to do with this. For example,
{{getBlockLocationsNoATime()}} is not in {{ClientProtocol}}. The same should hold for {{getCorruptFiles()}}.

So I propose to remove {{getCorruptFiles()}} from {{ClientProtocol}} in this jira instead
of modifying it...   
And then I won't argue about printout messages anymore...

I believe you will still need HDFS-1265 to deal with your example correctly.

> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>                 Key: HDFS-1111
>                 URL: https://issues.apache.org/jira/browse/HDFS-1111
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Rodrigo Schmidt
>            Assignee: Rodrigo Schmidt
>         Attachments: HDFS-1111.0.patch
> The list of corrupt files returned by the namenode doesn't say anything when the number
of corrupted files is larger than the call output limit (which means the list is not complete).
There should be a way to hint incompleteness to clients.
> A simple hack would be to add an extra entry to the array returned with the value null.
Clients could interpret this as a sign that there are other corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more confident when the
list is complete and less confident when the list is known to be incomplete.
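
The "extra null entry" hack from the description could look roughly like the sketch below. This is only an illustration under assumptions: the class {{NullSentinel}} and its methods are hypothetical and are not taken from the attached patch.

```java
import java.util.Arrays;

// Hypothetical illustration of the null-sentinel idea; not the patch itself.
class NullSentinel {
    // Server side: return at most `limit` entries and append a trailing null
    // when entries were cut off.
    static String[] withSentinel(String[] all, int limit) {
        if (all.length <= limit) {
            return all.clone();
        }
        // Copies the first limit + 1 elements; the last slot is then overwritten.
        String[] out = Arrays.copyOf(all, limit + 1);
        out[limit] = null;
        return out;
    }

    // Client side: a trailing null means there are more corrupt files than were returned.
    static boolean isIncomplete(String[] returned) {
        return returned.length > 0 && returned[returned.length - 1] == null;
    }

    public static void main(String[] args) {
        String[] truncated = withSentinel(new String[] {"/a", "/b", "/c"}, 2);
        System.out.println(Arrays.toString(truncated) + " incomplete=" + isIncomplete(truncated));
    }
}
```

One drawback of this approach is that every client must remember to check for, and skip, the trailing null before using the entries.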

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
