hadoop-hdfs-issues mailing list archives

From André Oriani (JIRA) <j...@apache.org>
Subject [jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete
Date Mon, 26 Apr 2010 22:49:32 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861153#action_12861153 ]

André Oriani commented on HDFS-1111:

[I started writing this before Rodrigo's second comment.]

After resolving HDFS-1031 and gaining more experience with the problem, I think:

1) Saying "Here are a few files that may be corrupted:" is definitely wrong. The files are
corrupted for sure. The doubt is whether the list is complete or not.
2) The output is not sorted. Sorted output would make an admin's life easier (new Jira?)
3) fsck handles options in a very simple way. "fsck / -move -delete" is accepted although
the move and delete options are mutually exclusive (new Jira?)
4) Changes here should be reflected in HDFS-1031 (add a link to it?)
5) @Rodrigo: why not return a struct made up of the list and a flag that says whether the list
is complete? I don't know if it is the best thing to do, but at least it would make things more intuitive.
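
To make point 5 concrete, here is a rough sketch of what such a struct could look like. Everything in it is an assumption for illustration only: the class name CorruptFilesResult, its methods, and the header wording are hypothetical and are not part of the existing HDFS API. It also sorts the paths, as suggested in point 2.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical result type; NOT part of the actual HDFS API.
final class CorruptFilesResult {
    private final List<String> paths;   // corrupt file paths, sorted
    private final boolean complete;     // false if the limit cut the list short

    private CorruptFilesResult(List<String> paths, boolean complete) {
        this.paths = Collections.unmodifiableList(paths);
        this.complete = complete;
    }

    // Builds a result from the known corrupt paths, keeping at most `limit` entries.
    static CorruptFilesResult of(List<String> allCorrupt, int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be non-negative");
        }
        List<String> sorted = new ArrayList<>(allCorrupt);
        Collections.sort(sorted);
        boolean complete = sorted.size() <= limit;
        List<String> kept = complete
                ? sorted
                : new ArrayList<>(sorted.subList(0, limit));
        return new CorruptFilesResult(kept, complete);
    }

    List<String> getPaths() {
        return paths;
    }

    boolean isComplete() {
        return complete;
    }

    // Hypothetical wording: the files listed are corrupt for sure;
    // only the completeness of the list is in doubt.
    String header() {
        return complete
                ? "Corrupt files:"
                : "Corrupt files (list is incomplete):";
    }
}
```

With something like this, a client such as fsck could check isComplete() instead of looking for a null entry at the end of the array, and could word its output accordingly.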

> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>                 Key: HDFS-1111
>                 URL: https://issues.apache.org/jira/browse/HDFS-1111
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Rodrigo Schmidt
>            Assignee: Rodrigo Schmidt
> The list of corrupt files returned by the namenode doesn't say anything if the number
> of corrupted files is larger than the call output limit (which means the list is not complete).
> There should be a way to hint incompleteness to clients.
> A simple hack would be to add an extra entry to the array returned with the value null.
> Clients could interpret this as a sign that there are other corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more confident when the
> list is complete and less confident when the list is known to be incomplete.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
