hadoop-hdfs-issues mailing list archives

From "Rodrigo Schmidt (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete
Date Fri, 30 Jul 2010 00:17:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893872#action_12893872 ]

Rodrigo Schmidt commented on HDFS-1111:

Getting a continuous flow of corrupted blocks is a hard problem because these data structures
can change in between calls. Part of the long discussion on HDFS-729 was about that.
I still think that removing the ClientProtocol changes is a step backwards. Instead of taking
it out, it is better to add it to the Hdfs or DistributedFileSystem API and allow other services
to query it. The best example I have of a service that could benefit from it is HDFS-RAID.
RAID is in a very advanced stage now. Ram and Scott have just implemented Reed-Solomon codes.
Not having an API to query corrupted blocks directly makes simple things very difficult.

I also don't think it's a good idea to solve different and orthogonal issues on the same JIRA.
I think JIRAs should be as small and simple as possible to make discussions and reviews easier.

> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>                 Key: HDFS-1111
>                 URL: https://issues.apache.org/jira/browse/HDFS-1111
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Rodrigo Schmidt
>            Assignee: Rodrigo Schmidt
>         Attachments: HADFS-1111.0.patch
> The list of corrupt files returned by the namenode doesn't say anything if the number
of corrupted files is larger than the call output limit (which means the list is not complete).
There should be a way to hint incompleteness to clients.
> A simple hack would be to add an extra entry to the array returned with the value null.
Clients could interpret this as a sign that there are other corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more confident when the
list is complete and less confident when the list is known to be incomplete.
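
For illustration only, the "extra null entry" hack from the description could look roughly like the sketch below. This is not the namenode or fsck code: the class CorruptFileListing, its methods, and the sample paths are all hypothetical, and the sketch assumes that real path entries are never null and that the limit is non-negative.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the null-sentinel idea; not actual HDFS code. */
final class CorruptFileListing {

    /**
     * Returns at most `limit` paths. If more corrupt files exist than the
     * limit allows, one extra trailing null entry is appended as a hint
     * that the list is incomplete.
     */
    static String[] listCorruptFiles(List<String> allCorrupt, int limit) {
        int n = Math.min(allCorrupt.size(), limit);
        boolean truncated = allCorrupt.size() > limit;
        String[] out = new String[truncated ? n + 1 : n];
        for (int i = 0; i < n; i++) {
            out[i] = allCorrupt.get(i);
        }
        // When truncated, the last slot is left as null on purpose.
        return out;
    }

    /** Client-side check: a trailing null means other corrupt files exist. */
    static boolean isTruncated(String[] listing) {
        return listing.length > 0 && listing[listing.length - 1] == null;
    }

    public static void main(String[] args) {
        List<String> corrupt = Arrays.asList("/data/a", "/data/b", "/data/c");
        String[] listing = listCorruptFiles(corrupt, 2);
        boolean truncated = isTruncated(listing);
        int shown = truncated ? listing.length - 1 : listing.length;
        // Word the report differently depending on what is known.
        System.out.println(shown + " corrupt files listed"
            + (truncated ? " (list is incomplete)" : " (list is complete)"));
    }
}
```

A client that does not know about the convention would have to tolerate the null entry, which is one reason the description calls this a hack rather than a proper API change.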

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
