hadoop-hdfs-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete
Date Thu, 02 Sep 2010 01:01:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905361#action_12905361 ]

dhruba borthakur commented on HDFS-1111:

I had an extended offline discussion with Konstantin. The gist of the conversation is that
RaidNode is designed to poll HDFS very frequently to find missing blocks. This means that
invoking this feature via a servlet would be a resource bottleneck for the RaidNode.
It would be much cleaner for the RaidNode to be able to invoke a method in DistributedFileSystem
to find missing blocks. Another point of discussion is that "fsck" is yet another tool
that would be better off using the existing APIs (via DistributedFileSystem) rather than
internal interfaces in FSNamesystem.

I like the elegance of the new API (much better than the existing interface that is being
deleted) and if we can add it to the DistributedFileSystem then that will be great!

> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>                 Key: HDFS-1111
>                 URL: https://issues.apache.org/jira/browse/HDFS-1111
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.22.0
>            Reporter: Rodrigo Schmidt
>            Assignee: Sriram Rao
>             Fix For: 0.22.0
>         Attachments: HADFS-1111.0.patch, HDFS-1111-y20.1.patch, HDFS-1111-y20.2.patch,
HDFS-1111.trunk.1.patch, HDFS-1111.trunk.patch
> The list of corrupt files returned by the namenode doesn't say anything when the number
of corrupted files is larger than the call output limit (which means the list is not complete).
There should be a way to hint incompleteness to clients.
> A simple hack would be to add an extra entry to the array returned with the value null.
Clients could interpret this as a sign that there are other corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more confident when the
list is complete and less confident when the list is known to be incomplete.
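
For illustration only, the "extra null entry" hack described above could look roughly like the
sketch below. It is not taken from any of the attached patches: the class name
CorruptFileListing, the LIMIT constant and both method names are hypothetical, and a plain
String array stands in for whatever type the namenode actually returns.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the null-sentinel idea; not code from HDFS. */
public class CorruptFileListing {
    /** Assumed per-call output limit (illustrative value only). */
    static final int LIMIT = 3;

    /**
     * Returns at most LIMIT paths. If more corrupt files exist than fit,
     * one extra trailing null entry is appended as an incompleteness hint.
     */
    static String[] listWithSentinel(List<String> allCorrupt) {
        if (allCorrupt.size() <= LIMIT) {
            return allCorrupt.toArray(new String[0]);
        }
        String[] out = new String[LIMIT + 1];
        for (int i = 0; i < LIMIT; i++) {
            out[i] = allCorrupt.get(i);
        }
        out[LIMIT] = null; // sentinel: more corrupt files exist than were returned
        return out;
    }

    /** Client-side check: a trailing null means the list was truncated. */
    static boolean isTruncated(String[] listing) {
        return listing.length > 0 && listing[listing.length - 1] == null;
    }

    public static void main(String[] args) {
        List<String> corrupt = Arrays.asList("/a", "/b", "/c", "/d");
        String[] listing = listWithSentinel(corrupt);
        int shown = isTruncated(listing) ? listing.length - 1 : listing.length;
        System.out.println("shown=" + shown + " truncated=" + isTruncated(listing));
    }
}
```

One drawback of this approach is that every client has to remember to strip the trailing null
before using the paths; a return type carrying an explicit "complete" flag would avoid that,
at the cost of changing the method signature.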

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
