hadoop-hdfs-issues mailing list archives

From "Rodrigo Schmidt (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete
Date Fri, 25 Jun 2010 04:34:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882437#action_12882437 ]

Rodrigo Schmidt commented on HDFS-1111:
---------------------------------------

If the list is empty, we don't print the "ALL/A FEW" line:

{code}
        if (matchedCorruptFilesCount == 1 ) {
          out.println("Here are" + filler + "corrupted files:");
          out.println("===========================================");
        }
{code}

If the list is incomplete (empty or not), we print out a header alerting the user:

{code}
    if (!corruptFileStatuses.isComplete()) {
      filler = " A FEW ";
      out.println("\n\nATTENTION: List of corrupted files returned from" +
                  " namenode was INCOMPLETE.\n\n");
    }
{code}

I thought that would be enough but I'm open to other options.
What else would you like to add?
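
For illustration, the two snippets above might fit together roughly as follows. This is a minimal sketch rather than the code in the patch: the CorruptFileReport and CorruptFileList classes, the render() method, the loop, and the " ALL " default for filler (inferred from the "ALL/A FEW" wording) are all assumptions made so the example is self-contained.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

public class CorruptFileReport {

    /** Hypothetical stand-in for the namenode reply: paths plus a completeness flag. */
    public static final class CorruptFileList {
        private final List<String> paths;
        private final boolean complete;

        public CorruptFileList(List<String> paths, boolean complete) {
            this.paths = paths;
            this.complete = complete;
        }

        public List<String> paths() { return paths; }

        public boolean isComplete() { return complete; }
    }

    /** Renders the report text; the structure around the two snippets is assumed. */
    public static String render(CorruptFileList corruptFileStatuses) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);

        String filler = " ALL "; // assumed default when the list is complete
        if (!corruptFileStatuses.isComplete()) {
            // Printed even when the (incomplete) list happens to be empty.
            filler = " A FEW ";
            out.println("\n\nATTENTION: List of corrupted files returned from" +
                        " namenode was INCOMPLETE.\n\n");
        }

        int matchedCorruptFilesCount = 0;
        for (String path : corruptFileStatuses.paths()) {
            matchedCorruptFilesCount++;
            // The header is printed once, on the first match, so an empty
            // list never produces the "ALL/A FEW" line.
            if (matchedCorruptFilesCount == 1) {
                out.println("Here are" + filler + "corrupted files:");
                out.println("===========================================");
            }
            out.println(path);
        }
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Example path is made up for the demonstration.
        System.out.print(render(new CorruptFileList(Arrays.asList("/data/part-0"), false)));
    }
}
```

With this arrangement an empty but incomplete list still produces the ATTENTION header, while an empty and complete list produces no output at all.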

As for creating JIRAs, I guess I was just trying to be a little proactive, but maybe it was
wrong. My rationale was the following:
1) These are orthogonal problems (incomplete lists, and server-side filtering)
2) The current patch for this JIRA is already long and complicated. Extending it would increase
the chances of introducing bugs.
3) Blocking one change should not necessarily block the other, thus calling for a separate
JIRA.

I assigned HDFS-1265 to myself because I've been dealing with the getCorruptFiles() API since
its creation. I assumed I would probably be the one working on it anyway, though I don't plan
to do this in the next few days. If you think it should be left unassigned or deleted, I don't
mind.




> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>
>                 Key: HDFS-1111
>                 URL: https://issues.apache.org/jira/browse/HDFS-1111
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Rodrigo Schmidt
>            Assignee: Rodrigo Schmidt
>         Attachments: HADFS-1111.0.patch
>
>
> The list of corrupt files returned by the namenode doesn't say anything if the number of
> corrupted files is larger than the call output limit (which means the list is not complete).
> There should be a way to hint incompleteness to clients.
> A simple hack would be to add an extra entry with the value null to the returned array.
> Clients could interpret this as a sign that there are other corrupt files in the system.
> We should also rephrase the fsck output to make it more confident when the list is
> complete and less confident when the list is known to be incomplete.
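
The "extra null entry" hack from the description could look roughly like the sketch below. It is only an illustration of the idea: the NullSentinelSketch class and both method names are hypothetical and are not part of the HDFS API.

```java
public class NullSentinelSketch {

    /**
     * Hypothetical server side: return at most {@code limit} paths and append
     * a trailing null entry when more corrupt files exist than were returned.
     */
    public static String[] truncateWithSentinel(String[] allCorrupt, int limit) {
        if (allCorrupt.length <= limit) {
            return allCorrupt.clone(); // everything fits, no sentinel needed
        }
        String[] reply = new String[limit + 1];
        System.arraycopy(allCorrupt, 0, reply, 0, limit);
        // reply[limit] is left as null and acts as the "more exist" marker.
        return reply;
    }

    /** Hypothetical client side: a trailing null means the list was cut short. */
    public static boolean isComplete(String[] reply) {
        return reply.length == 0 || reply[reply.length - 1] != null;
    }

    public static void main(String[] args) {
        String[] all = {"/a", "/b", "/c"}; // made-up paths
        String[] reply = truncateWithSentinel(all, 2);
        System.out.println("entries=" + reply.length + " complete=" + isComplete(reply));
    }
}
```

One drawback of the sentinel is that every client has to remember to skip the null entry when iterating, which an explicit flag such as isComplete() on the returned object avoids.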

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

