Message-ID: <13225221.67791277504331788.JavaMail.jira@thor>
Date: Fri, 25 Jun 2010 18:18:51 -0400 (EDT)
From: "Konstantin Shvachko (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] Commented: (HDFS-1111) getCorruptFiles() should give some hint that the list is not complete
In-Reply-To: <8815800.16581272314313876.JavaMail.jira@thor>

[ https://issues.apache.org/jira/browse/HDFS-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882741#action_12882741 ]

Konstantin Shvachko commented
on HDFS-1111:
-------------------------------------------

I was addressing the example you gave.

> We ran fsck -list-corruptfiles /path/to/important/dir but it returned an empty list.
> The problem was that we filter the directory after we get the list reported from the namenode and the list is limited.

What do you print if the list is empty but incomplete? It looks like the output is going to be only the message {{ATTENTION: List of corrupted files returned from namenode was INCOMPLETE.}} and no list. I think this is confusing. And the only way to get it done is to pass the directory path to the call: {{getCorruptFiles("/path/to/important/dir")}}.

I browsed through the issues dedicated to the -list-corruptfiles option for fsck.
- First of all, I think it should have been done in one issue, with a proper design of the new feature and all UI / API issues thought through in advance, rather than doing it gradually. I don't know for sure whether it could have been done that way, but it seems more convenient to discuss everything in one place rather than jumping around.
- Second of all, as a result of that (I believe) an unnecessary {{ClientProtocol}} method was introduced: {{getCorruptFiles()}}, which is now being modified here, also unnecessarily.

{{ClientProtocol}} changes are not necessary because fsck works over HTTP rather than via RPC. {{NamenodeFsck}}, a part of {{FsckServlet}}, calls name-node methods directly rather than through RPC. Therefore, {{ClientProtocol}} has nothing to do with this. For example, {{getBlockLocationsNoATime()}} is not in {{ClientProtocol}}. The same should hold for {{getCorruptFiles()}}.

So I propose to remove {{getCorruptFiles()}} from {{ClientProtocol}} in this jira instead of modifying it... And then I won't argue about printout messages anymore...
I believe you will still need HDFS-1265 to deal with your example correctly.

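To make the "empty but incomplete" case from the comment above concrete, here is a minimal sketch. It assumes the namenode truncates the corrupt-file list to some limit before the client filters by directory. The class and method names (`CorruptFileListingSketch`, `filterAfterLimit`, `filterBeforeLimit`, `isUnder`) are hypothetical and are not the actual NameNode or fsck code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the real HDFS implementation.
class CorruptFileListingSketch {

    // Behaviour described in the quoted example: the list is truncated
    // to `limit` entries first, and the directory filter is applied afterwards.
    static List<String> filterAfterLimit(List<String> allCorrupt, String dir, int limit) {
        List<String> truncated = allCorrupt.subList(0, Math.min(limit, allCorrupt.size()));
        List<String> result = new ArrayList<>();
        for (String path : truncated) {
            if (isUnder(path, dir)) {
                result.add(path);
            }
        }
        return result;
    }

    // Behaviour when the directory is passed along with the request:
    // only matching paths count towards the limit.
    static List<String> filterBeforeLimit(List<String> allCorrupt, String dir, int limit) {
        List<String> result = new ArrayList<>();
        for (String path : allCorrupt) {
            if (result.size() >= limit) {
                break;
            }
            if (isUnder(path, dir)) {
                result.add(path);
            }
        }
        return result;
    }

    // True if `path` is `dir` itself or lies somewhere below it.
    static boolean isUnder(String path, String dir) {
        String prefix = dir.endsWith("/") ? dir : dir + "/";
        return path.equals(dir) || path.startsWith(prefix);
    }

    public static void main(String[] args) {
        List<String> allCorrupt = Arrays.asList(
                "/tmp/a", "/tmp/b",
                "/path/to/important/dir/f1", "/path/to/important/dir/f2");
        String dir = "/path/to/important/dir";
        // With a limit of 2, truncating first drops both files under `dir`.
        System.out.println(filterAfterLimit(allCorrupt, dir, 2));
        // Filtering first keeps them.
        System.out.println(filterBeforeLimit(allCorrupt, dir, 2));
    }
}
```

With a limit of 2, the first call returns an empty list even though two corrupt files exist under the directory, which is the confusing output discussed above; the second call returns both files.
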
> getCorruptFiles() should give some hint that the list is not complete
> ---------------------------------------------------------------------
>
> Key: HDFS-1111
> URL: https://issues.apache.org/jira/browse/HDFS-1111
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Rodrigo Schmidt
> Assignee: Rodrigo Schmidt
> Attachments: HADFS-1111.0.patch
>
>
> The list of corrupt files returned by the namenode doesn't say anything if the number of corrupted files is larger than the call output limit (which means the list is not complete). There should be a way to hint incompleteness to clients.
> A simple hack would be to add an extra entry to the array returned with the value null. Clients could interpret this as a sign that there are other corrupt files in the system.
> We should also do some rephrasing of the fsck output to make it more confident when the list is complete and less confident when the list is known to be incomplete.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
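
The "simple hack" from the issue description (a trailing null entry signalling a truncated list) could look roughly like the sketch below. The names `IncompleteListHintSketch`, `listWithHint` and `isIncomplete` are hypothetical; this only illustrates the idea and is not the patch attached to the issue.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the null-sentinel idea; not the HDFS-1111 patch.
class IncompleteListHintSketch {

    // Returns at most `limit` paths. If more corrupt files exist than fit,
    // one extra trailing null entry is appended as a hint of incompleteness.
    static String[] listWithHint(List<String> allCorrupt, int limit) {
        if (allCorrupt.size() <= limit) {
            return allCorrupt.toArray(new String[0]);
        }
        String[] firstEntries = allCorrupt.subList(0, limit).toArray(new String[0]);
        // Arrays.copyOf pads the extra slot with null.
        return Arrays.copyOf(firstEntries, limit + 1);
    }

    // Client-side check: a trailing null means the list was cut off.
    static boolean isIncomplete(String[] reported) {
        return reported.length > 0 && reported[reported.length - 1] == null;
    }

    public static void main(String[] args) {
        List<String> allCorrupt = Arrays.asList("/a", "/b", "/c");
        System.out.println(isIncomplete(listWithHint(allCorrupt, 2))); // truncated list
        System.out.println(isIncomplete(listWithHint(allCorrupt, 5))); // full list
    }
}
```

A client using such a convention would have to skip the null entry when printing paths, which is one reason the description itself calls it a hack.
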