Message-ID: <1014579225.1143161779359.JavaMail.jira@ajax>
Date: Fri, 24 Mar 2006 00:56:19 +0000 (GMT)
From: "Doug Cutting (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-101) DFSck - fsck-like utility for checking DFS volumes
In-Reply-To: <958365348.1143139938715.JavaMail.jira@ajax>

    [ http://issues.apache.org/jira/browse/HADOOP-101?page=comments#action_12371659 ]

Doug Cutting commented on HADOOP-101:
-------------------------------------

I like that this does not
use anything more than the client API to check the server. That keeps the server core lean and mean. The use of RPCs also effectively limits the impact of the scan on the FS.

A datanode operation that streams through a block without transferring it over the wire won't correctly check checksums using our existing mechanism. To check file content we could instead simply implement a map-reduce job that streams through all the files in the FS. This would not take much code, and nothing additional in the core. MapReduce should handle the locality, so most data shouldn't go over the wire.

BTW, blocks not used by any file are not known to the name node, are they? When they're reported by a datanode, the datanode is told to remove them.

> DFSck - fsck-like utility for checking DFS volumes
> --------------------------------------------------
>
>          Key: HADOOP-101
>          URL: http://issues.apache.org/jira/browse/HADOOP-101
>      Project: Hadoop
>         Type: New Feature
>   Components: dfs
>     Versions: 0.2
>     Reporter: Andrzej Bialecki
>     Assignee: Andrzej Bialecki
>  Attachments: DFSck.java
>
> This is a utility to check health status of a DFS volume, and collect some additional statistics.
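
For illustration only, here is a rough sketch of the shape such a map-reduce content check might take. It is written in plain Python and does not use any Hadoop API: the in-memory "filesystem", the (data, stored checksum) block layout, the choice of CRC32, and every function name below are assumptions made up for this example. The map step streams through one file's blocks and recomputes each checksum; the reduce step tallies the results.

```python
import zlib
from collections import Counter


def make_file(blocks):
    """Hypothetical stand-in for a stored file: a list of (data, stored_crc) blocks."""
    return [(data, zlib.crc32(data)) for data in blocks]


def check_file(path, blocks):
    """Map step: stream through one file's blocks and recompute each checksum."""
    for index, (data, stored_crc) in enumerate(blocks):
        if zlib.crc32(data) != stored_crc:
            # Report the first bad block and stop reading this file.
            yield ("corrupt", (path, index))
            return
    yield ("healthy", (path, None))


def summarize(records):
    """Reduce step: count files per status and collect corrupt block locations."""
    counts = Counter()
    corrupt = []
    for status, detail in records:
        counts[status] += 1
        if status == "corrupt":
            corrupt.append(detail)
    return dict(counts), corrupt


def run_check(fs):
    """Drive the map and reduce steps over every file in the toy filesystem."""
    records = []
    for path, blocks in sorted(fs.items()):
        records.extend(check_file(path, blocks))
    return summarize(records)


if __name__ == "__main__":
    fs = {"/a": make_file([b"hello", b"world"]), "/b": make_file([b"data"])}
    # Simulate on-disk corruption: change the bytes but keep the old checksum.
    _, old_crc = fs["/b"][0]
    fs["/b"][0] = (b"dat4", old_crc)
    print(run_check(fs))
```

In a real job the framework, not a loop like run_check, would schedule each map task near the data it reads, which is the locality point made in the comment above.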