Message-ID: <4384790.1172872430871.JavaMail.jira@brutus>
Date: Fri, 2 Mar 2007 13:53:50 -0800 (PST)
From: "Hadoop QA (JIRA)"
To: hadoop-dev@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-994) DFS Scalability : a BlockReport that returns large number of blocks-to-be-deleted cause datanode to lost connectivity to namenode
In-Reply-To: <30468076.1170962106862.JavaMail.jira@brutus>
Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm

    [ https://issues.apache.org/jira/browse/HADOOP-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12477464 ]

Hadoop QA commented on
HADOOP-994:
----------------------------------

+1, because http://issues.apache.org/jira/secure/attachment/12352069/blockReportInvalidateBlock.patch applied and successfully tested against trunk revision http://svn.apache.org/repos/asf/lucene/hadoop/trunk/513935. Results are at http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch

> DFS Scalability : a BlockReport that returns large number of blocks-to-be-deleted cause datanode to lost connectivity to namenode
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-994
>                 URL: https://issues.apache.org/jira/browse/HADOOP-994
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>         Attachments: blockReportInvalidateBlock.patch
>
>
> The Datanode periodically invokes a block report RPC to the Namenode. This RPC returns the number of blocks that are to be invalidated by the Datanode. The Datanode then starts to delete all the corresponding files. This block deletion is done by the heartbeat thread in the Datanode. If the number of files to be deleted is large, the Datanode stops sending heartbeats for the entire duration of the deletion. The Namenode then declares the Datanode "dead" and starts replicating its blocks.
> In my observed case, the block report returned 1669 blocks that were to be invalidated. The Datanode was running on a RAID5 ext3 filesystem and 4 active tasks were running on it. The deletion of these 1669 files took about 30 minutes. Wow! The average disk service time during this period was less than 10 ms. The Datanode was using about 30% CPU during this time.
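
For readers following the issue: 1669 deletions in roughly 30 minutes works out to about 1.1 seconds per file, all of it spent on the thread that is also supposed to send heartbeats. The sketch below shows one possible way to keep heartbeats flowing, by capping how many invalidations are handled per heartbeat cycle. It is only an illustration under stated assumptions and not necessarily what blockReportInvalidateBlock.patch does. The class name BoundedInvalidator, its methods, and the cap of 100 are all hypothetical and do not come from the Hadoop source.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: these names are illustrative, not Hadoop's API.
final class BoundedInvalidator {
    private final Queue<Long> pending = new ArrayDeque<>();
    private final int maxPerCycle; // assumed tunable cap per heartbeat cycle

    BoundedInvalidator(int maxPerCycle) {
        if (maxPerCycle <= 0) {
            throw new IllegalArgumentException("maxPerCycle must be positive");
        }
        this.maxPerCycle = maxPerCycle;
    }

    // Queue the block IDs that a block report asked this node to invalidate.
    void enqueue(List<Long> blockIds) {
        pending.addAll(blockIds);
    }

    // Take at most maxPerCycle block IDs off the queue. The caller deletes
    // these between two heartbeats, so no heartbeat waits for the full list.
    List<Long> nextBatch() {
        List<Long> batch = new ArrayList<>();
        while (batch.size() < maxPerCycle && !pending.isEmpty()) {
            batch.add(pending.poll());
        }
        return batch;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        BoundedInvalidator inv = new BoundedInvalidator(100);
        List<Long> reported = new ArrayList<>();
        for (long id = 0; id < 1669; id++) {
            reported.add(id);
        }
        inv.enqueue(reported);

        int cycles = 0;
        while (inv.pendingCount() > 0) {
            List<Long> batch = inv.nextBatch();
            // A real datanode would delete the batch's files here,
            // then send its heartbeat before taking the next batch.
            cycles++;
        }
        System.out.println("heartbeat cycles needed: " + cycles); // prints 17
    }
}
```

With a cap of 100, the 1669 blocks from the observed case are spread over 17 cycles, so the longest gap between heartbeats is bounded by the time to delete 100 files rather than all 1669. Moving deletion to a separate thread would be another option; the cap is simply the smaller change to sketch.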