Return-Path: Delivered-To: apmail-hadoop-core-dev-archive@www.apache.org Received: (qmail 49905 invoked from network); 24 Feb 2009 17:40:28 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 24 Feb 2009 17:40:28 -0000 Received: (qmail 42125 invoked by uid 500); 24 Feb 2009 17:40:25 -0000 Delivered-To: apmail-hadoop-core-dev-archive@hadoop.apache.org Received: (qmail 42095 invoked by uid 500); 24 Feb 2009 17:40:25 -0000 Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: core-dev@hadoop.apache.org Delivered-To: mailing list core-dev@hadoop.apache.org Received: (qmail 42084 invoked by uid 99); 24 Feb 2009 17:40:25 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Feb 2009 09:40:25 -0800 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Feb 2009 17:40:22 +0000 Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 14424234C4AA for ; Tue, 24 Feb 2009 09:40:02 -0800 (PST) Message-ID: <1154875008.1235497202081.JavaMail.jira@brutus> Date: Tue, 24 Feb 2009 09:40:02 -0800 (PST) From: "Raghu Angadi (JIRA)" To: core-dev@hadoop.apache.org Subject: [jira] Issue Comment Edited: (HADOOP-4584) Slow generation of blockReport at DataNode causes delay of sending heartbeat to NameNode In-Reply-To: <178281774.1225759904336.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/HADOOP-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676344#action_12676344 ] rangadi edited comment on HADOOP-4584 at 2/24/09 9:39 AM: --------------------------------------------------------------- Ideally there is no requirement for block reports. It is essentially to used as 'catch all' for various bugs and errors. (of course, it is now overloaded with job of informing about deletions to NameNode and should be separated). Yes, it specifically removes disk scan without fundamentally changing meaning of block reports. Now DN informs NameNode about the the block that it thinks it has. because : * 'rm -r' by admin is just one form of many many things that can go wrong with blocks on datanode. There is no perticular reason we should have this very costly disk scan (with a global lock held) just for this. ** In fact 'rm -r' is probably the least probable error (haven't seen even once in practice). * We have periodic block verification that does handle various things that can go wrong with a block (it can improve further). ** So 'rm -r' will be handled, just at the rate of rest of the block problems. * on the other hand many users have complained about datanode scans taking 10s of minutes and making datanodes lose heartbeats. ** This makes the system pretty unusable and a major obstruction for graceful degradation under load and for scalability. ** One can argue that those users should not have so many blocks. But I think DN should still handle it to the best of it abilities and not die on them. ** Disks might be slow for many other reasons (other tasks on the machine, etc). * I think this is orthogonal to HADOOP-1079 since it addresses RPC and NameNode overhead of block reports. This jira is only about DataNode side. Yes, this is a bigger change in semantics than what we proposed earlier : to scan the directories slowly, without holding the global lock... but offline scan looks like a workaround for a problem that does not need to be solved. Not scanning is much simpler than handling offline scan. Eventually we need to reduce the frequency of block reports.. this can be done as soon as we add acks for block deletions. This JIRA is major step in that direction. was (Author: rangadi): Ideally there is no requirement for block reports. It is essentially to used as 'catch all' for various bugs and errors. (of course, it is now overloaded with job of informing about deletions to NameNode, this should removed). Yes, it specifically removes disk scan without fundamentally changing meaning of block reports. Now DN informs NameNode about the the block that it thinks it had. because : * 'rm -r' by admin is just one form of many many things that can go wrong with blocks on datanode. There is no perticular reason we should have this very costly disk scan (with a global lock held) just for this. ** In fact 'rm -r' is probably the least probable error (haven't seen even once in practice). * We have periodic block verification that does handle various things that can go wrong with a block (it can improve further). ** So 'rm -r' will be handled, just at the rate of rest of the block problems. * on the other hand many users have complained about datanode scans taking 10s of minutes and making datanodes lose heartbeats. ** This makes the system pretty unusable and a major obstruction for graceful degradation under load and for scalability. ** One can argue that those users should not have so many blocks. But I think DN should still handle it to the best of it abilities and not die on them. ** Disks might be slow for many other reasons (other tasks on the machine, etc). * I think this is orthogonal to HADOOP-1079 since it addresses RPC and NameNode overhead of block reports. This jira is only about DataNode side. Yes, this is a bigger change in semantics than what we proposed earlier : to scan the directories slowly, without holding the global lock... but offline scan looks like a workaround for a problem that does not need to be solved. Not scanning is much simpler than handling offline scan. Eventually we need to reduce the frequency of block reports.. this can be done as soon as we add acks for block deletions. This JIRA is major step in that direction. > Slow generation of blockReport at DataNode causes delay of sending heartbeat to NameNode > ---------------------------------------------------------------------------------------- > > Key: HADOOP-4584 > URL: https://issues.apache.org/jira/browse/HADOOP-4584 > Project: Hadoop Core > Issue Type: Bug > Components: dfs > Reporter: Hairong Kuang > Assignee: Suresh Srinivas > Fix For: 0.20.0 > > Attachments: 4584.patch, 4584.patch, 4584.patch, 4584.patch, 4584.patch, 4584.patch > > > sometimes due to disk or some other problems, datanode takes minutes or tens of minutes to generate a block report. It causes the datanode not able to send heartbeat to NameNode every 3 seconds. In the worst case, it makes NameNode to detect a lost heartbeat and wrongly decide that the datanode is dead. > It would be nice to have two threads instead. One thread is for scanning data directories and generating block report, and executes the requests sent by NameNode; Another thread is for sending heartbeats, block reports, and picking up the requests from NameNode. By having these two threads, the sending of heartbeats will not get delayed by any slow block report or slow execution of NameNode requests. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.