Message-ID: <1549115129.1233208919656.JavaMail.jira@brutus>
Date: Wed, 28 Jan 2009 22:01:59 -0800 (PST)
From: "Hairong Kuang (JIRA)"
To: core-dev@hadoop.apache.org
Subject: [jira] Updated: (HADOOP-5034) NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat
In-Reply-To: <377040628.1231961462146.JavaMail.jira@brutus>
Reply-To: core-dev@hadoop.apache.org
Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm

     [ https://issues.apache.org/jira/browse/HADOOP-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HADOOP-5034:
----------------------------------

    Status: Patch Available  (was: Open)

> NameNode should send both replication and deletion requests to DataNode in one reply to a heartbeat
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5034
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5034
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.19.1
>
>         Attachments: blockTransferInvalidate.patch, blockTransferInvalidate1.patch, blockTransferInvalidate2.patch, blockTransferInvalidate3.patch
>
>
> Currently, the NameNode favors block replication requests over deletion requests. In its reply to a heartbeat, the NameNode does not send a block deletion request unless there is no block replication request.
> This causes a problem when a nearly full cluster loses a number of DataNodes. In reaction to the DataNode loss, the NameNode starts to replicate blocks. However, replication consumes a lot of CPU, and many replications fail for lack of disk space. The administrator then tries to delete some DFS files to free up space, but the block deletion requests are delayed for a very long time because it takes a long time to drain the block replication requests for most DataNodes.
> I'd like to propose that the NameNode send both replication requests and deletion requests to DataNodes in one reply to a heartbeat. This also implies that the replication monitor should schedule both replication and deletion work in one iteration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
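To make the proposal in the issue description concrete, here is a minimal sketch in Java contrasting the two reply policies. It is not taken from the attached patches: every name in it (HeartbeatReplySketch, Command, Action, DataNodeState, replyReplicationFirst, replyBoth, and the per-reply limits) is hypothetical and chosen only for illustration, and the real NameNode code may be structured quite differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative sketch only; all names here are hypothetical, not Hadoop's API.
public class HeartbeatReplySketch {

    public enum Action { TRANSFER, INVALIDATE }

    // One command in a heartbeat reply: an action plus the block ids it covers.
    public static final class Command {
        public final Action action;
        public final List<Long> blockIds;

        public Command(Action action, List<Long> blockIds) {
            this.action = action;
            this.blockIds = blockIds;
        }
    }

    // Work the NameNode has queued for a single DataNode (assumed model).
    public static final class DataNodeState {
        public final Queue<Long> pendingReplication = new ArrayDeque<>();
        public final Queue<Long> pendingDeletion = new ArrayDeque<>();
    }

    // Remove up to 'limit' block ids from the head of the queue.
    static List<Long> drain(Queue<Long> queue, int limit) {
        List<Long> out = new ArrayList<>();
        while (out.size() < limit && !queue.isEmpty()) {
            out.add(queue.poll());
        }
        return out;
    }

    // Behavior described as current: deletions are sent only when no
    // replication work is queued for this DataNode.
    public static List<Command> replyReplicationFirst(
            DataNodeState dn, int maxTransfers, int maxDeletes) {
        List<Command> reply = new ArrayList<>();
        List<Long> toCopy = drain(dn.pendingReplication, maxTransfers);
        if (!toCopy.isEmpty()) {
            reply.add(new Command(Action.TRANSFER, toCopy));
            return reply; // deletions wait for a later heartbeat
        }
        List<Long> toDelete = drain(dn.pendingDeletion, maxDeletes);
        if (!toDelete.isEmpty()) {
            reply.add(new Command(Action.INVALIDATE, toDelete));
        }
        return reply;
    }

    // Proposed behavior: one reply may carry both kinds of command.
    public static List<Command> replyBoth(
            DataNodeState dn, int maxTransfers, int maxDeletes) {
        List<Command> reply = new ArrayList<>();
        List<Long> toCopy = drain(dn.pendingReplication, maxTransfers);
        if (!toCopy.isEmpty()) {
            reply.add(new Command(Action.TRANSFER, toCopy));
        }
        List<Long> toDelete = drain(dn.pendingDeletion, maxDeletes);
        if (!toDelete.isEmpty()) {
            reply.add(new Command(Action.INVALIDATE, toDelete));
        }
        return reply;
    }

    public static void main(String[] args) {
        DataNodeState dn = new DataNodeState();
        for (long id = 1; id <= 5; id++) {
            dn.pendingReplication.add(id);
        }
        dn.pendingDeletion.add(100L);

        // With a backlog of replication work, the first policy keeps
        // postponing the deletion; the second sends it right away.
        System.out.println("replication-first commands: "
                + replyReplicationFirst(dn, 2, 10).size());
        System.out.println("combined commands: "
                + replyBoth(dn, 2, 10).size());
    }
}
```

Under the first policy a deletion is delayed for as many heartbeats as it takes to drain the replication queue, which matches the near-full-cluster scenario in the description. The combined policy bounds that delay to a single heartbeat, at the cost of a larger reply.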