From: "Hairong Kuang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Date: Tue, 28 Sep 2010 18:59:33 -0400 (EDT)
Message-ID: <11531224.453451285714773713.JavaMail.jira@thor>
In-Reply-To: <20018130.456261282255456719.JavaMail.jira@thor>
Subject: [jira] Updated: (HDFS-1348) Improve NameNode responsiveness while it is checking if datanode decommissions are complete

     [ https://issues.apache.org/jira/browse/HDFS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang
updated HDFS-1348:
--------------------------------

    Status: Patch Available  (was: Open)

> Improve NameNode responsiveness while it is checking if datanode decommissions are complete
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1348
>                 URL: https://issues.apache.org/jira/browse/HDFS-1348
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0
>
>         Attachments: decomissionImp1.patch, decomissionImp2.patch, decommission.patch, decommission1.patch
>
>
> NameNode is normally busy all the time; its log shows activity every second. But once in a while, NameNode seems to pause for more than 10 seconds without doing anything, leaving a gap in its log even though no garbage collection is happening. All other requests to NameNode are blocked while this is happening.
> One culprit is DecommissionManager. Its monitor holds the FSNamesystem lock during the whole process of checking whether decommissioning DataNodes have finished, during which it checks every block of up to 5 datanodes (the default).
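
To make the locking problem described above concrete, here is a minimal Java sketch. It is not the HDFS source and not any of the attached patches: the class, its methods, and the boolean-array model of a datanode's blocks are all hypothetical, chosen only to contrast one long lock hold across every node with a separate, shorter hold per node.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Illustrative sketch only. All names here are hypothetical stand-ins;
 * this is not the HDFS source and not the attached patch.
 */
public class DecommissionCheckSketch {
    // Stand-in for the global namesystem lock that every request needs.
    private final ReentrantLock namesystemLock = new ReentrantLock();
    // Largest number of blocks examined during any single lock hold.
    private int longestHold = 0;

    /** Behaviour described in the issue: one lock hold spans all nodes. */
    public int checkHoldingLockThroughout(List<boolean[]> nodes) {
        int pending = 0;
        int scanned = 0;
        namesystemLock.lock();
        try {
            for (boolean[] blocks : nodes) {
                pending += countPending(blocks);
                scanned += blocks.length;
            }
        } finally {
            namesystemLock.unlock();
        }
        longestHold = Math.max(longestHold, scanned);
        return pending;
    }

    /** One possible mitigation (an assumption): release the lock after each node. */
    public int checkReleasingBetweenNodes(List<boolean[]> nodes) {
        int pending = 0;
        for (boolean[] blocks : nodes) {
            namesystemLock.lock();
            try {
                pending += countPending(blocks);
            } finally {
                namesystemLock.unlock(); // other requests can acquire the lock here
            }
            longestHold = Math.max(longestHold, blocks.length);
        }
        return pending;
    }

    // A block counts as pending while it is not yet sufficiently replicated (false).
    private static int countPending(boolean[] blocks) {
        int n = 0;
        for (boolean replicated : blocks) {
            if (!replicated) {
                n++;
            }
        }
        return n;
    }

    public int longestHold() {
        return longestHold;
    }
}
```

For two nodes holding 3 and 2 blocks, the first method examines all 5 blocks under a single lock hold, while the second never examines more than 3 per hold. The trade-off is that namesystem state may change between nodes, so a real monitor would have to tolerate that.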