Message-ID: <10074106.128081283384694050.JavaMail.jira@thor>
Date: Wed, 1 Sep 2010 19:44:54 -0400 (EDT)
From: "Hairong Kuang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Updated: (HDFS-1348) Improve NameNode reponsiveness while it is checking if datanode decommissions are complete
In-Reply-To: <20018130.456261282255456719.JavaMail.jira@thor>

     [ https://issues.apache.org/jira/browse/HDFS-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-1348:
--------------------------------

    Attachment: decommission1.patch

This patch addresses Dmytro's review comment.

> Improve NameNode reponsiveness while it is checking if datanode decommissions are complete
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1348
>                 URL: https://issues.apache.org/jira/browse/HDFS-1348
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0
>
>         Attachments: decommission.patch, decommission1.patch
>
>
> The NameNode is normally busy all the time; its log shows activity every second. But once in a while, the NameNode seems to pause for more than 10 seconds without doing anything, leaving a gap in its log even though no garbage collection is happening. All other requests to the NameNode are blocked while this is happening.
> One culprit is DecommissionManager. Its monitor holds the FSNamesystem lock during the whole process of checking whether decommissioning DataNodes have finished, during which it checks every block of up to 5 datanodes (the default).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
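
To illustrate the locking pattern described in the issue, here is a minimal Java sketch. It is not the attached patch and does not use real HDFS classes: DecommissionCheckSketch, its lock, and the boolean-array stand-in for a datanode's blocks are all hypothetical. It only contrasts holding one lock across the whole batch of nodes with releasing the lock between nodes, which is one possible way to let other requests in.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the namesystem and its decommission monitor.
// None of these names come from the HDFS code base.
class DecommissionCheckSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int lockHolds = 0; // how many times the lock was acquired

    // A node is done when every one of its blocks is replicated elsewhere.
    private static boolean isNodeDone(boolean[] blocksReplicated) {
        for (boolean replicated : blocksReplicated) {
            if (!replicated) {
                return false;
            }
        }
        return true;
    }

    // Coarse pattern: a single lock hold covers every block of every node,
    // so any other caller waits for the entire scan.
    int checkUnderOneLockHold(List<boolean[]> nodes) {
        int done = 0;
        lock.lock();
        lockHolds++;
        try {
            for (boolean[] node : nodes) {
                if (isNodeDone(node)) {
                    done++;
                }
            }
        } finally {
            lock.unlock();
        }
        return done;
    }

    // Finer pattern (an assumption, not necessarily what the patch does):
    // the lock is released after each node, so waiting callers can run
    // between nodes.
    int checkOneNodePerLockHold(List<boolean[]> nodes) {
        int done = 0;
        for (boolean[] node : nodes) {
            lock.lock();
            lockHolds++;
            try {
                if (isNodeDone(node)) {
                    done++;
                }
            } finally {
                lock.unlock();
            }
        }
        return done;
    }

    int lockHolds() {
        return lockHolds;
    }
}
```

Both methods return the same count of finished nodes; the difference is only in how long each individual lock hold lasts. The finer variant trades more lock acquisitions for shorter pauses seen by other callers.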