Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-dev@hadoop.apache.org
Message-ID: <715800449.1227636464329.JavaMail.jira@brutus>
Date: Tue, 25 Nov 2008 10:07:44 -0800 (PST)
From: "Tsz Wo (Nicholas), SZE (JIRA)" <jira@apache.org>
To: core-dev@hadoop.apache.org
Subject: [jira] Updated: (HADOOP-4061) Large number of decommission freezes
 the Namenode
In-Reply-To: <930258889.1220465144195.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HADOOP-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-4061:
-------------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Incompatible change, Reviewed]  (was: [Reviewed, Incompatible change])
          Status: Resolved  (was: Patch Available)

I just committed this.

> Large number of decommission freezes the Namenode
> -------------------------------------------------
>
>                 Key: HADOOP-4061
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4061
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.2
>            Reporter: Koji Noguchi
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: 4061_20081119.patch, 4061_20081120.patch, 4061_20081120b.patch, 4061_20081123.patch, 4061_20081124.patch, 4061_20081124b.patch, 4061_20081124c.patch, 4061_20081124c_0.18.patch, HADOOP-4061.patch
>
>
> On 1900 nodes cluster, we tried decommissioning 400 nodes with 30k blocks each. Other 1500 nodes were almost empty.
> When decommission started, namenode's queue overflowed every 6 minutes.
> Looking at the cpu usage,  it showed that every 5 minutes org.apache.hadoop.dfs.FSNamesystem$DecommissionedMonitor thread was taking 100% of the CPU for 1 minute causing the queue to overflow.
> {noformat}
>   public synchronized void decommissionedDatanodeCheck() {
>     for (Iterator<DatanodeDescriptor> it = datanodeMap.values().iterator();
>          it.hasNext();) {
>       DatanodeDescriptor node = it.next();
>       checkDecommissionStateInternal(node);
>     }
>   }
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.