From: "Suresh Srinivas (JIRA)"
To: core-dev@hadoop.apache.org
Date: Thu, 16 Oct 2008 11:41:44 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-4430) Namenode Web UI capacity report is inconsistent with Balancer

    [ https://issues.apache.org/jira/browse/HADOOP-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640259#action_12640259 ]

Suresh Srinivas commented on HADOOP-4430:
-----------------------------------------

Proposed solution:
- The definition of "Configured Capacity" from 2816 will be retained.
- "DFS Used %" will be changed from (DFS Used)/(Present Capacity) to (DFS Used)/(Configured Capacity).
- "Present Capacity", introduced in 2816, is the same as "Configured Capacity" as long as the temporary files generated by MapReduce do not take more than the reserved space. When the temporary files use more than the reserved space, "Present Capacity" is reduced correspondingly. With this change, "Present Capacity" is removed from the report. Instead, the space that temporary files use beyond the reserved space is reported as "Non DFS Used".
- A new "DFS Remaining %" will be added to explicitly indicate the percentage of space remaining for DFS use.
- Currently a percentage factor, defined by "dfs.datanode.du.pct", is applied to the actual remaining space to calculate DFS Remaining. This does not serve any purpose (see the comments in 2816) and will be removed.

Here are the definitions of the data reported on the Web UI:
Configured Capacity: (Disk space corresponding to all the data directories) - (Reserved space as defined by dfs.datanode.du.reserved)
DFS Used: Space used by DFS
Non DFS Used: 0 if the temporary files do not exceed the reserved space. Otherwise, this is the size by which the temporary files exceed the reserved space and encroach on the DFS configured space.
DFS Remaining: (Configured Capacity - DFS Used - Non DFS Used)
DFS Used %: (DFS Used / Configured Capacity) * 100
DFS Remaining %: (DFS Remaining / Configured Capacity) * 100

> Namenode Web UI capacity report is inconsistent with Balancer
> -------------------------------------------------------------
>
>                 Key: HADOOP-4430
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4430
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.19.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.19.0
>
>
> Solution to 2816 changed
> - Total Capacity definition from (the disk space of all the data directories) to (the disk space of all the data directories - the reserved space)
> - We added a new element, Present Capacity, to the report. It is set to (Used Capacity + Remaining Capacity)
> - We changed the Used Percentage reported from (Used Capacity)/(Total Capacity) to (Used Capacity)/(Present Capacity)
> - All these changes are displayed on the Namenode Web UI.
> Balancer functionality
> The balancer script is started with a threshold parameter. It tries to move blocks from the nodes whose Used % is more than (Cluster average + threshold) to the nodes whose Used % is less than (Cluster average - threshold). Essentially, the balancer brings every datanode's Used % to within (Cluster average +/- threshold).
> Inconsistencies due to the change in 2816
> When MapReduce jobs are run, temporary files are generated. These eat away a lot of space from Present Capacity, so the difference between the Total Capacity and the Present Capacity can be huge. Currently the balancer computes Used Percentage based on (Used Capacity)/(Total Capacity). The Used % the balancer uses could therefore be significantly different from the Used % displayed on the Namenode Web UI. When the balancer is done balancing, the Namenode Used % might still appear unbalanced.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
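
The definitions proposed in the comment above can be checked with a little arithmetic. The following is only a sketch in Python, not Hadoop code: the names `capacity_report`, `CapacityReport`, and `old_ui_used_pct`, and the sample numbers, are made up for illustration. The "old" percentage follows the issue description, (Used Capacity)/(Present Capacity) with Present Capacity = Used + Remaining, and ignores the dfs.datanode.du.pct factor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityReport:
    """Hypothetical container for the values listed in the proposal."""
    configured_capacity: int
    dfs_used: int
    non_dfs_used: int
    dfs_remaining: int
    dfs_used_pct: float
    dfs_remaining_pct: float


def capacity_report(raw_disk, reserved, dfs_used, temp_files):
    """Apply the proposed definitions to one node (all sizes in the same unit).

    raw_disk   -- disk space of all the data directories
    reserved   -- reserved space (dfs.datanode.du.reserved)
    dfs_used   -- space used by DFS
    temp_files -- space used by temporary (non-DFS) files
    """
    configured = raw_disk - reserved
    if configured <= 0:
        raise ValueError("reserved space must be smaller than the raw disk space")
    # Non DFS Used is 0 until the temporary files outgrow the reserved space.
    non_dfs_used = max(0, temp_files - reserved)
    remaining = configured - dfs_used - non_dfs_used
    return CapacityReport(
        configured_capacity=configured,
        dfs_used=dfs_used,
        non_dfs_used=non_dfs_used,
        dfs_remaining=remaining,
        dfs_used_pct=100.0 * dfs_used / configured,
        dfs_remaining_pct=100.0 * remaining / configured,
    )


def old_ui_used_pct(report):
    """Used % as shown after 2816: Used / Present, with Present = Used + Remaining."""
    present_capacity = report.dfs_used + report.dfs_remaining
    return 100.0 * report.dfs_used / present_capacity


if __name__ == "__main__":
    # Illustrative numbers only: temporary files exceed the reserved space by 150.
    report = capacity_report(raw_disk=1000, reserved=100, dfs_used=300, temp_files=250)
    print(report)
    print("proposed DFS Used %:", round(report.dfs_used_pct, 1))
    print("post-2816 UI Used %:", round(old_ui_used_pct(report), 1))
```

With these sample numbers the proposed DFS Used % is about 33.3, which is the figure the balancer would also compute, while the post-2816 Web UI figure would be 40.0. That gap is the inconsistency the issue describes.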