hadoop-common-dev mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4430) Namenode Web UI capacity report is inconsistent with Balancer
Date Fri, 17 Oct 2008 20:59:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640662#action_12640662 ]

Hairong Kuang commented on HADOOP-4430:
---------------------------------------

1. DatanodeInfo.append() line 190: u should be nonDFSUsed.
2. FSNamesystem.getCapacityUsedNonDFS() line 3306: the return should be outside the synchronized block.
3. FSNamesystem.getCapacityRemainingPercent() line 3324: the calculation is not consistent with that in DatanodeInfo.getRemainingPercent().

> Namenode Web UI capacity report is inconsistent with Balancer
> -------------------------------------------------------------
>
>                 Key: HADOOP-4430
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4430
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.19.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4430.patch
>
>
> The solution to 2816 made the following changes:
> - The definition of Total Capacity changed from (the disk space of all data directories) to (the disk space of all data directories - the reserved space).
> - A new element, Present Capacity, was added to the report. It is set to (Used Capacity + Remaining Capacity).
> - The reported Used Percentage changed from (Used Capacity)/(Total Capacity) to (Used Capacity)/(Present Capacity).
> - All of these changes are displayed on the Namenode Web UI.
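
The three definitions in the list above can be written out as a minimal Java sketch. The class name, method names, and parameters are hypothetical and are not taken from the Hadoop source; all values are assumed to be byte counts, and the zero-capacity guard is a choice made for this sketch only.

```java
public class CapacityReport {
    // Total Capacity after 2816: disk space of all data directories minus reserved space.
    static long totalCapacity(long dataDirDiskSpace, long reservedSpace) {
        return dataDirDiskSpace - reservedSpace;
    }

    // Present Capacity: Used Capacity + Remaining Capacity.
    static long presentCapacity(long usedCapacity, long remainingCapacity) {
        return usedCapacity + remainingCapacity;
    }

    // Used Percentage after 2816: (Used Capacity)/(Present Capacity).
    // Returning 0.0 when there is no present capacity is an assumption of this sketch.
    static double usedPercent(long usedCapacity, long presentCapacity) {
        if (presentCapacity <= 0) {
            return 0.0;
        }
        return usedCapacity * 100.0 / presentCapacity;
    }

    public static void main(String[] args) {
        long present = presentCapacity(300L, 300L);
        System.out.println("Total capacity:   " + totalCapacity(1000L, 100L));
        System.out.println("Present capacity: " + present);
        System.out.println("Used %:           " + usedPercent(300L, present));
    }
}
```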
> Balancer functionality
> The Balancer script is started with a threshold parameter. It tries to move blocks from nodes whose Used % is greater than (Cluster average + threshold) to nodes whose Used % is less than (Cluster average - threshold). Essentially, the balancer brings every datanode's Used % to within (Cluster average +/- threshold).
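
The threshold rule described above can be sketched as a simple classification. This is only an illustration, not the Balancer's actual code: the class, enum, and method names are hypothetical, and the cluster average is passed in as a parameter because the description does not say how it is computed.

```java
public class BalancerSketch {
    // Hypothetical labels for the three cases in the description.
    enum Status { OVER_UTILIZED, UNDER_UTILIZED, WITHIN_THRESHOLD }

    // Classify one datanode by its Used % relative to (cluster average +/- threshold).
    static Status classify(double usedPercent, double clusterAverage, double threshold) {
        if (usedPercent > clusterAverage + threshold) {
            return Status.OVER_UTILIZED;   // candidate source of blocks
        }
        if (usedPercent < clusterAverage - threshold) {
            return Status.UNDER_UTILIZED;  // candidate target for blocks
        }
        return Status.WITHIN_THRESHOLD;    // already balanced
    }

    public static void main(String[] args) {
        double clusterAverage = 50.0;
        double threshold = 10.0;
        double[] usedPercents = {75.0, 55.0, 35.0};
        for (double p : usedPercents) {
            System.out.println(p + " -> " + classify(p, clusterAverage, threshold));
        }
    }
}
```

Under this rule the balancer is finished once every node reports WITHIN_THRESHOLD, so the result depends entirely on which Used % definition is fed in.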
> Inconsistencies due to the change in 2816
> When MapReduce jobs are run, temporary files are generated. These take a lot of space away from Present Capacity, so the difference between the Total Capacity and the Present Capacity can be huge. Currently the balancer computes Used Percentage as (Used Capacity)/(Total Capacity). The Used % the balancer uses can therefore differ significantly from the Used % displayed on the Namenode Web UI. When the balancer is done balancing, the Used % shown on the Namenode Web UI might still appear unbalanced.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

