hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3070) HDFS balancer doesn't ensure that hdfs-site.xml is loaded
Date Mon, 02 Apr 2012 06:32:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244002#comment-13244002 ]

Aaron T. Myers commented on HDFS-3070:
--------------------------------------

Hi amith, that does seem to be true, though I'm not sure it's a strict requirement that
we support setting only fs.default.name at this point, since the ability to set the
NN address via other configuration settings has existed for several releases. My personal
opinion is that the NN (and balancer, etc.) should never use fs.default.name as
an indicator of the NN service bind address, but only as a client-side URI to use when
a full FS URI is not given. Ideally we would have a deprecation mechanism that warns when
fs.default.name is being used as the desired bind address because the NN address is configured
in no other way, but as it stands our config deprecation system can only show warnings
deprecating one named key in favor of another key.
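
To make the proposal concrete, here is a minimal Python sketch of the lookup order described above. It is not Hadoop code: the function name `resolve_nn_address` is hypothetical, and the two explicit key names (`dfs.namenode.servicerpc-address`, `dfs.namenode.rpc-address`) are assumptions for illustration that do not appear in this message. Only `fs.default.name` comes from the discussion.

```python
import warnings
from urllib.parse import urlparse

# Assumed key names for this sketch; the message itself only names fs.default.name.
EXPLICIT_NN_KEYS = ("dfs.namenode.servicerpc-address", "dfs.namenode.rpc-address")
FALLBACK_KEY = "fs.default.name"


def resolve_nn_address(conf):
    """Return (host_port, source_key) for the NN address, or (None, None).

    `conf` is a plain dict standing in for a loaded configuration.
    """
    # Prefer any explicitly configured NN address.
    for key in EXPLICIT_NN_KEYS:
        value = conf.get(key)
        if value:
            return value, key

    # Fall back to the client-side default URI, and warn that it is being
    # used as the NN address because nothing else was configured.
    uri = conf.get(FALLBACK_KEY)
    if uri:
        parsed = urlparse(uri)
        if parsed.scheme == "hdfs" and parsed.netloc:
            warnings.warn(
                f"{FALLBACK_KEY} is being used as the NN address; "
                "configure the NN address explicitly instead",
                DeprecationWarning,
            )
            return parsed.netloc, FALLBACK_KEY

    return None, None
```

With this ordering, a configuration that sets only `fs.default.name` still resolves but emits a warning, which is the conditional deprecation signal that a simple key-to-key mapping cannot express.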

Regardless, would you like to open a new JIRA to address this issue, amith?
                
> HDFS balancer doesn't ensure that hdfs-site.xml is loaded
> ---------------------------------------------------------
>
>                 Key: HDFS-3070
>                 URL: https://issues.apache.org/jira/browse/HDFS-3070
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 2.0.0
>            Reporter: Stephen Chu
>            Assignee: Aaron T. Myers
>             Fix For: 2.0.0
>
>         Attachments: HDFS-3070.patch, unbalanced_nodes.png, unbalanced_nodes_inservice.png
>
>
> I TeraGenerated data into DataNodes styx01 and styx02. Looking at the web UI, both have over 3% disk usage.
> Attached is a screenshot of the Live Nodes web UI.
> On styx01, I run the _hdfs balancer_ command with threshold 1% and don't see the blocks being balanced across all 4 datanodes (all blocks on styx01 and styx02 stay put).
> HA is currently enabled.
> [schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
> active
> [schu@styx01 ~]$ hdfs balancer -threshold 1
> 12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
> 12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
> 12/03/08 10:10:32 INFO balancer.Balancer: p         = Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
> Balancing took 95.0 milliseconds
> [schu@styx01 ~]$ 
> I believe with a threshold of 1% the balancer should trigger blocks being moved across DataNodes, right? I am curious about the "namenodes = []" from the above output.
> [schu@styx01 ~]$ hadoop version
> Hadoop 0.24.0-SNAPSHOT
> Subversion git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common -r f6a577d697bbcd04ffbc568167c97b79479ff319
> Compiled by schu on Thu Mar  8 15:32:50 PST 2012
> From source with checksum ec971a6e7316f7fbf471b617905856b8
> From http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html:
> The threshold parameter is a fraction in the range of (0%, 100%) with a default value
> of 10%. The threshold sets a target for whether the cluster is balanced. A cluster is balanced
> if for each datanode, the utilization of the node (ratio of used space at the node to total
> capacity of the node) differs from the utilization of the cluster (ratio of used space in the
> cluster to total capacity of the cluster) by no more than the threshold value. The smaller the
> threshold, the more balanced a cluster will become. It takes more time to run the balancer for
> small threshold values. Also for a very small threshold the cluster may not be able to reach
> the balanced state when applications write and delete files concurrently.
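
The balance criterion quoted above can be sketched in a few lines of Python. This is only an illustration of the documented definition, not the Balancer's implementation; the function name `is_balanced` and the byte counts in the usage note are made up for the example.

```python
def is_balanced(nodes, threshold_pct):
    """Check the documented balance criterion.

    `nodes` is a list of (used_bytes, capacity_bytes) pairs, one per datanode,
    each with capacity > 0. `threshold_pct` is a percentage, e.g. 1.0 for 1%.
    """
    total_used = sum(used for used, _ in nodes)
    total_capacity = sum(capacity for _, capacity in nodes)
    if total_capacity <= 0:
        raise ValueError("cluster capacity must be positive")

    # Cluster utilization: used space in the cluster over total capacity.
    cluster_util = 100.0 * total_used / total_capacity

    # Balanced if every node's utilization is within the threshold of it.
    return all(
        abs(100.0 * used / capacity - cluster_util) <= threshold_pct
        for used, capacity in nodes
    )
```

For example, with four equally sized datanodes where two are 3% full and two are empty (hypothetical numbers loosely modelled on the report), cluster utilization is 1.5%, so `is_balanced([(3, 100), (3, 100), (0, 100), (0, 100)], 1.0)` is `False` while the same cluster passes at the default threshold of 10. Under this definition a 1% threshold should have left work for the balancer to do.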


        
