hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10289) Balancer configures DNs directly
Date Thu, 14 Apr 2016 15:42:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241375#comment-15241375 ]

Anu Engineer commented on HDFS-10289:
-------------------------------------

As part of the diskBalancer work, we have added an API that might be useful to you. The
HDFS-1312 branch has a DN RPC that lets you query generic properties from a DataNode. Please
look at HDFS-9647 if you are interested. If you find this API useful, you are most welcome
to use the HDFS-1312 branch to develop this feature.

Unfortunately, the API is named getDiskBalancerSetting (or DiskBalancerSettingRequestProto).
You might want to rename that call to something like getDatanodeSetting to make it generic.
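
To illustrate the rename suggested above, here is a minimal Java sketch of what a generic "query one setting by key" call could look like. Everything in it is an assumption made for illustration: the interface `DatanodeSettingQuery`, the class `InMemoryDatanode` and the method `putSetting` are hypothetical names, and none of this is the actual code in the HDFS-1312 branch or HDFS-9647. The in-memory map merely stands in for a DataNode so the example runs without a cluster.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: these types are NOT the HDFS-1312 / HDFS-9647 API.
// They illustrate the shape of a generic "get setting by key" DataNode call.
interface DatanodeSettingQuery {
    // Returns the value currently held for the key, or empty if the key is unknown.
    Optional<String> getDatanodeSetting(String key);
}

// In-memory stand-in for a DataNode, so the sketch runs without a cluster.
class InMemoryDatanode implements DatanodeSettingQuery {
    private final Map<String, String> settings = new ConcurrentHashMap<>();

    // Hypothetical helper used to seed values for the demo.
    void putSetting(String key, String value) {
        settings.put(key, value);
    }

    @Override
    public Optional<String> getDatanodeSetting(String key) {
        return Optional.ofNullable(settings.get(key));
    }
}

class SettingQueryDemo {
    public static void main(String[] args) {
        InMemoryDatanode dn = new InMemoryDatanode();
        dn.putSetting("dfs.datanode.balance.bandwidthPerSec", "10485760");

        // A key-based call is not tied to the disk balancer by name or by type.
        System.out.println(
            dn.getDatanodeSetting("dfs.datanode.balance.bandwidthPerSec").orElse("<unset>"));
        System.out.println(dn.getDatanodeSetting("no.such.key").orElse("<unset>"));
    }
}
```

Returning an `Optional` here is a design choice of the sketch, so that a caller such as the Balancer can tell an unknown key apart from an empty value.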


> Balancer configures DNs directly
> --------------------------------
>
>                 Key: HDFS-10289
>                 URL: https://issues.apache.org/jira/browse/HDFS-10289
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>    Affects Versions: 2.6.0
>            Reporter: John Zhuge
>            Assignee: John Zhuge
>            Priority: Critical
>
> Balancer directly configures the 2 balance-related properties (bandwidthPerSec and concurrentMoves)
> on the DNs involved.
> Details:
> * Before each balancing iteration, set the properties on all DNs involved in the current
>   iteration.
> * The DN property changes will not survive restart.
> * Balancer gets the property values from command line or its config file.
> * Need new DN APIs to query and set the 2 properties.
> * No need to edit the config file on each DN or run {{hdfs dfsadmin -setBalancerBandwidth}}
>   to configure every DN in the cluster.
> Pros:
> * Improve ease of use because all configurations are done at one place, the balancer.
>   We saw many customers often forgot to set concurrentMoves properly since it is required on
>   both DN and Balancer.
> * Support new DNs added between iterations
> * Handle DN restarts between iterations
> * May be able to dynamically adjust the thresholds in different iterations. Don't know
>   how useful though.
> Cons:
> * New DN property API
> * A malicious/misconfigured balancer may overwhelm DNs. {{hdfs dfsadmin -setBalancerBandwidth}}
>   has the same issue. Also Balancer can only be run by admin.
> Questions:
> * Can we create {{BalancerConcurrentMovesCommand}} similar to {{BalancerBandwidthCommand}}?
>   Can Balancer use them directly without going through NN?
> One proposal to implement HDFS-7466 calls for an API to query DN properties. DN Conf
> Servlet returns all config properties. It does not return individual property and it does
> not return the value set by {{hdfs dfsadmin -setBalancerBandwidth}}.
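
The per-iteration behaviour described in the quoted issue can be sketched roughly as follows. This is a toy model under stated assumptions, not Balancer or DataNode code: `SketchDatanode`, `SketchBalancer`, `setProperty`, `getProperty` and `configureInvolvedDatanodes` are hypothetical names, the property keys are the short names used in the description, and the values are arbitrary. It only shows why re-applying the two properties before every iteration covers both restarted DNs and DNs added between iterations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: models the proposal with in-memory objects, not real HDFS classes.
class SketchDatanode {
    private final Map<String, String> runtimeOverrides = new HashMap<>();

    // Stand-in for the "new DN API to set a property" the issue asks for.
    void setProperty(String key, String value) {
        runtimeOverrides.put(key, value);
    }

    // Stand-in for the matching query API; null means "no runtime override".
    String getProperty(String key) {
        return runtimeOverrides.get(key);
    }

    // Per the issue, DN property changes do not survive a restart.
    void restart() {
        runtimeOverrides.clear();
    }
}

class SketchBalancer {
    // Values the balancer read from its command line or config file.
    private final Map<String, String> balancerProps;

    SketchBalancer(Map<String, String> balancerProps) {
        this.balancerProps = new HashMap<>(balancerProps);
    }

    // Called before each balancing iteration with the DNs involved in that iteration.
    void configureInvolvedDatanodes(List<SketchDatanode> involved) {
        for (SketchDatanode dn : involved) {
            for (Map.Entry<String, String> e : balancerProps.entrySet()) {
                dn.setProperty(e.getKey(), e.getValue());
            }
        }
    }
}

class BalancerSketchDemo {
    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("bandwidthPerSec", "10485760"); // arbitrary example value
        props.put("concurrentMoves", "50");       // arbitrary example value
        SketchBalancer balancer = new SketchBalancer(props);

        List<SketchDatanode> involved = new ArrayList<>();
        SketchDatanode dn1 = new SketchDatanode();
        involved.add(dn1);

        balancer.configureInvolvedDatanodes(involved); // iteration 1
        dn1.restart();                                 // overrides are lost here
        involved.add(new SketchDatanode());            // a DN added between iterations
        balancer.configureInvolvedDatanodes(involved); // iteration 2 re-applies everything

        for (SketchDatanode dn : involved) {
            System.out.println(dn.getProperty("concurrentMoves"));
        }
    }
}
```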



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
