hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer
Date Wed, 04 Jan 2017 21:27:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15799394#comment-15799394 ]

Daryn Sharp commented on HDFS-7967:
-----------------------------------

Removing the stale 2.8 patch (based on an earlier version); will repost shortly.

> Reduce the performance impact of the balancer
> ---------------------------------------------
>
>                 Key: HDFS-7967
>                 URL: https://issues.apache.org/jira/browse/HDFS-7967
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7967-branch-2.patch
>
>
> The balancer needs to query for blocks to move from overly full DNs.  The block lookup
> is extremely inefficient.  An iterator of the node's blocks is created from the iterators
> of its storages' blocks.  A random number is chosen corresponding to how many blocks will
> be skipped via the iterator.  Each skip requires costly scanning of triplets.
> The current design also only considers node imbalances while ignoring imbalances within
> the node's storages.  A more efficient and intelligent design may eliminate the costly
> skipping of blocks by selecting blocks round-robin from the storages based on remaining
> capacity.
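
The round-robin idea in the description above could look roughly like the following. This is only an illustrative sketch, not the patch attached to HDFS-7967 and not NameNode code: the class RoundRobinBlockPicker, the Storage type, and pickBlocks are hypothetical names, and "based on remaining capacity" is interpreted here as visiting the fullest storage first, which is an assumption. Each storage keeps its own cursor, so no blocks are skipped.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these names come from the HDFS code base.
final class RoundRobinBlockPicker {

    /** Stand-in for one storage on a DataNode (hypothetical type). */
    static final class Storage {
        final String id;
        final long remainingBytes;
        final List<Long> blockIds;

        Storage(String id, long remainingBytes, List<Long> blockIds) {
            this.id = id;
            this.remainingBytes = remainingBytes;
            this.blockIds = blockIds;
        }
    }

    /**
     * Picks up to maxBlocks block IDs. Storages are visited in order of
     * ascending remaining capacity (fullest first, an assumed policy),
     * and one block is taken from each storage per round.
     */
    static List<Long> pickBlocks(List<Storage> storages, int maxBlocks) {
        List<Storage> ordered = new ArrayList<>(storages);
        ordered.sort(Comparator.comparingLong((Storage s) -> s.remainingBytes));

        // One independent cursor per storage; nothing is ever skipped.
        List<Iterator<Long>> cursors = new ArrayList<>();
        for (Storage s : ordered) {
            cursors.add(s.blockIds.iterator());
        }

        List<Long> picked = new ArrayList<>();
        boolean progressed = true;
        while (picked.size() < maxBlocks && progressed) {
            progressed = false;
            for (Iterator<Long> cursor : cursors) {
                if (picked.size() >= maxBlocks) {
                    break;
                }
                if (cursor.hasNext()) {
                    picked.add(cursor.next());
                    progressed = true;
                }
            }
        }
        return picked;
    }
}
```

In this sketch every selected block costs one iterator step, whereas the random-skip approach described above pays for each skipped block as well.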



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


