hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7967) Reduce the performance impact of the balancer
Date Tue, 10 Jan 2017 15:12:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815227#comment-15815227 ]

Kihwal Lee commented on HDFS-7967:

I have reviewed the patch and am fine with it. We have used a variant of this for over a year,
and this is the latest improved version.

We can commit this to branch-2 and branch-2.8 now, but then we will likely forget about the
remaining work and move on. So I think we need to discuss what we are going to do for trunk.
[~daryn], would you share your thoughts and concerns on the state of trunk and possible solutions?

> Reduce the performance impact of the balancer
> ---------------------------------------------
>                 Key: HDFS-7967
>                 URL: https://issues.apache.org/jira/browse/HDFS-7967
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7967-branch-2.8.patch, HDFS-7967-branch-2.patch, HDFS-7967.branch-2-1.patch,
> HDFS-7967.branch-2.001.patch, HDFS-7967.branch-2.002.patch, HDFS-7967.branch-2.8-1.patch,
> HDFS-7967.branch-2.8.001.patch, HDFS-7967.branch-2.8.002.patch
>
> The balancer needs to query for blocks to move from overly full DNs. The block lookup
> is extremely inefficient. An iterator of the node's blocks is created from the iterators
> of its storages' blocks. A random number is chosen corresponding to how many blocks will
> be skipped via the iterator. Each skip requires costly scanning of triplets.
> The current design also considers only imbalances between nodes and ignores imbalances within
> a node's storages. A more efficient and intelligent design may eliminate the costly skipping
> of blocks via round-robin selection of blocks from the storages based on remaining capacity.
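
To illustrate the round-robin idea in the description above, here is a minimal Java sketch. It is not taken from the attached patches: the class RoundRobinBlockPicker, the Storage stand-in, and the pick method are all hypothetical names, and blocks are modeled as plain strings. It assumes "based on remaining capacity" means visiting the fullest storages first, which is only one possible reading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; none of these names come from the HDFS code base. */
public class RoundRobinBlockPicker {

    /** Stand-in for one storage of a DataNode: remaining bytes plus its block IDs. */
    public static final class Storage {
        final long remainingBytes;
        final List<String> blockIds;

        public Storage(long remainingBytes, List<String> blockIds) {
            this.remainingBytes = remainingBytes;
            this.blockIds = blockIds;
        }
    }

    /**
     * Picks up to maxBlocks block IDs. Storages are visited in order of least
     * remaining capacity (an assumption of this sketch), and one block is taken
     * from each storage per round, so no blocks are skipped over.
     */
    public static List<String> pick(List<Storage> storages, int maxBlocks) {
        List<Storage> ordered = new ArrayList<>(storages);
        ordered.sort(Comparator.comparingLong(s -> s.remainingBytes));

        // One iterator per storage; each is only ever advanced by one step at a time.
        List<Iterator<String>> iterators = new ArrayList<>();
        for (Storage s : ordered) {
            iterators.add(s.blockIds.iterator());
        }

        List<String> picked = new ArrayList<>();
        boolean progressed = true;
        while (picked.size() < maxBlocks && progressed) {
            progressed = false;
            for (Iterator<String> it : iterators) {
                if (picked.size() >= maxBlocks) {
                    break;
                }
                if (it.hasNext()) {
                    picked.add(it.next());
                    progressed = true;
                }
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        List<Storage> storages = Arrays.asList(
            new Storage(500L, Arrays.asList("b1", "b2")),
            new Storage(100L, Arrays.asList("a1", "a2", "a3")));
        System.out.println(pick(storages, 4)); // prints [a1, b1, a2, b2]
    }
}
```

Each call costs time proportional to the number of blocks returned plus the sort over the storages, in contrast to skipping a random number of blocks through a combined iterator. A real implementation would also have to remember where each storage's iterator stopped between calls, which this sketch leaves out.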
