hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-12008) Improve the available-space block placement policy
Date Thu, 22 Jun 2017 23:19:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Anu Engineer updated HDFS-12008:
    Attachment: RandomAllocationPolicy.png

[~kihwal] Thank you for posting the results of your testing. Since you were wondering about
the algorithm, I am attaching a screenshot from the SPLAD paper that discusses how this
works in real life.

The first paper studies the effect of this policy on long-term data storage. I am not claiming
it applies directly to the available-space block placement policy; it is more an illustration
of what "double random" gives us. The second paper explores the idea of "Double Random choice".

Here are the original papers that talk about the effect of this kind of allocation, and a
survey paper that talks about this technique.

1. SPLAD: scattering and placing data replicas to enhance long-term durability 

2. The Power of Two Random Choices: A Survey of Techniques and Results

We use very similar algorithms in Ozone -- but at the cluster scale, so we are not forced to
run the balancer all the time.
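
A minimal sketch of the "two random choices" idea referred to above, for illustration only.
It is not the HDFS AvailableSpaceBlockPlacementPolicy code, not the Ozone implementation, and
not taken from either paper. The class and method names (TwoChoicePlacement, Node, pickOne,
pickBestOfTwo, spread) are hypothetical, and a node is reduced to a single "blocks used"
counter. The sketch places blocks either on one uniformly random node or on the less-used of
two random nodes, then reports the gap between the fullest and emptiest node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class TwoChoicePlacement {
    /** Hypothetical stand-in for a datanode: it only tracks how many blocks it holds. */
    static final class Node {
        final int id;
        long used;
        Node(int id) { this.id = id; }
    }

    /** Baseline: one uniformly random node. */
    static Node pickOne(List<Node> nodes, Random rng) {
        return nodes.get(rng.nextInt(nodes.size()));
    }

    /** Two random choices: sample two nodes and keep the less-used one (ties go to the first). */
    static Node pickBestOfTwo(List<Node> nodes, Random rng) {
        Node a = pickOne(nodes, rng);
        Node b = pickOne(nodes, rng);
        return a.used <= b.used ? a : b;
    }

    /** Places the given number of blocks and returns (max load - min load) across nodes. */
    static long spread(int nodeCount, int blocks, boolean twoChoices, long seed) {
        Random rng = new Random(seed);
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new Node(i));
        }
        for (int i = 0; i < blocks; i++) {
            Node target = twoChoices ? pickBestOfTwo(nodes, rng) : pickOne(nodes, rng);
            target.used++;
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (Node n : nodes) {
            min = Math.min(min, n.used);
            max = Math.max(max, n.used);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Same seed and workload for both strategies; only the selection rule differs.
        System.out.println("single random spread: " + spread(100, 100_000, false, 1L));
        System.out.println("two-choice spread:    " + spread(100, 100_000, true, 1L));
    }
}
```

In this toy model the two-choice run should show a much smaller spread than the single-choice
run. A real placement policy also has to account for rack constraints, excluded nodes, and
nodes of different capacity, none of which are modeled here.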

> Improve the available-space block placement policy
> --------------------------------------------------
>                 Key: HDFS-12008
>                 URL: https://issues.apache.org/jira/browse/HDFS-12008
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: block placement
>    Affects Versions: 2.8.1
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-12008.patch, RandomAllocationPolicy.png
> AvailableSpaceBlockPlacementPolicy currently picks two nodes unconditionally, then picks
> one node. It could avoid picking the second node when not necessary.

This message was sent by Atlassian JIRA

