hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7068) Support multiple block placement policies
Date Wed, 11 Mar 2015 14:25:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356926#comment-14356926 ]

Kai Zheng commented on HDFS-7068:

Looking at your 3 options, I'm wondering if we could do it in a lighter way. In my understanding,
if the file is in replication mode, as it is by default, then we use the current block placement
policy exactly as it works in trunk today; otherwise, if striping and/or EC is involved, we use
a single new customized placement policy that covers all the related cases. This new
placement policy would use the extended storage policy and the associated EC schema info to
implement the concrete placement logic. In this initial phase, we might not need to create and
configure a separate placement policy for each erasure code. The basic idea should be enough: we
just try to place parity blocks on different racks or nodes, whatever the erasure code is. When
appropriate, with more input, we can enhance the new placement policy later. As discussed
in HDFS-7613, we implement the RS code by default. Please ignore the XOR stuff, as it's just for testing.
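
To make the proposal concrete, here is a minimal sketch of the two-way split: replicated files keep
the existing policy, and every striped/EC file goes through one shared policy that spreads blocks
across racks. All names here (PlacementSketch, Node, Policy, select) are hypothetical stand-ins and
not the real BlockPlacementPolicy API; the sketch also spreads every block of a group, not only
parity blocks, which is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PlacementSketch {
    /** Hypothetical stand-in for a datanode: a name plus the rack it sits in. */
    static final class Node {
        final String name;
        final String rack;
        Node(String name, String rack) { this.name = name; this.rack = rack; }
    }

    /** Hypothetical policy interface; NOT the real HDFS BlockPlacementPolicy. */
    interface Policy {
        List<Node> chooseTargets(List<Node> candidates, int count);
    }

    /** Stand-in for the existing replication policy: take the first nodes offered. */
    static final Policy DEFAULT = (candidates, count) ->
        new ArrayList<>(candidates.subList(0, Math.min(count, candidates.size())));

    /** One policy for all striped/EC files: prefer unused racks, then fill up. */
    static final Policy STRIPED = (candidates, count) -> {
        List<Node> chosen = new ArrayList<>();
        Set<String> usedRacks = new HashSet<>();
        // First pass: take at most one node per rack.
        for (Node n : candidates) {
            if (chosen.size() < count && usedRacks.add(n.rack)) {
                chosen.add(n);
            }
        }
        // Second pass: reuse racks only if there were not enough distinct ones.
        for (Node n : candidates) {
            if (chosen.size() < count && !chosen.contains(n)) {
                chosen.add(n);
            }
        }
        return chosen;
    };

    /** The "lighter way": one branch on the file's mode, no per-code policies. */
    static Policy select(boolean stripedOrEc) {
        return stripedOrEc ? STRIPED : DEFAULT;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("a", "r1"), new Node("b", "r1"),
            new Node("c", "r2"), new Node("d", "r3"));
        for (Node n : select(true).chooseTargets(nodes, 3)) {
            System.out.println(n.name + " on " + n.rack);
        }
    }
}
```

Because the branch depends only on whether the file is striped/EC, adding a new erasure code later
would not require a new policy class in this sketch; only the shared striped policy would be refined.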

> Support multiple block placement policies
> -----------------------------------------
>                 Key: HDFS-7068
>                 URL: https://issues.apache.org/jira/browse/HDFS-7068
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.5.1
>            Reporter: Zesheng Wu
>            Assignee: Walter Su
>         Attachments: HDFS-7068.patch
> According to the code, the current implementation of HDFS supports only one specific type
> of block placement policy, which is BlockPlacementPolicyDefault by default.
> The default policy is sufficient in most circumstances, but in some special circumstances
> it does not work so well.
> For example, on a shared cluster, we want to erasure encode all the files under some
> specified directories. So the files under these directories need to use a new placement policy.
> But at the same time, other files still use the default placement policy. Here we need
> to support multiple placement policies for HDFS.
> One plain thought is that the default placement policy is still configured as the default.
> On the other hand, HDFS can let the user specify a customized placement policy through
> extended attributes (xattr). When HDFS chooses the replica targets, it first checks for a
> customized placement policy; if none is specified, it falls back to the default one.
> Any thoughts?
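
The xattr fallback described in the quoted issue can be sketched as below. This is only an
illustration under stated assumptions: the xattr key, the XattrPolicyLookup class, and the in-memory
map are all hypothetical, and having a file inherit the policy from the nearest ancestor directory
is an assumption that the issue text does not spell out.

```java
import java.util.HashMap;
import java.util.Map;

public class XattrPolicyLookup {
    // Hypothetical xattr key; the issue does not name one.
    static final String POLICY_XATTR = "user.block.placement.policy";
    static final String DEFAULT_POLICY = "BlockPlacementPolicyDefault";

    // In-memory stand-in for the namespace: path -> its extended attributes.
    private final Map<String, Map<String, String>> xattrs = new HashMap<>();

    void setXattr(String path, String key, String value) {
        xattrs.computeIfAbsent(path, p -> new HashMap<>()).put(key, value);
    }

    /**
     * Walk from the path up to the root. The first entry that carries the
     * policy xattr wins; if none does, fall back to the default policy.
     */
    String resolvePolicy(String path) {
        String p = path;
        while (p != null && !p.isEmpty()) {
            Map<String, String> attrs = xattrs.get(p);
            if (attrs != null && attrs.containsKey(POLICY_XATTR)) {
                return attrs.get(POLICY_XATTR);
            }
            int slash = p.lastIndexOf('/');
            if (slash > 0) {
                p = p.substring(0, slash);   // move to the parent directory
            } else {
                p = p.equals("/") ? null : "/"; // check the root once, then stop
            }
        }
        return DEFAULT_POLICY;
    }

    public static void main(String[] args) {
        XattrPolicyLookup lookup = new XattrPolicyLookup();
        lookup.setXattr("/ec", POLICY_XATTR, "MyEcPolicy"); // hypothetical policy name
        System.out.println(lookup.resolvePolicy("/ec/data/file1"));
        System.out.println(lookup.resolvePolicy("/other/file2"));
    }
}
```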

This message was sent by Atlassian JIRA