hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8823) Move replication factor into individual blocks
Date Fri, 31 Jul 2015 23:47:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650013#comment-14650013 ]

Zhe Zhang commented on HDFS-8823:
---------------------------------

Thanks Haohui for the pointers; they are very helpful.

I commented on {{storagePolicy}} because, if we plan to store it in BM too, the combined
memory overhead ({{rep factor}} + {{storagePolicy}}) probably won't be absorbed as easily by
alignment. I also don't think we'll end up with {{rep factor}} in BM but not {{storagePolicy}}
(please correct me if I'm wrong); it looks like BM needs both pieces of information to make
correct placement decisions.

Given that the majority of blocks will have the default {{rep factor}} and {{storagePolicy}},
maybe we can deduplicate them. For example, create a {{CustomizedBlockPolicies}} feature
class and add it to a {{BlockInfo}} only when its policies are customized.
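
A minimal sketch of what that deduplication could look like, not a proposal for the actual patch. Only the names {{CustomizedBlockPolicies}} and {{BlockInfo}} come from the suggestion above; {{BlockInfoSketch}}, the default values, and all method names are hypothetical stand-ins, and the real {{BlockInfo}} class is not modeled here.

```java
// Hypothetical, simplified stand-ins; these are NOT the real HDFS classes.

/** Holds per-block policies, allocated only for blocks that differ from the defaults. */
final class CustomizedBlockPolicies {
    final short replication;
    final byte storagePolicyId;

    CustomizedBlockPolicies(short replication, byte storagePolicyId) {
        this.replication = replication;
        this.storagePolicyId = storagePolicyId;
    }
}

/** Stand-in for BlockInfo: default-policy blocks carry no feature object. */
final class BlockInfoSketch {
    // Assumed placeholder defaults, chosen only for illustration.
    static final short DEFAULT_REPLICATION = 3;
    static final byte DEFAULT_STORAGE_POLICY_ID = 0;

    private final long blockId;
    // null means "use the defaults"; set only when a policy is customized.
    private CustomizedBlockPolicies customized;

    BlockInfoSketch(long blockId) {
        this.blockId = blockId;
    }

    long getBlockId() {
        return blockId;
    }

    /** Attaches the feature only if at least one value differs from the defaults. */
    void setPolicies(short replication, byte storagePolicyId) {
        boolean isDefault = replication == DEFAULT_REPLICATION
            && storagePolicyId == DEFAULT_STORAGE_POLICY_ID;
        customized = isDefault
            ? null
            : new CustomizedBlockPolicies(replication, storagePolicyId);
    }

    short getReplication() {
        return customized == null ? DEFAULT_REPLICATION : customized.replication;
    }

    byte getStoragePolicyId() {
        return customized == null ? DEFAULT_STORAGE_POLICY_ID : customized.storagePolicyId;
    }

    boolean hasCustomizedPolicies() {
        return customized != null;
    }
}
```

One caveat with this shape: the {{customized}} reference is itself a field on every block, so it only saves memory if the block object already has a slot for optional features or if the customized state grows beyond a few bytes. Whether that trade-off works out would need measuring.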

> Move replication factor into individual blocks
> ----------------------------------------------
>
>                 Key: HDFS-8823
>                 URL: https://issues.apache.org/jira/browse/HDFS-8823
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: HDFS-8823.000.patch
>
>
> This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes
> have two advantages:
> * Decoupling the namespace and the block management layer. It is a prerequisite step
> to move block management off the heap or to a separate process.
> * Increased flexibility on replicating blocks. Currently the replication factors of all
> blocks have to be the same. The replication factors of these blocks are equal to the highest
> replication factor across all snapshots. The changes will allow blocks in a file to have different
> replication factors, potentially saving some space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
