hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
Date Mon, 26 Jan 2015 22:07:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292496#comment-14292496 ]

Zhe Zhang commented on HDFS-7339:
---------------------------------

Thanks for the analysis [~szetszwo].

The basic tradeoff is the compactness of ID space versus lookup overhead. I agree option #1
should be ruled out (most compact allocation, slowest lookup).

From option #2 through option #5, the ID allocation becomes progressively sparser; in return,
more invariants are guaranteed.

However, it seems all of them require an additional lookup (either in {{blocksMap}} or in
the map of inodes) to identify a non-EC block? For example, when a block report for *0x331*
arrives, we don't know whether it's a non-EC block or an EC block in the group *0x330*. So we
must either look up *0x330* in {{blocksMap}} and get a miss, or find the inode and obtain the
storage policy.

In contrast, separating the ID space with a binary flag requires only one lookup (except for
legacy, randomly generated block IDs).
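
To make the lookup counts concrete, here is a minimal Java sketch of the two schemes. Everything in it is an assumption for illustration only: the class {{BlockIdLookupSketch}}, the helpers {{resolveShared}} and {{resolveFlagged}}, the choice of the low 4 bits as the in-group index, and the highest bit as the EC flag are all hypothetical, and the {{HashMap}} merely stands in for {{blocksMap}}. It does not handle legacy, randomly generated block IDs.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: none of these names come from the HDFS code base.
class BlockIdLookupSketch {
    enum Kind { CONTIGUOUS, GROUP }

    static final class Entry {
        final Kind kind;
        final String name;
        Entry(Kind kind, String name) { this.kind = kind; this.name = name; }
    }

    // Assumption: the low 4 bits index a block inside its group,
    // so group 0x330 would cover the IDs 0x330..0x33F.
    static final long INDEX_MASK = 0xFL;
    // Assumption: the highest bit of the ID is the binary EC flag.
    static final long EC_FLAG = 1L << 63;

    // Stand-in for blocksMap, plus a counter of map accesses.
    static final Map<Long, Entry> blocksMap = new HashMap<>();
    static int lookups = 0;

    static Entry get(long id) {
        lookups++;
        return blocksMap.get(id);
    }

    // Shared ID space: the ID alone does not say whether the block is EC,
    // so try the candidate group first and fall back to the block itself.
    static Entry resolveShared(long reportedId) {
        long groupId = reportedId & ~INDEX_MASK;
        Entry first = get(groupId);
        if (first != null && first.kind == Kind.GROUP) {
            return first;              // EC block: found its group
        }
        if (groupId == reportedId) {
            return first;              // same key, no second lookup needed
        }
        return get(reportedId);        // non-EC block: second lookup
    }

    // Flag-separated ID space: the flag selects the key up front.
    static Entry resolveFlagged(long reportedId) {
        if ((reportedId & EC_FLAG) != 0) {
            return get(reportedId & ~INDEX_MASK);  // EC: go straight to the group
        }
        return get(reportedId);                    // non-EC: direct lookup
    }

    public static void main(String[] args) {
        blocksMap.put(0x331L, new Entry(Kind.CONTIGUOUS, "blk_0x331"));
        blocksMap.put(EC_FLAG | 0x330L, new Entry(Kind.GROUP, "group_0x330"));

        lookups = 0;
        resolveShared(0x331L);
        System.out.println("shared ID space, non-EC block: " + lookups + " lookups");

        lookups = 0;
        resolveFlagged(0x331L);
        System.out.println("flagged ID space, non-EC block: " + lookups + " lookups");

        lookups = 0;
        resolveFlagged(EC_FLAG | 0x331L);
        System.out.println("flagged ID space, EC block: " + lookups + " lookups");
    }
}
```

In this sketch a report for the non-EC block *0x331* costs a miss on *0x330* plus a hit on *0x331* under the shared ID space, while the flagged variant touches the map once for either kind of block.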



> Allocating and persisting block groups in NameNode
> --------------------------------------------------
>
>                 Key: HDFS-7339
>                 URL: https://issues.apache.org/jira/browse/HDFS-7339
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch,
HDFS-7339-005.patch, HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they are formed
in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}}
is created to record the original and parity blocks in a coding group, as well as a pointer
to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping
layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore
we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes,
with the current mode recorded in a binary flag. An array of BlockGroups (or BlockGroup IDs)
is added, which remains empty for “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}}
component; the attached figure has an illustration of the architecture. As a simple example,
when a {_Striping+EC_} file is created and written to, {{ECManager}} will serve requests from the client
to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase,
{{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication
to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery
work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
