hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
Date Fri, 23 Jan 2015 20:26:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289901#comment-14289901 ]

Zhe Zhang commented on HDFS-7339:
---------------------------------

bq. First a quick comment about the current SequentialBlockGroupIdGenerator and SequentialBlockIdGenerator.
The current patch tries to use a flag to distinguish contiguous and striped blocks. However,
since there may still be conflicts coming from historical randomly assigned block IDs, for
blocks in block reports, we still need to check two places to determine if this is a contiguous
block or a striped block.
If a block's ID has the 'striped' flag bit set, we always _attempt_ to look up the block group
map first. Without a rolling upgrade we only need this one lookup. And yes, in the worst case we
do need to check two places. Given that HDFS-4645 will be over 2 years old by the time erasure
coding is released, I guess this won't happen a lot?
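
To make the lookup order concrete, here is a minimal sketch of the idea, not code from the patch. The class name, the {{STRIPED_FLAG}} bit position, and the two sets standing in for the block group map and the regular blocks map are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: names and bit layout are assumptions, not taken from the HDFS-7339 patch. */
final class BlockLookupSketch {
    /** Assumed position of the 'striped' flag: the top bit of the 64-bit block ID. */
    static final long STRIPED_FLAG = 1L << 63;

    enum Kind { STRIPED, CONTIGUOUS, UNKNOWN }

    // Stand-ins for the block group map and the regular (contiguous) blocks map.
    private final Set<Long> blockGroupIds = new HashSet<>();
    private final Set<Long> contiguousIds = new HashSet<>();

    /** Number of map probes performed so far, to show the one-vs-two lookup cost. */
    int lookups = 0;

    void addBlockGroup(long id) { blockGroupIds.add(id); }

    void addContiguousBlock(long id) { contiguousIds.add(id); }

    static boolean hasStripedFlag(long blockId) {
        return (blockId & STRIPED_FLAG) != 0;
    }

    /** Classifies a block ID seen in a block report. */
    Kind classify(long reportedId) {
        if (hasStripedFlag(reportedId)) {
            // Flag bit set: always try the block group map first.
            lookups++;
            if (blockGroupIds.contains(reportedId)) {
                return Kind.STRIPED;
            }
            // Miss: this may be a legacy, randomly assigned ID that happens
            // to have the flag bit set, so fall through to the second lookup.
        }
        lookups++;
        return contiguousIds.contains(reportedId) ? Kind.CONTIGUOUS : Kind.UNKNOWN;
    }
}
```

In this sketch a genuine striped ID costs one probe, an ID without the flag also costs one, and only a legacy random ID that collides with the flag bit costs two.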

> Allocating and persisting block groups in NameNode
> --------------------------------------------------
>
>                 Key: HDFS-7339
>                 URL: https://issues.apache.org/jira/browse/HDFS-7339
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch,
HDFS-7339-005.patch, HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they are formed
in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}}
is created to record the original and parity blocks in a coding group, as well as a pointer
to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping
layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore
we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes,
with the current mode recorded in a binary flag. An array of BlockGroups (or BlockGroup IDs)
is added, which remains empty for “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}}
component; the attached figure has an illustration of the architecture. As a simple example,
when a {_Striping+EC_} file is created and written to, {{ECManager}} serves requests from the client
to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase,
{{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication
to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery
work.
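
The data model described above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the classes from the attached patches: {{FileSketch}} stands in for {{INodeFile}}, {{ECManagerSketch}} for {{ECManager}}, and the field names, the single ID counter, and the string used as the codec schema reference are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a simplified model, not the classes from the HDFS-7339 patches. */
final class BlockGroupSketch {

    /** Records the original (data) and parity blocks of one coding group. */
    static final class BlockGroup {
        final long groupId;
        final long[] dataBlockIds;
        final long[] parityBlockIds;
        final String codecSchema; // hypothetical stand-in for the pointer to the codec schema

        BlockGroup(long groupId, long[] dataBlockIds, long[] parityBlockIds, String codecSchema) {
            this.groupId = groupId;
            this.dataBlockIds = dataBlockIds;
            this.parityBlockIds = parityBlockIds;
            this.codecSchema = codecSchema;
        }
    }

    /** Stand-in for the inode: a binary mode flag plus a list of block groups. */
    static final class FileSketch {
        final boolean striped;                                   // contiguous vs. striping mode
        final List<BlockGroup> blockGroups = new ArrayList<>();  // stays empty for contiguous files

        FileSketch(boolean striped) { this.striped = striped; }
    }

    /** Stand-in for the manager that allocates block groups on client request. */
    static final class ECManagerSketch {
        private long nextId = 1; // assumption: one simple counter for all IDs

        BlockGroup allocateBlockGroup(FileSketch file, int numData, int numParity, String schema) {
            if (!file.striped) {
                throw new IllegalStateException("file uses the contiguous layout");
            }
            long groupId = nextId++;
            BlockGroup group =
                new BlockGroup(groupId, nextIds(numData), nextIds(numParity), schema);
            file.blockGroups.add(group); // store the new group under the file
            return group;
        }

        private long[] nextIds(int n) {
            long[] ids = new long[n];
            for (int i = 0; i < n; i++) {
                ids[i] = nextId++;
            }
            return ids;
        }
    }
}
```

Calling {{allocateBlockGroup}} on a striped file with, say, 6 data and 3 parity blocks appends one group to the file's list, while a contiguous file keeps an empty list and the call is rejected.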



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
