hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7339) Allocating and persisting block groups in NameNode
Date Tue, 27 Jan 2015 02:26:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292864#comment-14292864 ]

Jing Zhao commented on HDFS-7339:
---------------------------------

bq. Change the hash function so that consecutive IDs will be mapped to the same hash value
and implement BlockGroup.equal(..) so that it returns true with any block id in the group.

I just had an offline discussion with [~szetszwo] about this. The new hash function will
cause extra scanning within each bucket, since every 16 contiguous blocks will be mapped to
the same bucket. Currently, for a large cluster, the blocksMap can contain several million
buckets, which is on the same scale as the total number of blocks, so the current
implementation rarely has to scan long bucket chains in the normal case. I think we may need
to revisit this optimization and perhaps run a simple benchmark on it.
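
To make the concern concrete, here is a rough Java sketch of the kind of quick check I have
in mind. It is not the NameNode's blocksMap code: the class name, the masking scheme, and the
assumption that a group covers exactly 16 consecutive IDs (the low 4 bits) are all
hypothetical, chosen only to show how a group-level hash lengthens bucket chains when the
bucket count is on the same scale as the block count.

```java
class BucketScanSketch {
    // Assumption: 16 contiguous block IDs form one group (low 4 bits of the ID).
    static final int GROUP_SIZE_BITS = 4;

    /** Per-block hash: consecutive IDs land in consecutive buckets. */
    static int perBlockBucket(long blockId, int numBuckets) {
        // numBuckets is assumed to be a power of two, so masking acts as modulo.
        return (int) (blockId & (numBuckets - 1));
    }

    /** Group hash: drop the low 4 bits so 16 consecutive IDs share a bucket. */
    static int groupBucket(long blockId, int numBuckets) {
        return (int) ((blockId >>> GROUP_SIZE_BITS) & (numBuckets - 1));
    }

    /** Longest chain after inserting IDs [0, numBlocks) into numBuckets buckets. */
    static int longestChain(long numBlocks, int numBuckets, boolean useGroupHash) {
        int[] chainLength = new int[numBuckets];
        int longest = 0;
        for (long id = 0; id < numBlocks; id++) {
            int bucket = useGroupHash
                ? groupBucket(id, numBuckets)
                : perBlockBucket(id, numBuckets);
            chainLength[bucket]++;
            longest = Math.max(longest, chainLength[bucket]);
        }
        return longest;
    }

    public static void main(String[] args) {
        int numBuckets = 1 << 16;   // power of two
        long numBlocks = 1L << 16;  // same scale as the bucket count
        System.out.println("per-block hash, longest chain: "
            + longestChain(numBlocks, numBuckets, false));
        System.out.println("group hash, longest chain: "
            + longestChain(numBlocks, numBuckets, true));
    }
}
```

A real benchmark would need to use the actual map implementation and a realistic ID
distribution; the sketch only illustrates why a lookup may have to walk up to a group's
worth of entries per bucket under the proposed hash.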

Back to this jira: maybe we should provide a relatively simple implementation first and do
the optimization in a separate jira. Either using only the blocksMap or allocating an extra
blockgroupsMap looks fine to me. Maybe we should also schedule an offline discussion sometime
this week.

> Allocating and persisting block groups in NameNode
> --------------------------------------------------
>
>                 Key: HDFS-7339
>                 URL: https://issues.apache.org/jira/browse/HDFS-7339
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7339-001.patch, HDFS-7339-002.patch, HDFS-7339-003.patch, HDFS-7339-004.patch,
HDFS-7339-005.patch, HDFS-7339-006.patch, Meta-striping.jpg, NN-stripping.jpg
>
>
> All erasure codec operations center around the concept of _block group_; they are formed
in initial encoding and looked up in recoveries and conversions. A lightweight class {{BlockGroup}}
is created to record the original and parity blocks in a coding group, as well as a pointer
to the codec schema (pluggable codec schemas will be supported in HDFS-7337). With the striping
layout, the HDFS client needs to operate on all blocks in a {{BlockGroup}} concurrently. Therefore
we propose to extend a file’s inode to switch between _contiguous_ and _striping_ modes,
with the current mode recorded in a binary flag. An array of BlockGroups (or BlockGroup IDs)
is added, which remains empty for “traditional” HDFS files with contiguous block layout.
> The NameNode creates and maintains {{BlockGroup}} instances through the new {{ECManager}}
component; the attached figure has an illustration of the architecture. As a simple example,
when a {_Striping+EC_} file is created and written to, it will serve requests from the client
to allocate new {{BlockGroups}} and store them under the {{INodeFile}}. In the current phase,
{{BlockGroups}} are allocated both in initial online encoding and in the conversion from replication
to EC. {{ECManager}} also facilitates the lookup of {{BlockGroup}} information for block recovery
work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
