hadoop-hdfs-issues mailing list archives

From "Li Bo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files
Date Wed, 15 Apr 2015 03:50:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495614#comment-14495614 ]

Li Bo commented on HDFS-7889:

Hi Zhe, please see the following explanation of the related code.

The first (leading) streamer is responsible for committing block groups. Before committing,
the leading streamer needs to wait for the other streamers to finish writing their blocks and then
count the total number of bytes written in this block group. Because streamers share only
{{stripedBlocks}}, when an ordinary streamer finishes writing its block, it has to report its
work to the leading streamer. It sends a LocatedBlock object (containing the number of bytes it has
written for its block) to the blocking queue of the leading streamer (i.e. {{stripedBlocks\[0\]}}).
The leading streamer waits on the queue and collects the other streamers' reports. An ordinary
streamer could just send an Integer to the leading streamer; I chose LocatedBlock because
it may be more convenient for error handling in HDFS-7786.
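
The reporting scheme described above can be sketched roughly as follows. This is only an illustration, not the code in the patch: {{BlockReport}} is a hypothetical stand-in for the LocatedBlock report, the class and method names are invented, the timeout is an addition, and whether the leading streamer adds its own bytes in this way is an assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class LeadingStreamerSketch {

    /** Hypothetical stand-in for the LocatedBlock that an ordinary streamer sends. */
    static final class BlockReport {
        final int streamerIndex;
        final long numBytes;

        BlockReport(int streamerIndex, long numBytes) {
            this.streamerIndex = streamerIndex;
            this.numBytes = numBytes;
        }
    }

    /** Ordinary streamer: report the bytes written for its block to the leader's queue. */
    static void reportToLeader(BlockingQueue<BlockReport> leaderQueue,
                               int streamerIndex, long numBytes)
            throws InterruptedException {
        leaderQueue.put(new BlockReport(streamerIndex, numBytes));
    }

    /**
     * Leading streamer: wait for one report per ordinary streamer, then return
     * the total bytes of the block group (assumed here to include its own bytes).
     */
    static long collectBlockGroupBytes(BlockingQueue<BlockReport> leaderQueue,
                                       int numOrdinaryStreamers, long leaderBytes,
                                       long timeoutSeconds)
            throws InterruptedException {
        long total = leaderBytes;
        for (int i = 0; i < numOrdinaryStreamers; i++) {
            // Blocks until a report arrives or the timeout expires.
            BlockReport report = leaderQueue.poll(timeoutSeconds, TimeUnit.SECONDS);
            if (report == null) {
                throw new IllegalStateException("timed out waiting for a streamer report");
            }
            total += report.numBytes;
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        final int numOrdinary = 2;
        final BlockingQueue<BlockReport> leaderQueue = new ArrayBlockingQueue<>(numOrdinary);
        for (int i = 1; i <= numOrdinary; i++) {
            final int idx = i;
            new Thread(() -> {
                try {
                    reportToLeader(leaderQueue, idx, 100L * idx);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
        long total = collectBlockGroupBytes(leaderQueue, numOrdinary, 50L, 5L);
        System.out.println("bytes in block group: " + total);
    }
}
```

Only the leading streamer reads from the queue, so the commit naturally happens after the slowest ordinary streamer has reported.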

bq. hasCommittedBlock is initially false. But once becoming true, it will never be false again.
What's the purpose of this flag?
An ordinary streamer sends its report to the leading streamer in {{endBlock}} when it finishes
writing a block.
The leading streamer at first just requests a block group from the NN. When it has to request
another block group, it has to commit the old one first. So {{hasCommittedBlock}} will be true after
the first request.

bq. Why are we always polling the first located block, instead of the i_th?
{{stripedBlocks.get(0)}} is the blocking queue of the leading streamer, which needs to get the
results of the other streamers' work before committing the block group to the NN.

bq. Shouldn't we always commit block.getNumBytes() * NUM_DATA_BLOCKS?
The size of the last block group may be smaller than {{block.getNumBytes() * NUM_DATA_BLOCKS}}, so
{{StripedDataStreamer#countTrailingBlockGroupBytes()}} is used to count the bytes written
in the last block group. For each previous, full block group, the leading streamer has to wait for the
slowest streamer to finish writing. Otherwise, if the leading streamer commits {{block.getNumBytes()
* NUM_DATA_BLOCKS}} bytes to the NN before the slow streamers finish, and one streamer fails after that,
the error handling will be complicated.
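
As a small worked illustration of why the committed size cannot always be {{block.getNumBytes() * NUM_DATA_BLOCKS}}, the sketch below compares a full block group with a trailing one. It is not the logic of {{countTrailingBlockGroupBytes()}}; the class name, the method names, and the sizes (3 data blocks of 128 bytes) are made up for the example.

```java
public class BlockGroupBytesSketch {

    /** Size of a full block group: every data block is completely written. */
    static long fullBlockGroupBytes(long blockSize, int numDataBlocks) {
        return blockSize * numDataBlocks;
    }

    /** Bytes actually written, summed over the per-data-block counts that were reported. */
    static long writtenBlockGroupBytes(long[] bytesPerDataBlock) {
        long total = 0;
        for (long bytes : bytesPerDataBlock) {
            total += bytes;
        }
        return total;
    }

    public static void main(String[] args) {
        long blockSize = 128L;   // hypothetical block size, in bytes
        int numDataBlocks = 3;   // hypothetical number of data blocks

        long[] fullGroup = {128L, 128L, 128L};
        long[] trailingGroup = {128L, 64L, 0L};  // the file ended part-way through the group

        System.out.println("full size:     " + fullBlockGroupBytes(blockSize, numDataBlocks));
        System.out.println("full group:    " + writtenBlockGroupBytes(fullGroup));
        System.out.println("trailing group: " + writtenBlockGroupBytes(trailingGroup));
    }
}
```

For the full group the two numbers agree; for the trailing group the reported sum is smaller, so it has to be counted rather than assumed.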

The above solution may not be the best, but it works for now. If you have a better solution,
we can discuss and optimize the related logic.

> Subclass DFSOutputStream to support writing striping layout files
> -----------------------------------------------------------------
>                 Key: HDFS-7889
>                 URL: https://issues.apache.org/jira/browse/HDFS-7889
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>             Fix For: HDFS-7285
>         Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, HDFS-7889-003.patch, HDFS-7889-004.patch,
HDFS-7889-005.patch, HDFS-7889-006.patch, HDFS-7889-007.patch, HDFS-7889-008.patch, HDFS-7889-009.patch,
HDFS-7889-010.patch, HDFS-7889-011.patch, HDFS-7889-012.patch, HDFS-7889-013.patch, HDFS-7889-014.patch
> After HDFS-7888, we can subclass  {{DFSOutputStream}} to support writing striping layout
