hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9040) Erasure coding: A BlockGroupDataStreamer to rule all internal blocks streamers
Date Mon, 14 Sep 2015 03:21:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742842#comment-14742842 ]

Walter Su commented on HDFS-9040:
---------------------------------

Jing's proposal looks great. Thanks for the effort.
bq. The direction here is to make sure there is no overlap between different error handling
efforts and the new block allocation.
1. Totally agree. In HDFS-8383 I tried to keep the two error-handling efforts from overlapping.
My method simply restarts another round of ({{updateBlockForPipeline}}, {{updatePipeline}}).
Your method decouples them: you restart {{updateBlockForPipeline}} as many times as needed and
call {{updatePipeline}} once at the end. So, at first, I'll merge HDFS-8383.01.patch into
BlockGroupDataStreamer. Then I'll try to replace it with your method.

2. I had not considered that error handling should also not overlap with new-block-allocation.
Your method is to postpone it. That's great.

3. The reason I prefer not to do {{locateFollowingBlock}} in DFSOutputStream is that
DFSOutputStream is asynchronous with respect to DataStreamer. DFSOutputStream shouldn't block
during new-block-allocation. (Well, it does block when dataQueue is congested.)

bq. The complicated part is, when a streamer#0 ends, you can't bump GS for it.
4. You forgot this issue. DataStreamer waits for {{ackQueue}} to be empty before it closes
blockStream. With {{BlockGroupDataStreamer}} I can make the 9 internal streamers wait for
error handling to finish, and only then put empty_last_packet to all 9 internal streamers to
let them close their blockStreams. (It slows down the fastest streamer. That's a trade-off.)

5. It's great that you did streamer replacement. It makes HDFS-8704 very easy.
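
To make points 1 and 4 concrete, here is a minimal sketch of how I read the coordination. It
is not the patch and not real HDFS code: {{BlockGroupCoordinatorSketch}}, {{NameNodeStub}},
{{CountingNameNode}}, {{reportFailure}}, {{handleErrors}} and {{closeBlockGroup}} are all
hypothetical names, and the two stub methods only borrow the names of the RPCs discussed above.
It assumes failures are reported by streamer index, that a new failure can arrive while a round
is in progress, and that failed streamers are skipped when the last packet is sent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these types are the real HDFS classes. */
class BlockGroupCoordinatorSketch {
    /** Stand-in for the two NameNode calls named in the discussion. */
    interface NameNodeStub {
        long updateBlockForPipeline();          // hands out a new generation stamp
        void updatePipeline(long newGenStamp);  // commits the final generation stamp
    }

    /** Counts calls so the behaviour can be checked without a cluster. */
    static class CountingNameNode implements NameNodeStub {
        long genStamp = 1000;
        int bumpCalls = 0;
        int commitCalls = 0;
        long committedGenStamp = -1;

        public long updateBlockForPipeline() { bumpCalls++; return ++genStamp; }
        public void updatePipeline(long gs) { commitCalls++; committedGenStamp = gs; }
    }

    static final int NUM_STREAMERS = 9; // "9 internal streamers" from the comment

    private final NameNodeStub nameNode;
    private final boolean[] failed = new boolean[NUM_STREAMERS];
    private final List<Integer> pendingFailures = new ArrayList<>();
    final List<Integer> lastPacketSentTo = new ArrayList<>();

    BlockGroupCoordinatorSketch(NameNodeStub nameNode) { this.nameNode = nameNode; }

    /** Records a newly failed internal streamer; repeated reports are ignored. */
    void reportFailure(int idx) {
        if (!failed[idx]) { failed[idx] = true; pendingFailures.add(idx); }
    }

    /**
     * Point 1: bump the generation stamp once per round of failures and
     * commit only once, after no new failure showed up during a round.
     * The hook runs after each bump and may report further failures.
     */
    long handleErrors(Runnable afterEachBump) {
        long gs = -1;
        while (!pendingFailures.isEmpty()) {
            pendingFailures.clear();
            gs = nameNode.updateBlockForPipeline();
            afterEachBump.run(); // healthy streamers would adopt gs here
        }
        if (gs != -1) {
            nameNode.updatePipeline(gs); // single commit at the end
        }
        return gs;
    }

    /**
     * Point 4: hold the empty last packet until error handling is finished,
     * then release it to every streamer that is still healthy (assumption:
     * failed streamers get nothing).
     */
    void closeBlockGroup(Runnable afterEachBump) {
        handleErrors(afterEachBump);
        for (int i = 0; i < NUM_STREAMERS; i++) {
            if (!failed[i]) lastPacketSentTo.add(i);
        }
    }
}
```

With one failure reported up front and a second one injected during the first round, the stub
sees two {{updateBlockForPipeline}} calls and one {{updatePipeline}} call, and the last packet
goes to the remaining 7 streamers. The cost is the one already mentioned in point 4: the fastest
streamer waits for the slowest error-handling round.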

> Erasure coding: A BlockGroupDataStreamer to rule all internal blocks streamers
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-9040
>                 URL: https://issues.apache.org/jira/browse/HDFS-9040
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Walter Su
>            Assignee: Walter Su
>         Attachments: HDFS-9040.00.patch, HDFS-9040.001.wip.patch
>
>
> A {{BlockGroupDataStreamer}} communicates with the NN to allocate/update blocks, so the
> {{StripedDataStreamer}}s only have to stream blocks to DNs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
