hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7729) Add logic to DFSOutputStream to support writing a file in striping layout
Date Thu, 12 Feb 2015 22:33:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319136#comment-14319136 ]

Zhe Zhang commented on HDFS-7729:

Thanks [~jingzhao] and [~szetszwo] for the in-depth thoughts. Hope the following analysis helps in
exploring the {{DataStreamer}} refactor.

Logically this JIRA needs to do 3 main things:
# Extend {{DFSOutputStream#streamer}} to have multiple {{streamers}}
# In {{DFSOutputStream#writeChunk}}, add the logic of distributing / striping packets to different {{streamers}}
# In {{DataStreamer#nextBlockOutputStream}}, extend the {{addBlock}} logic to allocate block
groups and give the individual blocks to peer streamers.
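
To make step #2 concrete, here is a minimal sketch of the offset-to-streamer arithmetic, assuming cells are handed to streamers round-robin. {{StripingMath}}, {{cellSize}} and {{numStreamers}} are hypothetical names used for illustration only; they are not taken from any of the attached patches.

```java
// Hypothetical helper, not part of HDFS: maps a file offset to a streamer index,
// assuming striping cells are assigned to streamers round-robin.
final class StripingMath {
    private final int cellSize;      // assumed number of bytes per striping cell
    private final int numStreamers;  // assumed number of streamers in a block group

    StripingMath(int cellSize, int numStreamers) {
        if (cellSize <= 0 || numStreamers <= 0) {
            throw new IllegalArgumentException("cellSize and numStreamers must be positive");
        }
        this.cellSize = cellSize;
        this.numStreamers = numStreamers;
    }

    /** Index of the streamer that receives the byte at fileOffset (assumes fileOffset >= 0). */
    int streamerIndexFor(long fileOffset) {
        long cellIndex = fileOffset / cellSize;
        return (int) (cellIndex % numStreamers);
    }

    /** Number of bytes that can still be written before the current cell ends. */
    int bytesLeftInCell(long fileOffset) {
        return cellSize - (int) (fileOffset % cellSize);
    }
}
```

With a cell size of 4 and 3 streamers, offsets 0-3 map to streamer 0, 4-7 to streamer 1, 8-11 to streamer 2, and offset 12 wraps back to streamer 0.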

A standalone {{DataStreamer}} class will work easily with #1 and #2. To handle #3, we just
need to move the {{locateFollowingBlock}} logic to {{DFSOutputStream}}. 
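
A rough sketch of what moving the allocation up into the output stream could look like: the stream asks for a whole block group and hands one block to each peer streamer. All names here ({{SketchStreamer}}, {{BlockGroupAllocator}}, {{SketchStripedStream}}) are hypothetical stand-ins, and blocks are plain strings rather than real HDFS block objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a standalone DataStreamer: it is simply told which block to write.
class SketchStreamer {
    private String currentBlock;
    void setBlock(String blockId) { this.currentBlock = blockId; }
    String getBlock() { return currentBlock; }
}

// Hypothetical stand-in for the NameNode call that would return a block group.
interface BlockGroupAllocator {
    List<String> allocateBlockGroup(int numBlocks);
}

// Hypothetical output stream that owns block allocation instead of the streamers.
class SketchStripedStream {
    private final List<SketchStreamer> streamers = new ArrayList<>();
    private final BlockGroupAllocator allocator;

    SketchStripedStream(int numStreamers, BlockGroupAllocator allocator) {
        this.allocator = allocator;
        for (int i = 0; i < numStreamers; i++) {
            streamers.add(new SketchStreamer());
        }
    }

    // Allocate one block group and give each streamer its own block.
    void locateFollowingBlockGroup() {
        List<String> blocks = allocator.allocateBlockGroup(streamers.size());
        if (blocks.size() != streamers.size()) {
            throw new IllegalStateException("expected one block per streamer");
        }
        for (int i = 0; i < streamers.size(); i++) {
            streamers.get(i).setBlock(blocks.get(i));
        }
    }

    List<SketchStreamer> getStreamers() { return streamers; }
}
```

In this arrangement the streamers never contact the allocator themselves, which is the property that lets {{DataStreamer}} stay a standalone class.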

Subclassing {{DFSOutputStream}} is a good idea and it does seem feasible if we separate out
{{DataStreamer}}. We keep a single {{streamer}} variable representing the _current streamer_.
Step #2 above should take care of updating its value to the next streamer when a striping
cell boundary is reached.
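
The subclassing idea might look roughly like the sketch below, assuming the base class writes every chunk to a single {{streamer}} field and the subclass only repoints that field at cell boundaries. {{RecordingStreamer}}, {{BaseSketchStream}} and {{StripedSketchStream}} are hypothetical names, and the streamer just records bytes instead of sending packets.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical streamer that records the bytes it is handed (no network I/O).
class RecordingStreamer {
    final StringBuilder received = new StringBuilder();
    void enqueue(byte[] data, int off, int len) {
        received.append(new String(data, off, len, StandardCharsets.US_ASCII));
    }
}

// Hypothetical base class: a single streamer receives every chunk.
class BaseSketchStream {
    protected RecordingStreamer streamer = new RecordingStreamer();

    void writeChunk(byte[] data, int off, int len) {
        streamer.enqueue(data, off, len);
    }
}

// Hypothetical subclass: reuses the inherited streamer field as the current streamer
// and advances it whenever a striping cell has been filled.
class StripedSketchStream extends BaseSketchStream {
    private final List<RecordingStreamer> streamers = new ArrayList<>();
    private final int cellSize;
    private int bytesInCell = 0;
    private int current = 0;

    StripedSketchStream(int numStreamers, int cellSize) {
        if (numStreamers <= 0 || cellSize <= 0) {
            throw new IllegalArgumentException("numStreamers and cellSize must be positive");
        }
        this.cellSize = cellSize;
        for (int i = 0; i < numStreamers; i++) {
            streamers.add(new RecordingStreamer());
        }
        streamer = streamers.get(0);
    }

    @Override
    void writeChunk(byte[] data, int off, int len) {
        while (len > 0) {
            // Write only up to the end of the current cell.
            int n = Math.min(len, cellSize - bytesInCell);
            super.writeChunk(data, off, n);
            off += n;
            len -= n;
            bytesInCell += n;
            if (bytesInCell == cellSize) {
                // Cell boundary reached: switch to the next streamer.
                current = (current + 1) % streamers.size();
                streamer = streamers.get(current);
                bytesInCell = 0;
            }
        }
    }

    RecordingStreamer streamerAt(int i) { return streamers.get(i); }
}
```

The base-class write path is untouched here; only the value of {{streamer}} changes, which is what keeps the subclass small.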

> Add logic to DFSOutputStream to support writing a file in striping layout 
> --------------------------------------------------------------------------
>                 Key: HDFS-7729
>                 URL: https://issues.apache.org/jira/browse/HDFS-7729
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: Codec-tmp.patch, HDFS-7729-001.patch, HDFS-7729-002.patch, HDFS-7729-003.patch,
HDFS-7729-004.patch, HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch, HDFS-7729-008.patch
> If a client wants to directly write a file in striping layout, we need to add some logic to
DFSOutputStream.  DFSOutputStream needs multiple DataStreamers to write each cell of a stripe
to a remote datanode. 

This message was sent by Atlassian JIRA
