hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files
Date Fri, 27 Mar 2015 18:53:53 GMT

     [ https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-7889:
    Attachment: HDFS-7889-003.patch

Thanks Bo for the rev!

bq. super.writeChunk just writes a chunk(typically 512 bytes) to currentPacket, a packet is
typically about 64K bytes, so currentPacket may not be null if it is not full.
I see, thanks for clarifying.

bq. I think DataStreamer should not be aware of the existence of other streamers
This is a good thought. A generic {{DataStreamer}} should indeed be unaware of other streamers.
However, in the EC context, it would simplify the code a lot if each streamer had an index.
The streamers could then simply coordinate with each other through the queue of located blocks.

I think we can subclass {{DataStreamer}} to achieve this. By subclassing we can also avoid
adding the queue of located blocks to {{DataStreamer}}, which doesn't make sense for a generic
streamer. I'm attaching a new patch to illustrate the idea (the logic is similar to that of the
HDFS-7729 patch). Please let me know your opinion on this.
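
To make the subclassing idea above a bit more concrete, here is a rough Java sketch. It is not the code in the attached patch: {{SketchStreamer}}, {{SketchStripedStreamer}}, {{publishBlockGroup}} and {{pollMyBlock}} are hypothetical names, located blocks are modelled as plain strings, and the convention that the streamer with index 0 hands out blocks is an assumption made only for illustration.

```java
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a generic streamer: it has no notion of siblings.
class SketchStreamer {
    String describe() {
        return "generic";
    }
}

// Hypothetical subclass for the EC case: adds an index and a view of the
// per-streamer queues of located blocks (modelled here as plain strings).
class SketchStripedStreamer extends SketchStreamer {
    private final int index;
    private final List<Queue<String>> blockQueues;

    SketchStripedStreamer(int index, List<Queue<String>> blockQueues) {
        this.index = index;
        this.blockQueues = blockQueues;
    }

    // Assumption for this sketch: streamer 0 hands out blocks to the others.
    boolean isLeader() {
        return index == 0;
    }

    // Leader side: put one block per streamer into that streamer's queue.
    void publishBlockGroup(String groupId) {
        if (!isLeader()) {
            throw new IllegalStateException("only streamer 0 publishes blocks");
        }
        for (int i = 0; i < blockQueues.size(); i++) {
            blockQueues.get(i).add(groupId + "_" + i);
        }
    }

    // Any streamer: fetch the next block from its own queue, or null if none yet.
    String pollMyBlock() {
        return blockQueues.get(index).poll();
    }

    @Override
    String describe() {
        return "striped#" + index;
    }
}
```

The point of the sketch is that the index and the queues live only in the subclass, so the generic streamer stays unaware of its siblings. A caller would create one shared list of queues (for example {{ConcurrentLinkedQueue}} instances), pass it to every striped streamer, and let each streamer poll only the queue at its own index.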

Some other comments:
# From a technical perspective I'm OK with having a factory class for output streams. However,
that would lead to another big change to {{DFSOutputStream}}, which unfortunately is not suitable
to push to trunk. This would potentially cause a lot of merge conflicts. Maybe we can just
use a simple if statement in {{DFSClient}}?
# The {{unwrapBlockGroup}} logic is needed by both input and output streams. Let's leave it
here for now. In HDFS-7782 I can add a common utility class and move the logic there.
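
As a rough illustration of the "simple if statement" option from the first comment above, the choice of stream type could be a single branch at the creation site. The sketch below uses hypothetical stand-in classes ({{SketchOutputStream}}, {{SketchStripedOutputStream}}, {{SketchClient}}) and a hypothetical boolean flag; it does not reflect the actual {{DFSClient}} or {{DFSOutputStream}} signatures.

```java
// Hypothetical stand-in for the existing (contiguous layout) output stream.
class SketchOutputStream {
    String layout() {
        return "contiguous";
    }
}

// Hypothetical stand-in for the striped-layout subclass.
class SketchStripedOutputStream extends SketchOutputStream {
    @Override
    String layout() {
        return "striped";
    }
}

// Hypothetical stand-in for the client: one branch picks the stream type,
// so no factory class and no wider refactoring of the base stream is needed.
class SketchClient {
    static SketchOutputStream create(boolean stripedLayout) {
        if (stripedLayout) {
            return new SketchStripedOutputStream();
        }
        return new SketchOutputStream();
    }
}
```

Keeping the branch in one place keeps the diff against the base class small, which is the merge-conflict concern raised above; a factory could still replace it later.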

> Subclass DFSOutputStream to support writing striping layout files
> -----------------------------------------------------------------
>                 Key: HDFS-7889
>                 URL: https://issues.apache.org/jira/browse/HDFS-7889
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, HDFS-7889-003.patch
> After HDFS-7888, we can subclass {{DFSOutputStream}} to support writing striping layout

This message was sent by Atlassian JIRA
