hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7729) Add logic to DFSOutputStream to support writing a file in striping layout
Date Thu, 12 Feb 2015 18:08:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318660#comment-14318660 ]

Zhe Zhang commented on HDFS-7729:
---------------------------------

Thanks Bo for the quick rev! The overall structure looks good to me now. I'd like more opinions
on the following questions:
# We are extending several variables to arrays or lists, like {{dataQueue}}, {{currentSeqno}},
and {{streamer}}. How should we handle this extension?
#* We can have a bunch of arrays/lists in the outer {{DFSOutputStream}} class.
#* Or we can convert some of them into members of {{DataStreamer}}, like the current patch does.
Following this direction, maybe {{dataQueue}} and {{ackQueue}} can go into {{DataStreamer}}
too?
#* I guess the bottom line is to have a consistent treatment for all such variables.
# How should we handle {{InterruptedException}} in {{nextBlockOutputStream}} when offering blocks
to and polling blocks from the BlockingQueue? I guess we should just gracefully close the streamer thread.
# This change is quite complex and we should think about how to minimize the risk of breaking
the logic for non-striping files.
#* I will set up a branch on my personal GitHub repo and run a Jenkins job for the latest patch.
#* Thinking out loud: shall we keep the {{streamer}}, {{dataQueue}}, and {{ackQueue}} variables, and
create *additional* arrays or lists for secondary streams? This might be less intrusive on
non-striping files. [~jingzhao] Could you share some advice here?
# Testing without a striping reader is tricky. {{blocksForUnitTest}} in the current patch
is a quick and easy workaround. I feel it's probably fine in a branch commit, but would like
to hear from others.
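
To make question 2 concrete, here is a minimal sketch of what "gracefully close the streamer thread" could look like. It is not taken from the patch: {{SketchStreamer}}, its fields, and the {{String}} block type are hypothetical stand-ins, and only the {{java.util.concurrent}} calls are real APIs. The idea is that an {{InterruptedException}} from the blocking {{poll}} ends the loop, the interrupt status is restored, and the streamer marks itself closed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class StreamerInterruptSketch {

  /** Hypothetical stand-in for a streamer thread waiting for its next block. */
  static class SketchStreamer extends Thread {
    private final BlockingQueue<String> blocks;
    private volatile boolean closed = false;

    SketchStreamer(BlockingQueue<String> blocks) {
      this.blocks = blocks;
    }

    @Override
    public void run() {
      try {
        while (!closed) {
          // Blocks for up to 100 ms; throws InterruptedException if interrupted.
          String block = blocks.poll(100, TimeUnit.MILLISECONDS);
          if (block != null) {
            // A real streamer would start writing to the block here.
          }
        }
      } catch (InterruptedException e) {
        // Restore the interrupt status so code further up can still see it.
        Thread.currentThread().interrupt();
      } finally {
        // Assumption: "graceful close" just means marking the streamer closed.
        closed = true;
      }
    }

    boolean isClosed() {
      return closed;
    }
  }

  /** Starts a streamer, interrupts it, and reports whether it shut down. */
  static boolean interruptAndCheck() {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    SketchStreamer streamer = new SketchStreamer(queue);
    streamer.start();
    queue.offer("blk_1");
    streamer.interrupt();
    try {
      streamer.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return streamer.isClosed() && !streamer.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("closed cleanly: " + interruptAndCheck());
  }
}
```

In the real class the {{finally}} block would presumably also release sockets and fail any pending packets; the sketch only shows the control flow.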

Minor items:
# Append should use {{notSupportedInStripingLayout}} too, instead of the inline check:
{code}
    //Appending to a striping layout file will be supported in the next phase
    if(stripingLayout)
      throw new IOException("Not support appending to a striping layout file yet.");
{code}
# There are still statements without braces, like in {{notSupportedInStripingLayout}}
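
A rough sketch of both minor items together, under stated assumptions: the signature of {{notSupportedInStripingLayout}} below (taking an operation name) is hypothetical, as are the class name and the {{append()}} stub; the helper in the patch may look different. It only illustrates routing the append check through one braced guard.

```java
import java.io.IOException;

public class StripingGuardSketch {
  private final boolean stripingLayout;

  StripingGuardSketch(boolean stripingLayout) {
    this.stripingLayout = stripingLayout;
  }

  // Hypothetical signature; the helper in the actual patch may differ.
  private void notSupportedInStripingLayout(String operation) throws IOException {
    if (stripingLayout) {
      throw new IOException(
          operation + " is not yet supported for striping layout files.");
    }
  }

  // Stub for the append path: the guard replaces the inline if/throw.
  void append() throws IOException {
    notSupportedInStripingLayout("Append");
    // The regular append logic would follow here.
  }

  /** Returns true if append() was rejected with an IOException. */
  static boolean appendRejected(boolean striping) {
    try {
      new StripingGuardSketch(striping).append();
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("striping: rejected=" + appendRejected(true));
    System.out.println("non-striping: rejected=" + appendRejected(false));
  }
}
```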

> Add logic to DFSOutputStream to support writing a file in striping layout 
> --------------------------------------------------------------------------
>
>                 Key: HDFS-7729
>                 URL: https://issues.apache.org/jira/browse/HDFS-7729
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: Codec-tmp.patch, HDFS-7729-001.patch, HDFS-7729-002.patch, HDFS-7729-003.patch,
HDFS-7729-004.patch, HDFS-7729-005.patch, HDFS-7729-006.patch, HDFS-7729-007.patch, HDFS-7729-008.patch
>
>
> If a client wants to write a file directly in striping layout, we need to add some logic to DFSOutputStream. DFSOutputStream needs multiple DataStreamers to write each cell of a stripe to a remote datanode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
