hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
Date Fri, 14 Mar 2014 21:44:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935707#comment-13935707
] 

Tsz Wo Nicholas Sze commented on HDFS-6087:
-------------------------------------------

> 1. A block cannot be read by others while under construction, until it is fully written
> and committed. ...

It also does not support hflush.

> 2. Your proposal (if I understand it correctly) will potentially lead to a lot of small
> blocks if appends, fsyncs (and truncates) are used intensively. ...

I guess it won't lead to a lot of small blocks, since it does copy-on-write.  However, there
is going to be a lot of block copying if there are a lot of appends, hsyncs, etc.
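
To make the trade-off concrete, here is a rough sketch of the copying cost under a toy
model. It is not taken from the design proposal: the block size, the function name, and the
assumption that every append commits the last block (so the next append must first copy the
partial last block into a new one) are all illustrative.

```python
BLOCK_SIZE = 128  # hypothetical block size in bytes, kept tiny for illustration


def copy_on_write_appends(append_sizes, block_size=BLOCK_SIZE):
    """Return (number of blocks, total bytes copied) after a series of appends.

    Assumption: each append commits the last block, so the following append
    has to copy the existing bytes of the partial last block into a new block.
    """
    full_blocks = 0
    last_len = 0  # bytes currently held in the partial last block
    copied = 0
    for size in append_sizes:
        # Copy-on-write: the partial last block is rewritten under a new ID.
        copied += last_len
        last_len += size
        # Data beyond the block size spills over into full blocks.
        full_blocks += last_len // block_size
        last_len %= block_size
    blocks = full_blocks + (1 if last_len else 0)
    return blocks, copied
```

In this model, four 32-byte appends end up in a single 128-byte block, but 192 bytes are
copied along the way; appends that are exact multiples of the block size copy nothing.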

----
In addition, I think there would be a problem with reading the last block: if a reader opens
a file and reads the last block slowly, and a writer then reopens the file for append and
commits a new last block, the old last block may be deleted and become unavailable to the
reader.
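
The scenario can be sketched with a toy in-memory model. Everything here is hypothetical:
BlockStore, its methods, and the immediate deletion of the old block are assumptions made
for illustration and are not the HDFS API or the behaviour specified in the proposal.

```python
class BlockStore:
    """Toy stand-in for block storage (hypothetical, not the HDFS API)."""

    def __init__(self):
        self.blocks = {}
        self.next_id = 1

    def put(self, data):
        """Store data as a new immutable block and return its ID."""
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id

    def read(self, block_id, offset, length):
        if block_id not in self.blocks:
            raise KeyError(f"block {block_id} no longer exists")
        return self.blocks[block_id][offset:offset + length]

    def append_copy_on_write(self, old_id, extra):
        """Copy the old last block plus new data into a new block, then drop the old one."""
        new_id = self.put(self.blocks[old_id] + extra)
        del self.blocks[old_id]  # assumption: the old block is deleted right away
        return new_id


def demo():
    store = BlockStore()
    last = store.put(b"abcdef")
    reader_block = last                       # reader resolves the block ID at open time
    first = store.read(reader_block, 0, 3)    # slow reader gets the first few bytes
    new_last = store.append_copy_on_write(last, b"gh")  # writer appends and commits
    try:
        store.read(reader_block, 3, 3)        # reader continues with the stale block ID
        failed = False
    except KeyError:
        failed = True
    return first, failed, store.read(new_last, 0, 8)
```

Calling demo() shows the reader's second read failing once the writer has committed the new
last block, even though the same bytes are still available under the new block ID.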

> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
>
> In the existing implementation, an HDFS file can be appended to and an HDFS block can be
> reopened for append. This design introduces complexity, including lease recovery. If we
> design HDFS blocks as immutable, append and truncate become very simple. The idea is that
> an HDFS block is immutable once it is committed to the namenode. If the block is not
> committed to the namenode, it is the HDFS client's responsibility to re-add it with a new
> block ID.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
