hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3107) HDFS truncate
Date Tue, 04 Nov 2014 02:44:37 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195607#comment-14195607 ]

Colin Patrick McCabe commented on HDFS-3107:
--------------------------------------------

Hi Konstantin,

Yes, I was aware of HDFS-7056 and had commented about it earlier.  However, when you announced
on this JIRA: "So, but the new patch from Plamen Jeliazkov already have snapshot support,"
the patch that I checked was {{HDFS-3107.patch}}, not a patch on a different JIRA.  After
all, [~zero45] is the assignee on this JIRA as well as HDFS-7056, so "the new patch from Plamen
Jeliazkov" could refer to either patch.  It seems logical to assume that what is being discussed
on a particular JIRA is the patch attached to that JIRA, unless specified otherwise.

I think this highlights one confusing thing: that HDFS-3107 is now an umbrella JIRA as well
as a JIRA with a (non-rollup) patch.  This creates confusion in people's minds because questions
like "is HDFS-3107 done?" become ambiguous.  It could be interpreted as either "is the patch
on HDFS-3107 done?" or "is the feature discussed in HDFS-3107 done?"  To remove this confusion,
I created the HDFS-7341 subtask to implement the part we've been discussing here.  That is,
the pipeline-recovery based solution which the other subtasks build on top of.

I would like to create the HDFS-3107 branch as soon as possible.  I would have already created
it, but as per my comment above, I want to make sure you are not objecting.  The quicker we
can get this stuff into the branch, the quicker we can get this feature polished and merged
into trunk.

As I commented earlier, it's not up to me to determine if something gets into 2.6 or not.
 It's up to the release manager (currently [~acmurthy]) and the community to vote.  Since
2.6 is so far along (it should have been released weeks ago, if the original schedule had
been followed), I doubt that most people will welcome a big new feature getting added at this
stage.  I don't think branch versus single commit matters in this regard-- a big new feature
is a big new feature.  It's not going to "sneak in the back door"-- nor should it, given the
problems we've had in the past (like with HDFS append).  We have time to do a thorough review.
 In the meantime, distributions that are already shipping a variant of append can continue
to do so, knowing that eventually the feature has a path to mainline.

Let me know if you have any objections to creating the branch, otherwise I'll do it tomorrow.
 Thanks

> HDFS truncate
> -------------
>
>                 Key: HDFS-3107
>                 URL: https://issues.apache.org/jira/browse/HDFS-3107
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Lei Chang
>            Assignee: Plamen Jeliazkov
>         Attachments: HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, HDFS-3107.patch,
HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch,
HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf,
HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the underlying storage
when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX
operation) which is a reverse operation of append, which makes upper layer applications use
ugly workarounds (such as keeping track of the discarded byte range per file in a separate
metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome
this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
