hadoop-common-dev mailing list archives

From Jim Kellerman <...@powerset.com>
Subject Re: [jira] Commented: (HADOOP-1700) Append to files in HDFS
Date Sat, 08 Sep 2007 15:17:43 GMT
On Fri, 2007-09-07 at 23:44 -0700, eric baldeschwieler (JIRA) wrote:
> eric baldeschwieler commented on HADOOP-1700:
> Yes, I'd thought of using length too.  We have had requests to support truncate as well.
> I'm on the fence on that one.  It is simple and more provably correct if we don't do
> truncates and just track block length.
> But we've a significant client group that has had an interest in truncates.

I think truncates would make the problem a whole lot more complicated
and delay the feature that we really need (appends). How about doing
appends first and documenting how we might do truncates if there are insights
with respect to truncates that come out of the append implementation?

> > Append to files in HDFS
> > -----------------------
> >
> >                 Key: HADOOP-1700
> >                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
> >             Project: Hadoop
> >          Issue Type: New Feature
> >          Components: dfs
> >            Reporter: stack
> >
> > Request for being able to append to files in HDFS has been raised a couple of times
> > on the list of late.  For one example, see http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.
> > Other mail describes folks' workarounds because this feature is lacking: e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480
> > (Later on this thread, Jim Kellerman re-raises the HBase need of this feature).  HADOOP-337
> > 'DFS files should be appendable' makes mention of file append but it was opened early in the
> > life of HDFS when the focus was more on implementing the basics rather than adding new features.
> > Interest fizzled.  Because HADOOP-337 is also a bit of a grab-bag -- it includes truncation
> > and being able to concurrently read/write -- rather than try and breathe new life into HADOOP-337,
> > instead, here is a new issue focused on file append.  Ultimately, being able to do as the
> > Google GFS paper describes -- having multiple concurrent clients
> > making 'Atomic Record Append' to a single file -- would be sweet but at least for a first cut
> > at this feature, IMO, a single client appending to a single HDFS file letting the application
> > manage the access would be sufficient.
Jim Kellerman, Senior Engineer; Powerset
