hadoop-hdfs-dev mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: [DISCUSS] Remove append?
Date Sat, 24 Mar 2012 02:44:03 GMT

On 3/22/12 10:25 AM, "Eli Collins" <eli@cloudera.com> wrote:

>On Thu, Mar 22, 2012 at 1:26 AM, Konstantin Shvachko
><shv.hadoop@gmail.com> wrote:
>> Eli,
>> I went over the entire discussion on the topic, and did not get it. Is
>> there a problem with append? We know it does not work in hadoop-1,
>> only flush() does. Is there anything wrong with the new append
>> (HDFS-265)? If so please file a bug.
>> I tested it in the Hadoop-0.22 branch and it works fine.
>> I agree with people who were involved with the implementation of the
>> new append that the complexity is mainly in
>> 1. pipeline recovery
>> 2. consistent client reading while writing, and
>> 3. hflush()
>> Once it is done the append itself, which is reopening of previously
>> closed files for adding data, is not complex.
>I agree that much of the complexity is in #1-3 above, which is why
>HDFS-265 is leveraged.
>The primary simplicity of not having append (and truncate) comes from
>being able to rely on the invariant that finalized blocks are immutable,
>that blocks once written won't, e.g., shrink in size (which we assume today).

That invariant can co-exist with append via copy-on-write.  The new state
and old state would co-exist until the old state is no longer needed; a
file's block map would have to use a persistent data structure.
Copy-on-write semantics for blocks in file systems are all the rage these
days: free snapshots, atomic transactions for operations on multiple
blocks, etc.
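
To make the copy-on-write idea concrete, here is a minimal sketch in
Python. It is not HDFS code and does not reflect how HDFS-265 works; the
names Block, BlockMap, append and the tiny BLOCK_SIZE are all hypothetical,
chosen only for illustration. Blocks are never modified once created. An
append builds a new block map that shares the full blocks of the old one
and copies a partial last block into a fresh block.

```python
from dataclasses import dataclass
from typing import Tuple

BLOCK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to trace


@dataclass(frozen=True)
class Block:
    block_id: int
    data: bytes  # immutable: a block is never changed after creation


@dataclass(frozen=True)
class BlockMap:
    """One immutable version of a file's block list."""
    blocks: Tuple[Block, ...] = ()

    def read(self) -> bytes:
        return b"".join(b.data for b in self.blocks)


def append(old: BlockMap, payload: bytes, next_id: int) -> BlockMap:
    """Return a new BlockMap; `old` and every block it references stay as-is."""
    blocks = list(old.blocks)  # copy of the list only, the blocks are shared
    # A partial last block is not grown in place. Its bytes are copied
    # into a new block together with the appended data.
    if blocks and len(blocks[-1].data) < BLOCK_SIZE:
        payload = blocks.pop().data + payload
    for i in range(0, len(payload), BLOCK_SIZE):
        blocks.append(Block(next_id, payload[i:i + BLOCK_SIZE]))
        next_id += 1
    return BlockMap(tuple(blocks))


v1 = append(BlockMap(), b"abcdef", next_id=0)
v2 = append(v1, b"ghi", next_id=2)
```

A reader holding v1 still sees the old contents after v2 is created, which
is what makes snapshots cheap in this kind of scheme: the old version can
be dropped once nothing references it.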

>> You mentioned it and I agree you indeed should be more involved with
>> your customer base. As for eBay, append was one of the motivations to
>> work on stabilizing the 0.22 branch. And there are a lot of use cases
>> that require append for our customers.
>> Some of them were mentioned in this discussion.
>From what I've seen 0.22 isn't ready for production use. Aside from
>not supporting critical features like security, it doesn't have a
>sizable user base behind it testing and fixing bugs, etc. All things
>I'd imagine an org like eBay would want.  I've never gotten a request
>to support 0.22 from a customer.
