hadoop-hdfs-dev mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: [DISCUSS] Remove append?
Date Thu, 22 Mar 2012 17:25:29 GMT
On Thu, Mar 22, 2012 at 1:26 AM, Konstantin Shvachko
<shv.hadoop@gmail.com> wrote:
> Eli,
> I went over the entire discussion on the topic, and did not get it. Is
> there a problem with append? We know it does not work in hadoop-1,
> only flush() does. Is there anything wrong with the new append
> (HDFS-265)? If so please file a bug.
> I tested it in the Hadoop-0.22 branch and it works fine.
> I agree with people who were involved with the implementation of the
> new append that the complexity is mainly in
> 1. pipeline recovery
> 2. consistent client reading while writing, and
> 3. hflush()
> Once it is done the append itself, which is reopening of previously
> closed files for adding data, is not complex.

I agree that much of the complexity is in #1-3 above, which is why
HDFS-265 is leveraged.
The primary simplicity of not having append (and truncate) comes from
being able to rely on the invariant that finalized blocks are immutable:
a block, once written, won't e.g. shrink in size (which we assume today).

> You mentioned it and I agree you indeed should be more involved with
> your customer base. As for eBay, append was one of the motivations to
> work on stabilizing the 0.22 branch. And there are a lot of use cases
> which require append for our customers.
> Some of them were mentioned in this discussion.

From what I've seen, 0.22 isn't ready for production use. Aside from
not supporting critical features like security, it doesn't have a
sizable user base behind it testing and fixing bugs, etc. All things
I'd imagine an org like eBay would want.  I've never gotten a request
to support 0.22 from a customer.

