hadoop-hdfs-dev mailing list archives

From Eli Collins <...@cloudera.com>
Subject Re: [DISCUSS] Remove append?
Date Wed, 21 Mar 2012 17:32:51 GMT
Thanks for the feedback Milind, questions inline.

On Wed, Mar 21, 2012 at 10:17 AM,  <Milind.Bhandarkar@emc.com> wrote:
> As someone who has worked with hdfs-compatible distributed file systems
> that support append, I can vouch for its extensive usage.
>
> I have seen how simple it becomes to create tar archives, and later append
> files to them, without writing special inefficient code to do so.
>

Why not just write new files and use Har files? Is it because Har
files are a pita?

> I have seen it used in archiving cold data, reducing MR task launch
> overhead without having to use a different input format, so that the same
> code can be used for both hot and cold data.
>

Can you elaborate on the 1st one, how it's especially helpful for archival?

I assume the 2nd one refers to not having to use Multi*InputFormat,
and the 3rd refers to appending to an old file instead of creating a
new one.

> In addition, the small-files problem in HDFS forces people to write MR
> code, and causes rewrite of large datasets even if a small amount of data
> is added to it.

Do people rewrite large datasets today just to add 1 MB? I haven't
heard of that from big users (Yahoo!, FB, Twitter, eBay..) or my
customer base.  If so, I would have expected people to put energy
into getting append working in 1.x, which no one has put energy into
(I know some people feel the 20-based design is unworkable; I don't
know it well enough to comment there).

Thanks,
Eli

>
> So, there is clearly a need for it, AFAIK.
>
> +1 on fixing it. Please let me know if you need help.
>
> - milind
>
> ---
> Milind Bhandarkar
> Greenplum Labs, EMC
> (Disclaimer: Opinions expressed in this email are those of the author, and
> do not necessarily represent the views of any organization, past or
> present, the author might be affiliated with.)
>
>
>
> On 3/21/12 5:36 AM, "Dave Shine" <Dave.Shine@channelintelligence.com>
> wrote:
>
>>I am not a contributor to this project, so I don't know how much weight
>>my opinion carries.  But I have been hoping to see append become stable
>>soon.  We are constantly dealing with the "small file problem", and I
>>have written M/R jobs to periodically roll up lots of small files into a
>>few large ones.  Having append would prevent me from needing to use up
>>cluster resources performing these tasks.
>>
>>Therefore, all things being equal I +1 making append work.  However, if
>>the level of complexity is as bad as Eli implies below, then I can
>>understand that perhaps it is not worth the effort. If it will cause too
>>much technical debt, then removing it makes sense.  But don't just remove
>>it because you don't believe there is a need for it.
>>
>>Thanks,
>>Dave Shine
>>
>>
>>-----Original Message-----
>>From: Eli Collins [mailto:eli@cloudera.com]
>>Sent: Tuesday, March 20, 2012 8:38 PM
>>To: hdfs-dev@hadoop.apache.org
>>Subject: [DISCUSS] Remove append?
>>
>>Hey gang,
>>
>>I'd like to get people's thoughts on the following proposal. I think we
>>should consider removing append from HDFS.
>>
>>Where we are today.. append was added in the 0.17-19 releases
>>(HADOOP-1700) and subsequently disabled (HADOOP-5224) due to quality
>>issues. It and sync were re-designed, re-implemented, and shipped in
>>0.21.0 (HDFS-265). To my knowledge, there has been no real production use.
>>Anecdotally people who worked on branch-20-append have told me they think
>>the new trunk code is substantially less well-tested than the
>>branch-20-append code (at least for sync, append was never well tested).
>>It has certainly gotten way less pounding from HBase users.
>>The design, however, is much improved, and people think we can get hsync
>>(and append) stabilized in trunk (mostly testing and bug fixing).
>>
>>Rationale follows..
>>
>>Append does not seem to be an important requirement, hflush was. There
>>has not been much demand for append, from users or downstream projects.
>>Hadoop 1.x does not have a working append implementation (see
>>HDFS-3120; the branch-20-append work was focused on sync, not on getting
>>append working), and what is there is not enabled by default. Because
>>downstream projects will want to support Hadoop 1.x releases for years,
>>most will not introduce dependencies on append anyway. This is not to say demand does
>>not exist, just that if it does, it's been much smaller than security,
>>sync, HA, backwards compatible RPC, etc. This probably explains why, over
>>5 years after the original implementation started, we don't have a stable
>>release with append.
>>
>>Append introduces non-trivial design and code complexity, which is not
>>worth the cost if we don't have real users. Removing append means we have
>>the property that HDFS blocks, when finalized, are immutable.
>>This significantly simplifies the design and code, which significantly
>>simplifies the implementation of other features like snapshots,
>>HDFS-level caching, dedupe, etc.
>>
>>The vast majority of the HDFS-265 effort is still leveraged w/o append.
>>The new data durability and read consistency behavior was the key part.
>>
>>GFS, which HDFS' design is based on, has append (and atomic record
>>append) so obviously a workable design does not preclude append.
>>However we also should not ape the GFS feature set simply because it
>>exists. I've had conversations with people who worked on GFS that regret
>>adding record append (see also
>>http://queue.acm.org/detail.cfm?id=1594206). In short, unless append is a
>>real priority for our users I think we should focus our energy elsewhere.
>>
>>Thanks,
>>Eli
>>
>
