incubator-couchdb-user mailing list archives

From Dave Cottlehuber <...@jsonified.com>
Subject Re: huge attachments - experience?
Date Tue, 26 Mar 2013 10:59:12 GMT
On 25 March 2013 22:44, Jens Alfke <jens@couchbase.com> wrote:
>
> On Mar 25, 2013, at 1:13 PM, svilen <az@svilendobrev.com> wrote:
>
> As i don't really need more than 1 version back, i'm playing with idea
> of using couchdb for that. Either putting the files as attachments, or
> if not possible, using it as filesystem-miming synchronised metadata,
> with appropriate listeners reacting on changes (like rename, mv, etc).

+1 to everything Jens & Nils said, with 2 more points.

If you store only metadata in couch, and use a hash (like md5) of the
data instead of the actual filename to point to the stored files on
disk, that is quite attractive. Renames and moves are all internal to
couchdb, as the data hasn't changed. It will deduplicate itself as well
if you have multiple copies (e.g. revisions of docs).
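
A minimal sketch of that hash-addressed layout, in Python. The
function names, the store directory and the metadata field names
("path", "name", "md5", "size") are all made up for illustration;
nothing here is a CouchDB API, and the returned dict is just what you
might save as the metadata document.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading in chunks so huge files are fine."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store_file(src, store_dir):
    """Copy src into store_dir under its digest and return a metadata dict."""
    src = Path(src)
    store_dir = Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)
    digest = file_digest(src)
    blob = store_dir / digest
    if not blob.exists():  # identical content is only stored once
        shutil.copyfile(src, blob)
    # Hypothetical metadata document; field names are illustrative only.
    return {
        "path": str(src),
        "name": src.name,
        "md5": digest,
        "size": src.stat().st_size,
    }


def rename(doc, new_name):
    """A rename only changes metadata; the stored file is left untouched."""
    updated = dict(doc)
    updated["name"] = new_name
    return updated
```

Two files with the same bytes end up as one file in the store, and
rename() never touches the disk, which is the whole attraction.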

The downside of putting stuff outside couch is that you need to
manage yourself the things you would otherwise get for free:

- easy replication model
- deletion handling (how many docs have this file, should I delete
this file now because the document attachment was deleted, etc)
- streaming of data from within couchdb
- inbuilt compression
- keeping replication partners in sync ("I don't need this doc anymore,
but the others don't yet have the updated copy" type problems, esp. in
a mesh replication topology)
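
To give a feel for the deletion handling point, here is a rough Python
sketch of the bookkeeping you would take on. It assumes each metadata
doc carries the file hash in a hypothetical "md5" field and that
deleted docs show up with "_deleted": true, as CouchDB tombstones do.
It only covers a single node's view of the world and does nothing
about the replication partner problem in the last bullet.

```python
from collections import Counter


def blob_refcounts(docs):
    """Count how many live metadata docs point at each stored file hash."""
    return Counter(doc["md5"] for doc in docs if not doc.get("_deleted"))


def blobs_safe_to_delete(docs, stored_hashes):
    """Return the stored hashes that no live document references any more."""
    live = blob_refcounts(docs)
    return sorted(h for h in stored_hashes if live[h] == 0)
```

A file is only a candidate for removal once its count drops to zero,
since several docs may share the same content.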

The other nasty thing about attachments in couch is that if there is
a failure during replication, we can't restart part-way through. And
as they're stored directly on disk, that waste is duplicated both on
the network and in storage inside the DB file. This may or may not be
a problem for your use case.

A+
Dave
