couchdb-user mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: Replication vs. Compaction
Date Mon, 03 Feb 2014 18:28:10 GMT
On Jan 31, 2014, at 3:43 PM, Jens Alfke <jens@couchbase.com> wrote:

> On Jan 31, 2014, at 12:07 PM, Boaz Citrin <bcitrin@gmail.com> wrote:
> 
>> But if replication only copies the leaf then it makes sense that it is
>> faster, at least on the same machine. Instead of balancing a tree it just
>> copies a single revision.
> 
> Um, no. The copied revision has to be inserted into the tree on the target
> database. Worse, the target database is assumed to be 'live' during the whole
> process, so its tree can't be updated as efficiently as during a compaction,
> where the new database file isn't going to be used at all until the whole
> procedure finishes.
> 
> Sorry to pull rank, but while I haven't worked on CouchDB itself, I've written
> 1½ CouchDB-compatible replicators, and I've worked on a C-based compactor for
> CouchDB-format b-tree files. I'm pretty sure that compaction is a lot faster.
> There's just much less work that it has to do.
> 
> I agree with Jason that you probably need a faster server (or disk) that will
> let you compact effectively.
> 
> —Jens

Agreed, and also worth pointing out that we've developed a compactor that is far more efficient
than the one in master.  It uses less I/O and generates a smaller file to boot:

https://git-wip-us.apache.org/repos/asf?p=couchdb-couch.git;a=commit;h=5d3753d0662cfa676fdf65d0a543be205499ec11

Hopefully we can land it soon.  Regards,

Adam
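
For readers comparing the two operations discussed in this thread, here is a minimal Python sketch of how each one is requested over CouchDB's HTTP API: `POST /{db}/_compact` for compaction and `POST /_replicate` for replication. The base URL, the database names, and the helper function names are assumptions for illustration and do not come from the thread. The requests are only built here, not sent, so nothing runs against a server.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local CouchDB node on the default port.
BASE_URL = "http://127.0.0.1:5984"
JSON_HEADERS = {"Content-Type": "application/json"}


def compaction_request(db, base_url=BASE_URL):
    """Build (but do not send) a POST /{db}/_compact request.

    Compaction writes a new database file that is swapped in only
    once the whole procedure has finished.
    """
    url = f"{base_url}/{urllib.parse.quote(db, safe='')}/_compact"
    return urllib.request.Request(
        url, data=b"", headers=JSON_HEADERS, method="POST"
    )


def replication_request(source, target, base_url=BASE_URL):
    """Build (but do not send) a POST /_replicate request.

    Replication inserts each copied revision into the tree of the
    target database, which may be in live use during the process.
    """
    body = json.dumps(
        {"source": source, "target": target, "create_target": True}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_replicate", data=body, headers=JSON_HEADERS, method="POST"
    )


if __name__ == "__main__":
    # Hypothetical database names, used only to show the request shapes.
    for req in (
        compaction_request("mydb"),
        replication_request("mydb", "mydb_copy"),
    ):
        print(req.get_method(), req.full_url)
```

To actually issue either request, pass it to `urllib.request.urlopen` with suitable admin credentials. Timing the two against the same database on your own hardware is the most direct way to check the claim that compaction does less work.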