couchdb-user mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Replication vs. Compaction
Date Tue, 18 Feb 2014 11:38:46 GMT

On 31 Jan 2014, at 20:08, Jason Smith <jhs@apache.org> wrote:

> Anyway I think the broader point is, compaction is for compacting databases
> (removing old document revisions), and replication is for making a copy of
> a database (or subset). If compaction is causing downtime, that is a
> separate bug to discuss; compaction itself should be totally transparent.
> 
> Jens (incidentally it's nice to talk with you again): the compactor will
> notice that it has not caught up yet, and it will run again from the old
> "end" to the real end. Of course, there may be changes during that run too,
> so it will repeat. Usually each iteration has a much, much smaller window.
> In practice, you tend to see one "not caught up" message in the logs, and
> then it's done.



> However there is a pathological situation where you are
> updating faster than the compactor can run, and you will get an infinite
> loop (plus very heavy i/o and filesystem waste as the compactor is
> basically duplicating your .couch into a .couch.compact forever).

Just a little clarification on this point: CouchDB will try to catch up,
I think 10 times, before giving up and reporting the result in the logs.
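The catch-up loop described above can be sketched as a toy simulation. To be clear, this is only an illustration: CouchDB's real compactor is written in Erlang, and every name below (FakeDB, copy_live_revisions, the exact semantics of update_seq) is invented for this sketch, not CouchDB's actual API.

```python
MAX_PASSES = 10  # per Jan: CouchDB tries roughly 10 times before giving up

class FakeDB:
    """Toy stand-in for a database receiving writes during compaction."""

    def __init__(self, initial_seq=1024, writes_first_pass=256):
        self.update_seq = initial_seq        # current end of the changes feed
        self._incoming = writes_first_pass   # writes landing during a pass

    def copy_live_revisions(self, start, end):
        # Pretend to copy sequences (start, end] into the .compact file.
        # Meanwhile concurrent writes land; each pass covers a smaller
        # window, takes less time, and therefore sees fewer new writes.
        self.update_seq += self._incoming
        self._incoming //= 2

def compact(db):
    """Sketch of the catch-up loop: copy to a snapshotted 'end', then
    repeat over whatever arrived in the meantime, up to MAX_PASSES."""
    copied_through = 0
    for _ in range(MAX_PASSES):
        target = db.update_seq                       # snapshot the current end
        db.copy_live_revisions(copied_through, target)
        copied_through = target
        if db.update_seq == copied_through:
            return True                              # caught up: safe to swap files
    return False                                     # pathological case: writes win
```

With the shrinking-window assumption above, each pass halves the backlog and the loop converges well inside the retry budget; if writes never slow down, the loop exhausts its passes and reports failure, matching the infinite-duplication scenario Jason warns about.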

Best
Jan
-- 
> 
> 
> On Sat, Feb 1, 2014 at 12:59 AM, Jens Alfke <jens@couchbase.com> wrote:
> 
>> 
>> On Jan 31, 2014, at 9:46 AM, Mark Hahn <mark@hahnca.com> wrote:
>> 
>>> It wouldn't matter if it did.  Within the same server linux
>>> short-circuits http to make it the same as unix sockets, i.e. very
>>> little overhead.
>> 
>> I think you mean it short-circuits TCP :)
>> There's extra work involved in HTTP generation & parsing no matter what
>> transport you're sending it over. And then the replicator is doing a bunch
>> of JSON and multipart generation/parsing.
>> Whereas the compactor, I would imagine, is mostly just making raw
>> read/write calls while walking the b-tree.
>> 
>> Anyway; this makes me wonder what happens when changes are made to a
>> database during compaction. The compaction process works off a
>> snapshot of the database from the point it started, so it's not going
>> to copy over new changes. Does that mean they get lost, or does the
>> compactor have extra smarts to run a second phase where it copies over all
>> revs created since the snapshot?
>> 
>> —Jens

