couchdb-dev mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: replication feedback (was Re: [VOTE] Apache CouchDB 0.10.0 release)
Date Thu, 01 Oct 2009 16:15:00 GMT
On Oct 1, 2009, at 8:57 AM, Matt Goodall wrote:

> 2009/10/1 Nicholas Orr <nicholas.orr@zxgen.net>:
>> Thanks Adam,
>>
>> Hmm status page huh? let me check that out :)
>> Starting things up again and having a look....
>>
>> Ok the status page does show stuff happening and then it stopped
>> updating, and just shows this
>>
>> Replication     70c214: http://192.168.1.11:5984/many_docs/ ->
>> test    <0.136.0>       W Processed source update #18270
>>
>> Is that supposed to go away once all the docs have been replicated?
>
> Hi,
>
> I'm seeing something similar. The replication process stays in the
> status list until couchdb is restarted, however my replication is
> *not* completing. If I restart the controlling couchdb and the
> replication process it continues from where it left off but often
> stalls again. Eventually, after a number of couchdb and replication
> restarts it reaches the end.
>
> In case it's useful here's the essence of the log files for a pull  
> replication:

Nope, nothing useful in those logs -- everything looks normal.  Is  
this a continuous replication?  If so, it seems like it could be  
related to the earlier thread about replication terminating  
prematurely.  Are any proxies involved?

> <logs snipped>

> I was actually really surprised to see the _local and
> _ensure_full_commit requests in the source database's logs. Is that
> correct behaviour?

Yes, it is correct.

The _ensure_full_commit is to make sure that all replicated documents  
are safely committed to disk.  If we don't do this, we run the risk of  
skipping documents in replication in the future (if the source  
restarts and loses documents, it will reuse some update sequences, but  
the replicator will skip them because it thinks that it already  
replicated them).
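
To make the failure mode concrete, here is a toy in-memory model of it.
This is only a sketch, not how CouchDB is implemented: ToySource,
replicate, and run are made-up names, and "disk" is just a counter of
committed updates.

```python
class ToySource:
    """Toy source DB (hypothetical model, not CouchDB internals)."""

    def __init__(self):
        self.updates = []   # doc ids; update seq = position + 1
        self.committed = 0  # how many updates are durably "on disk"

    def write(self, doc_id):
        self.updates.append(doc_id)
        return len(self.updates)  # the update sequence just assigned

    def ensure_full_commit(self):
        self.committed = len(self.updates)

    def crash_and_restart(self):
        # Uncommitted updates are lost, so their sequences get reused.
        del self.updates[self.committed:]

    def changes_since(self, seq):
        return [(i + 1, d) for i, d in enumerate(self.updates) if i + 1 > seq]


def replicate(source, target, checkpoint, commit_before_checkpoint):
    """Copy changes after `checkpoint` into `target`; return the new checkpoint."""
    last_seq = checkpoint
    for seq, doc_id in source.changes_since(checkpoint):
        target.add(doc_id)
        last_seq = seq
    if commit_before_checkpoint:
        source.ensure_full_commit()
    return last_seq


def run(commit_before_checkpoint):
    """Return the docs present on the source but missing from the target."""
    source, target = ToySource(), set()
    source.write("a")
    source.write("b")
    checkpoint = replicate(source, target, 0, commit_before_checkpoint)
    source.crash_and_restart()
    source.write("c")
    source.write("d")
    replicate(source, target, checkpoint, commit_before_checkpoint)
    return set(source.updates) - target
```

In this model, run(False) leaves "c" and "d" unreplicated because they
reuse sequences 1 and 2, which the checkpoint already covers, while
run(True) leaves nothing behind.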

The _local document is used to record the replication checkpoints and  
history.  We record it on both the source and the target so that if  
the source DB is deleted and recreated, replication will start again  
from 0.
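
Roughly, the decision about where to resume can be sketched as below.
This is a simplification under stated assumptions: starting_seq is a
hypothetical helper, the field names are used for illustration, and the
real replicator does more than a single equality check.

```python
def starting_seq(source_local, target_local):
    """Pick the sequence to resume from.

    Each argument is the checkpoint document read from one side, as a
    dict, or None if that side has no such document (for example
    because the database was deleted and recreated).
    Field names here are illustrative assumptions.
    """
    if source_local is None or target_local is None:
        return 0  # one side has no record of this replication
    if source_local.get("session_id") != target_local.get("session_id"):
        return 0  # the two sides disagree about the last checkpoint
    return source_local["source_last_seq"]


# Both sides agree: resume where the last run stopped.
agreed = {"session_id": "abc", "source_last_seq": 18270}
resume_from = starting_seq(agreed, dict(agreed))

# Source was recreated, so its checkpoint document is gone: start over.
restart_from = starting_seq(None, agreed)
```

Keeping a copy on each side is what lets the missing or mismatched
document be noticed at all. With a checkpoint stored on the target
only, a recreated source would look unchanged.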

> I'm sure someone reported problems with the replication process dying
> and having to restart replication until it finally gets to end but I
> can't find the email now. To be honest, that would be fine -
> restarting just replication is easy and non-intrusive - but needing to
> restart the couchdb server to clear the replication is not nice.

Agreed, will try to find a way to reproduce this.  Best,

Adam
