couchdb-dev mailing list archives

From Matt Goodall <matt.good...@gmail.com>
Subject Re: replication feedback (was Re: [VOTE] Apache CouchDB 0.10.0 release)
Date Thu, 01 Oct 2009 16:46:42 GMT
2009/10/1 Adam Kocoloski <kocolosk@apache.org>:
> On Oct 1, 2009, at 8:57 AM, Matt Goodall wrote:
>
>> 2009/10/1 Nicholas Orr <nicholas.orr@zxgen.net>:
>>>
>>> Thanks Adam,
>>>
>>> Hmm status page huh? let me check that out :)
>>> Starting things up again and having a look....
>>>
>>> Ok the status page does show stuff happening and then it stopped
>>> updating, and just shows this
>>>
>>> Replication     70c214: http://192.168.1.11:5984/many_docs/ ->
>>> test    <0.136.0>       W Processed source update #18270
>>>
>>> Is that supposed to go away once all the docs have been replicated?
>>
>> Hi,
>>
>> I'm seeing something similar. The replication process stays in the
>> status list until couchdb is restarted, however my replication is
>> *not* completing. If I restart the controlling couchdb and the
>> replication process it continues from where it left off but often
>> stalls again. Eventually, after a number of couchdb and replication
>> restarts it reaches the end.
>>
>> In case it's useful here's the essence of the log files for a pull
>> replication:
>
> Nope, nothing useful in those logs -- everything looks normal.  Is this a
> continuous replication?  If so, it seems like it could be related to the
> earlier thread about replication terminating prematurely.  Are any proxies
> involved?

Not continuous, I might try that later.

No proxies involved. The two machines were on the same network, not
even a firewall between them in fact.

>
>> <logs snipped>
>
>> I was actually really surprised to see the _local and
>> _ensure_full_commit requests in the source database's logs. Is that
>> correct behaviour?
>
> Yes, it is correct.
>
> The _ensure_full_commit is to make sure that all replicated documents are
> safely committed to disk.  If we don't do this, we run the risk of skipping
> documents in replication in the future (if the source restarts and loses
> documents, it will reuse some update sequences, but the replicator will skip
> them because it thinks that it already replicated them).
>
> The _local document is used to record the replication checkpoints and
> history.  We record it on both the source and the target so that if the
> source DB is deleted and recreated, replication will start again from 0.

Ah, ok. Thanks for the explanation.
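
For anyone else reading along, here is a toy model of the two points Adam
describes. It is only a sketch under loud assumptions: `ToyDB`, `start_seq`
and `replicate` are hypothetical names, not CouchDB code, the `checkpoint`
field merely stands in for the _local document, and I assume the checkpoint
itself survives a restart. It shows how a reused update sequence gets skipped
when the source is not fully committed, and how a recreated source forces
replication to start again from 0.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyDB:
    """Hypothetical stand-in for a database; not CouchDB's real behaviour."""
    committed: Dict[int, str] = field(default_factory=dict)  # seq -> doc id, on disk
    pending: Dict[int, str] = field(default_factory=dict)    # seq -> doc id, not yet on disk
    checkpoint: Optional[int] = None  # stands in for the _local checkpoint doc

    def update_seq(self) -> int:
        return max([0, *self.committed, *self.pending])

    def write(self, doc_id: str) -> int:
        seq = self.update_seq() + 1
        self.pending[seq] = doc_id
        return seq

    def ensure_full_commit(self) -> None:
        self.committed.update(self.pending)
        self.pending.clear()

    def restart(self) -> None:
        # Assumption: uncommitted writes are lost, the checkpoint is kept.
        self.pending.clear()

    def changes_since(self, seq: int) -> List[Tuple[int, str]]:
        docs = {**self.committed, **self.pending}
        return [(s, d) for s, d in sorted(docs.items()) if s > seq]


def start_seq(source: ToyDB, target: ToyDB) -> int:
    """Resume only if both sides recorded the same checkpoint, else start at 0."""
    if source.checkpoint is not None and source.checkpoint == target.checkpoint:
        return source.checkpoint
    return 0


def replicate(source: ToyDB, target: ToyDB, full_commit: bool = True) -> List[str]:
    since = start_seq(source, target)
    copied = []
    for seq, doc_id in source.changes_since(since):
        target.write(doc_id)
        copied.append(doc_id)
        since = seq
    if full_commit:
        source.ensure_full_commit()  # the step that surprised me in the logs
    target.ensure_full_commit()
    source.checkpoint = target.checkpoint = since
    return copied
```

With `full_commit=False`, a source that restarts after replicating hands out
sequence 1 again for its next write, and the following `replicate` call copies
nothing because the checkpoint is already past it. With the default, the next
write gets a fresh sequence and is copied. A brand-new `ToyDB` used as the
source has no checkpoint, so `start_seq` returns 0.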

>
>> I'm sure someone reported problems with the replication process dying
>> and having to restart replication until it finally gets to the end but I
>> can't find the email now. To be honest, that would be fine -
>> restarting just replication is easy and non-intrusive - but needing to
>> restart the couchdb server to clear the replication is not nice.
>
> Agreed, will try to find a way to reproduce this.  Best,
>
> Adam
>
