incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Strategy for reliable _changes feed workers
Date Mon, 05 Mar 2012 21:32:09 GMT
I'd urge caution here. The _changes feed allows the replicator to
avoid reprocessing updates that the target has already seen but,
crucially, replication is not broken if the feed includes old updates.
In BigCouch, and hence a future version of CouchDB, the changes feed
can sometimes contain rows from before the since= value, in the case
of failover to a different replica of a shard.

Clearly, in BigCouch, you could not depend on the changes feed to
ensure you process an item exactly once, so I suggest it's a bad
practice to assume the same of CouchDB. Instead, I would create a view
that includes unprocessed items. Once processed (whatever that
entails), update the document to indicate it has been processed. This
will work everywhere.
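
A minimal sketch of that pattern, in Python, under stated assumptions: the
"processed" field, the "_design/work" design document, and the "unprocessed"
view name are all hypothetical, and a plain dict stands in for the database so
the logic can be run without a server. Against a real CouchDB the marking step
would be an update of the document over HTTP.

```python
# Hypothetical design document. The view emits only documents that have not
# yet been marked as processed; the "processed" field name is an assumption.
DESIGN_DOC = {
    "_id": "_design/work",
    "views": {
        "unprocessed": {
            "map": "function (doc) { if (!doc.processed) { emit(doc._id, null); } }"
        }
    },
}


def is_unprocessed(doc):
    """Python mirror of the map function's test."""
    return not doc.get("processed")


def process_pending(docs, handle):
    """Process every unprocessed document and mark it as processed.

    `docs` maps document id -> document and stands in for the database.
    `handle` is the caller's processing function.
    Returns the ids handled in this pass.
    """
    done = []
    for doc_id, doc in docs.items():
        if not is_unprocessed(doc):
            continue  # marked on an earlier pass, so skip it
        handle(doc)
        # With a real database this would be a document update, which can
        # fail with a conflict and need a retry.
        doc["processed"] = True
        done.append(doc_id)
    return done
```

Because the marker lives in the document itself, running the pass again skips
anything already handled, whichever node serves the view. If the worker dies
between handling a document and marking it, that document is handled again on
the next pass, so the handler should tolerate a repeat.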

B.

On 5 March 2012 21:13, Jens Alfke <jens@couchbase.com> wrote:
>
> On Mar 5, 2012, at 8:23 AM, Zachary Zolton wrote:
>
> How are you using the _changes feed for reliable background processing?
>
> Well, the _changes feed is a key part of the CouchDB replicator, which uses it exactly
> as you’ve described.
>
>  * What is the last sequence number processed?
>  * Have we already attempted to process this update?
>  * How many times have we failed to process this update?
>
> The replicator stores a “checkpoint” value which is the latest sequence ID that it’s
> completely processed. The logic of its full operation is pretty complex (though of course
> the source code is available.)
>
> —Jens
