couchdb-user mailing list archives

From: Jim Klo
Subject: Re: Document Timestamp On Replication
Date: Thu, 05 May 2011 22:59:39 GMT
Besides our use not being applicable inside the browser, I'm not sure that works for our use
case, as we explicitly don't want any updates.  Our end solution ends up being a REST service
to be used by others.

We literally want a snapshot of the view at a specific point in time, and to be able to operate
on that snapshot without it changing.

A good physical analogy: I want to take a bucketful of water (documents) from a lake (CouchDB)
being fed by a river (continuous data inserts/updates), and empty the bucket by the spoonful
(pagination). When I'm finished emptying the bucket, I'll go back to the lake and refill the bucket
(new request). The lake now has more water (documents) than when I took the first
bucketful, since it's fed by the river (continuous inserts/updates of new documents).  In this
scenario, the state of the bucket is only affected by operations done with the spoon; the
lake has no effect upon the bucket. This is very close to what we want to be able to achieve
using CouchDB.

To extend the analogy a bit to show what we do not want to occur, and desire to prevent
[as I believe this is close to how CouchDB actually works]: suppose we had a hose that continuously
filled the bucket from the lake while we are emptying the bucket by the spoonful.  We could easily
get into a state where the bucket begins to overflow, or where we can never actually empty the bucket
unless the rate of emptying with the spoon exceeds the rate at which the bucket
fills.  I can't change the rate at which bucket emptying occurs, nor can I predict or change
the rate at which the hose fills the bucket from the lake.  Ultimately, in this extension, how do we
get rid of the hose?  It seems update_seq=true gets me part of the way there. What seems
to be missing and really what I need/want is a before=<seq id> instead of a since=<seq
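
Lacking such an option, one client-side approximation is the four-step workaround
Owen Marshall describes in the quoted thread below: capture the sequence number, run the
view, ask _changes what happened since that sequence, and filter those document IDs out
of the captured rows. The following is only a minimal sketch of the filtering step in
Python. The function name and the sample rows are hypothetical, and the HTTP calls are
left out. Note the limitation: a document that was merely updated after the captured
sequence is dropped entirely rather than restored to its earlier state.

```python
def filter_to_snapshot(view_rows, changes_after_seq):
    """Drop view rows whose documents changed after the captured sequence.

    view_rows:         the "rows" list from a view response
    changes_after_seq: the "results" list from _changes?since=<captured seq>
    """
    changed_ids = {change["id"] for change in changes_after_seq}
    return [row for row in view_rows if row["id"] not in changed_ids]


# Hypothetical data mirroring the A, B, C, D timeline in the quoted thread:
# D arrived by replication after the sequence number was captured.
view_rows = [
    {"id": "A", "key": "A", "value": 1},
    {"id": "B", "key": "B", "value": 1},
    {"id": "C", "key": "C", "value": 1},
    {"id": "D", "key": "D", "value": 1},
]
changes_after_seq = [{"seq": 4, "id": "D", "changes": [{"rev": "1-abc"}]}]

snapshot = filter_to_snapshot(view_rows, changes_after_seq)
print([row["id"] for row in snapshot])  # prints ['A', 'B', 'C']
```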

Jim Klo
Senior Software Engineer
Center for Software Engineering
SRI International

On May 5, 2011, at 2:07 PM, Chris Anderson wrote:

> Not sure I follow all the requirements, but here is what I've done in the past.
> on page load: query the view with update_seq=true
> render the screen with up to date data as of seq X
> open a changes request with since=X&include_docs=true
> each doc that comes down the pipe, run the map function again (in the
> browser) and take whatever is emitted and stick it in your
> datastructure that represents the view (or just directly update the
> dom). also if an old version of the doc emitted something different,
> remove whatever stuff in your in-page representation corresponds to
> the old version of the doc.
> now you have a screen that is kept up to date with a consistent
> representation of what you'd get in a hard-reload, with a
> transactional guarantee that no updates will be skipped.
> Chris
> On Wed, May 4, 2011 at 1:21 PM, Owen Marshall wrote:
>> On 05/04/2011 04:13 PM, Eli Stevens (Gmail) wrote:
>>> 11:59 - Document D inserted on Node 2.  Replication hasn't happened yet.
>>> 12:00 - First access of view page 1 on Node 1.  Only A, B, C are present.
>>> 12:01 - D is replicated to Node 1.
>> Mmm, yes, you're absolutely correct; depending on that view would carry
>> with it the risk of an update race. It would (likely) work if replicates
>> were consistently low-latency, but that's not a guarantee.
>> Correct me if I'm wrong, but that view would work if:
>> 1. you capture last_seq from _changes pre-view run
>> 2. run the view, capturing the output
>> 3. check _changes for any updates since=your captured last_seq
>> 4. filter those IDs out of your captured view.
>> Yuck.
>> --
>> Owen Marshall
>> FacilityONE
>> | (502) 805-2126
> -- 
> Chris Anderson
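
For comparison, the recipe in Chris's quoted message (query the view with
update_seq=true, then follow _changes with since=X&include_docs=true and re-run the map
function on each document) might be sketched roughly as below. This is an assumption-laden
illustration in Python rather than browser JavaScript: map_doc is a hypothetical map
function, the in-memory view is a plain dict keyed by document ID, and the HTTP requests
are omitted.

```python
def map_doc(doc):
    """Hypothetical map function: emit (key, value) pairs for a document."""
    if doc.get("type") == "post":
        yield doc["title"], 1


def apply_change(view, change):
    """Apply one _changes row (fetched with include_docs=true) to the view.

    view maps document ID -> list of (key, value) rows emitted for it.
    """
    doc_id = change["id"]
    # Remove whatever the old version of the document emitted.
    view.pop(doc_id, None)
    doc = change.get("doc")
    if doc is not None and not change.get("deleted"):
        rows = list(map_doc(doc))
        if rows:
            view[doc_id] = rows
    return view


# Hypothetical state as of sequence X, then two later changes.
view = {"a": [("Hello", 1)]}
apply_change(view, {"id": "a", "doc": {"_id": "a", "type": "post", "title": "Hi"}})
apply_change(view, {"id": "b", "doc": {"_id": "b", "type": "comment"}})
print(view)  # prints {'a': [('Hi', 1)]}
```

This keeps a display current, which is the opposite of the frozen snapshot asked for
above; it is shown only to make the quoted steps concrete.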
