couchdb-dev mailing list archives

From Antony Blakey <antony.bla...@gmail.com>
Subject Re: Update notifications including update sequence
Date Mon, 19 Jan 2009 04:56:03 GMT

On 19/01/2009, at 2:53 PM, Paul Davis wrote:

> On Sun, Jan 18, 2009 at 10:51 PM, Antony Blakey
> <antony.blakey@gmail.com> wrote:
>> I've previously posted a solution using _external that doesn't hit
>> couch every update, and that maintains MVCC consistency and
>> lazy-update view behaviour.
>>
>
> Right. I tried looking through mark mail for a link to your
> implementation but came up empty handed. I'd contemplated something
> similar as well. The issue though is that Lucene index writers are
> AFAIK not reentrant.

It's in the thread 'couchdb' started by Tim Parkin around 20/21 December.

IndexWriters are mutexed using a lock file.

> Thus the headache of coordinating multiple random
> processes would start to suck. Lots.

My reading of the code was that there was a single process for each  
_external definition (although admittedly that was early in my  
understanding of gen_server). Major consistency issues result if  
requests to the _external aren't serialized.

>> The problem with using notifications is lack of snapshot coordination
>> between the update process and the external process.
>>
>
> I'd say this is use case dependent.

It does mean that you can't guarantee that an external request (that  
does reference a given MVCC snapshot) is getting data from the same  
snapshot.

You're right that it's use case dependent, but the issue is whether the
use case is 'free text indexing' or is a client use case. If the
latter, then you need to handle the situation where it *does* matter,
so an implementation that has random characteristics is IMO less than
optimal.

>> The synchronisation between sequential _external calls is obvious,
>> e.g. guaranteeing that the _external process sees a monotonically
>> increasing update_seq.
>>
>
> I don't follow.

I mean you'll never get a request in the context of an update_seq that
your _external process has already advanced beyond, because the
requests seen by the external are a) serialized and b) carry a
monotonically increasing sequence of update_seq values. Hence you can
safely run an update process and set a 'last_update_seq_seen' (which
is the key to avoiding hitting couch again) knowing that you never
have to backtrack.
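
As a rough sketch of that bookkeeping (not the code from the earlier
thread): assume each serialized request hands the external its
update_seq as an integer, and that fetch_changes is a hypothetical
callback, not a CouchDB API, that returns the changes in the range
(since, upto]. The class and method names are made up for
illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical callback: yields (seq, doc_id) pairs for changes with
# since < seq <= upto, e.g. by querying the database.
FetchChanges = Callable[[int, int], Iterable[Tuple[int, str]]]


class ExternalIndexer:
    """Tracks the highest update_seq already indexed by the external."""

    def __init__(self, fetch_changes: FetchChanges) -> None:
        self.fetch_changes = fetch_changes
        self.last_update_seq_seen = 0
        self.indexed: List[str] = []  # stand-in for the real index

    def handle(self, request_update_seq: int) -> bool:
        """Bring the index up to the request's update_seq.

        Returns True if the database had to be consulted, False if the
        index was already current for this request.
        """
        if request_update_seq < self.last_update_seq_seen:
            # Only possible if requests are not serialized or the
            # sequence is not monotonic, which is the assumption here.
            raise ValueError("request is older than the index")
        if request_update_seq == self.last_update_seq_seen:
            return False  # nothing new, no need to hit the database
        changes = self.fetch_changes(
            self.last_update_seq_seen, request_update_seq
        )
        for _seq, doc_id in changes:
            self.indexed.append(doc_id)
        self.last_update_seq_seen = request_update_seq
        return True
```

The ValueError branch is the backtracking case. With serialized
requests and a monotonically increasing update_seq it should never
fire, which is why the single counter is enough.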

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Human beings, who are almost unique in having the ability to learn  
from the experience of others, are also remarkable for their apparent  
disinclination to do so.
   -- Douglas Adams


