From "ara.t.howard" <ara.t.how...@gmail.com>
Subject Re: uuid, auto-increment, et al.
Date Mon, 20 Oct 2008 16:10:51 GMT

On Oct 19, 2008, at 11:34 PM, Antony Blakey wrote:

> 3. Use couchdb in a single-writer multiple-reader scenario. If you  
> only do that for those activities that require uniqueness then you  
> have consistency issues to deal with because replication is  
> asynchronous. One way to do that is to switch a session to the  
> writable server as soon as you need uniqueness. The single writer  
> becomes a bottleneck, but this is what I'm doing because it matches  
> my information architecture.
>
> 4. Use a central specialized server to check uniqueness and generate  
> an opaque userid token that you would subsequently use as a key (you  
> shouldn't use the username as a key). An ldap server or something  
> like it. Equivalent to the option above, but the single server only  
> needs to deal with the particular operations requiring uniqueness.  
> It's still a single point of failure, but I don't think you can get  
> around that if you want synchronous global uniqueness testing.



you know, thinking about this further is causing me to wonder if even
that'd do it.  consider a single write master attempting to check
whether a given login is unique: with couch there's no way to ensure a
key is unique other than using said key as the _id for the doc.
it's possible that'd work in some cases, but consider

   user
     login : string (unique)
     password : string
     email: string (unique)

and it suddenly becomes difficult again - we cannot ensure both are  
unique with an atomic test/set.  we cannot simply combine them as a  
key because

   login: foo
   email: foo@bar.com

and

   login: bar
   email: foo@bar.com

would still combine into a unique key even though the email is
duplicated.  rather we'd have to do something like keep a master index
of all docs - or perhaps just a blank doc (an index) mapping emails to
users - and bulk update the pair of them together.
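
for what it's worth, here's a rough python sketch of that last idea -
one reservation doc per unique field, written together with the user
doc through _bulk_docs, so couch's enforcement of _id uniqueness does
the test/set for us.  the db url is made up, and it assumes the
transactional _bulk_docs of the 0.8-era couch, which rejects the whole
batch with a 409 if any _id already exists - later releases dropped
that behaviour, so treat it strictly as a sketch of the idea:

   import json
   import urllib.request
   import urllib.error

   DB = "http://localhost:5984/users"        # assumed database url

   def create_user(login, email, password):
       # one reservation doc per unique field: the field value *is*
       # the _id, so couch's _id uniqueness becomes our constraint.
       docs = [
           {"_id": "login:%s" % login, "type": "reservation"},
           {"_id": "email:%s" % email, "type": "reservation"},
           {"_id": "user:%s" % login, "type": "user",
            "login": login, "email": email, "password": password},
       ]
       body = json.dumps({"docs": docs}).encode("utf-8")
       req = urllib.request.Request(
           DB + "/_bulk_docs", data=body,
           headers={"Content-Type": "application/json"})
       try:
           urllib.request.urlopen(req)
           return True                       # both fields were free
       except urllib.error.HTTPError as e:
           if e.code == 409:
               # assuming all-or-nothing bulk semantics, nothing was
               # written: the login or the email is already taken.
               return False
           raise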

so there are probably solutions lurking, but all of them seem very
complicated for something as simple as letting a user enter a single
memorable piece of information and having the application map it
directly to the user's data.

obviously we can solve the issue by combining multiple writes and
reads, but that's not particularly good if other applications might
be using this data actively (which is of course the case with a single
writer).

to summarize the problem - what's the preferred method of ensuring
that a field of a couchdb doc is unique across all docs when that
field cannot be the _id itself?  is the only method to write the doc,
process a view to find all docs which might have a duplicate field,
and then possibly delete the original?  that's all i can imagine right
now, but it's pretty tricky in a high-concurrency situation because of
the following mutually destructive race condition:

   a - write doc with foo=bar
   b - write doc with foo=bar

   a - read list of all docs where foo==bar
   b - read list of all docs where foo==bar

   a - decide ours was a dup.  destroy it.  report error to user.
   b - decide ours was a dup.  destroy it.  report error to user.

which is full of messiness if the network or db goes down but, most  
importantly, can fail even when it's up.
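
to make the race concrete, here's roughly what each of a and b runs -
python again, with a made-up db url and an assumed view (a design doc
'dups' whose by_foo view does emit(doc.foo, null)):

   import json
   import urllib.parse
   import urllib.request

   DB = "http://localhost:5984/mydb"         # assumed database url

   def write_with_unique_foo(doc_id, doc):
       # step 1: write our doc with foo set
       body = json.dumps(doc).encode("utf-8")
       req = urllib.request.Request(
           "%s/%s" % (DB, doc_id), data=body, method="PUT",
           headers={"Content-Type": "application/json"})
       rev = json.load(urllib.request.urlopen(req))["rev"]

       # step 2: read the list of all docs where foo == our value
       key = urllib.parse.quote(json.dumps(doc["foo"]))
       rows = json.load(urllib.request.urlopen(
           "%s/_design/dups/_view/by_foo?key=%s" % (DB, key)))["rows"]

       # step 3: decide ours was a dup and destroy it.  both a and b
       # can reach this branch at once - each sees two rows, each
       # deletes its own doc, and foo=bar ends up existing nowhere.
       if len(rows) > 1:
           req = urllib.request.Request(
               "%s/%s?rev=%s" % (DB, doc_id, rev), method="DELETE")
           urllib.request.urlopen(req)
           raise ValueError("foo=%r is already taken" % doc["foo"])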

what am i missing when it comes to attempting to model a doc field  
which should contain a unique value?

cheers.

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being  
better. simply reflect on that.
h.h. the 14th dalai lama



