incubator-couchdb-user mailing list archives

From Antony Blakey
Subject Re: uuid, auto-increment, et al.
Date Mon, 20 Oct 2008 05:34:48 GMT
If you want to ensure that the username is unique at the time the user  
enters it, then you need a central synchronous service. Using the  
username/password as a pair isn't a good idea because it only takes  
two naive/lazy users to use a similar password (based say on their  
username? :) for collision to subsequently occur.

I've been considering this in a production environment, and I saw four  
options:

1. Append some form of unique id from the server you are currently  
talking to i.e. checksum then machine uuid + process. Any checksum is  
going to have some chance of global collision, but it could be made  
vanishingly small. Not great for the user because they have a  
complicated username.

2. Define your user interaction such that it can deal with  
subsequently needing to add some suffix to the username e.g. when you  
get a replication conflict (which could involve a number of conflicts  
equal to the number of writable replicas), you amend some/all of the  
names to include a serial number and then email the user. This  
complicates things for the user, and they end up with a username they  
haven't chosen, or they may not see the email and end up abusing tech  
support etc etc.

3. Use couchdb in a single-writer multiple-reader scenario. If you  
only do that for those activities that require uniqueness then you  
have consistency issues to deal with because replication is  
asynchronous. One way to do that is to switch a session to the  
writable server as soon as you need uniqueness. The single writer  
becomes a bottleneck, but this is what I'm doing because it matches my  
information architecture.

4. Use a central specialized server to check uniqueness and generate  
an opaque userid token that you would subsequently use as a key (you  
shouldn't use the username as a key). An ldap server or something like  
it. Equivalent to the option above, but the single server only needs  
to deal with the particular operations requiring uniqueness. It's  
still a single point of failure, but I don't think you can get around  
that if you want synchronous global uniqueness testing.
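
To make option 4 concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the registry is in-memory and single-process, usernames are treated as case-insensitive, and the names UsernameRegistry, claim and lookup are hypothetical, not part of couchdb or of any ldap library. The point is that the check and the insert happen as one atomic step on one server, and that the caller gets back an opaque token to use as the document key.

```python
import threading
import uuid


class UsernameRegistry:
    """Hypothetical central service: the only place usernames are claimed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tokens = {}  # normalized username -> opaque userid token

    def claim(self, username):
        """Return a new opaque userid token, or None if the name is taken."""
        key = username.strip().lower()  # assumption: case-insensitive names
        with self._lock:  # check and insert as a single atomic step
            if key in self._tokens:
                return None
            token = uuid.uuid4().hex  # opaque key, unrelated to the username
            self._tokens[key] = token
            return token

    def lookup(self, username):
        """Return the token for a username, or None if it is unknown."""
        with self._lock:
            return self._tokens.get(username.strip().lower())


registry = UsernameRegistry()
first = registry.claim("foo")   # succeeds and returns a token
second = registry.claim("Foo")  # same name after normalization, so rejected
```

The token from claim would then be the id of the user document on whichever replica handles the signup, so replication never sees two documents competing for the same username.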

On 20/10/2008, at 3:17 PM, ara howard wrote:

> i know counting objects, aka, distributed auto-increment in couch is  
> considered bad form.  but let me propose a scenario and feel out people's  
> thoughts on a specific topic, in the interest of solving what i  
> think *must* be a solvable problem when using couch for an actual,  
> real, live distributed system..
> so let's say we want to store something
>  login: foo
>  password: bar
> in a couchdb system, to authenticate users.  clearly, when given a  
> login, we want to lookup a given login by said login and validate a  
> password.
> so consider this a bit - we could store docs using  
> "account-#{ login }" or some other permutation of the login name - the  
> md5.. whatever...
> this obviously isn't great - two user signing up on two different  
> nodes will cause a collision at replication time, but not at sign up  
> time, meaning it'd be nearly impossible to actually create a system  
> with multi-master nodes that would allow something as simple as user  
> signup without crazy after the fact email resolution requiring a  
> user to re-signup iff their login was a dup.
> okay, take two, let couch generate the uuid, and replication  
> proceeds as planned.  all is well.  that is, until you want to  
> authenticate a user... doing a search based on
>  emit( doc.login, doc )
> returns 14 results.  two of them have the same password.  which user  
> *is* this client logging in?
> so this seems like a real wart: replication is *useless* without a  
> better mechanism for generating uuids.  clearly we cannot expect a  
> user to login via uuid, and clearly we cannot use the login, nor  
> login:password combined as the uuid since that would create  
> retroactive signup failures...
> so, in a situation like this, requiring a unique set of data across  
> all replicating systems, what would the 'couch way' be?
> i think i'm stuck thinking inside a box and would love some insight  
> to get out of it but, for now, i feel like the distributed and  
> replicated nature of couch, while solving a host of issues, seems to  
> open up vastly more complicated ones in the process.
> kind regards.
> a @
> --
> we can deny everything, except that we have the possibility of being  
> better. simply reflect on that.
> h.h. the 14th dalai lama

Antony Blakey
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to  
make it so simple that there are obviously no deficiencies, and the  
other way is to make it so complicated that there are no obvious  
deficiencies.
   -- C. A. R. Hoare
