couchdb-user mailing list archives

From "Nick Johnson" <>
Subject Re: uuid, auto-increment, et al.
Date Mon, 20 Oct 2008 21:29:00 GMT
On Mon, Oct 20, 2008 at 6:34 AM, Antony Blakey <> wrote:

> If you want to ensure that the username is unique at the time the user
> enters it, then you need a central synchronous service. Using the
> username/password as a pair isn't a good idea because it only takes two
> naive/lazy users to use a similar password (based say on their username? :)
> for collision to subsequently occur.
> I've been considering this in a production environment, and I saw four
> solutions:
> 1. Append some form of unique id from the server you are currently talking
> to i.e. checksum then machine uuid + process. Any checksum is going to have
> some chance of global collision, but it could be made vanishingly small. Not
> great for the user because they have a complicated username.
> 2. Define your user interaction such that it can deal with subsequently
> needing to add some suffix to the username e.g. when you get a replication
> conflict (which could involve a number of conflicts equal to the number of
> writable replicas), you amend some/all of the names to include a serial
> number and then email the user. This complicates things for the user, and
> they end up with a username they haven't chosen, or they may not see the
> email and end up abusing tech support etc etc.
> 3. Use couchdb in a single-writer multiple-reader scenario. If you only do
> that for those activities that require uniqueness then you have consistency
> issues to deal with because replication is asynchronous. One way to do that
> is to switch a session to the writable server as soon as you need
> uniqueness. The single writer becomes a bottleneck, but this is what I'm
> doing because it matches my information architecture.

3b. Determine the shard to write to based on a hash of the key you're
inserting (or part of it, if you want multiple-document transactions to work
properly). Since every document has only a single authoritative write
server, you can ensure uniqueness/atomicity without having the bottleneck or
single point of failure of a single global master.
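
A minimal sketch of the routing in 3b, assuming a fixed list of writable shards and the "account-<login>" document id scheme mentioned further down the thread. The shard URLs and the function name `shard_for_key` are hypothetical, and SHA-1 is just one reasonable choice of hash:

```python
import hashlib
from typing import Sequence

# Hypothetical shard URLs; any stable, ordered list of write servers works.
SHARDS = [
    "http://couch-a.example:5984",
    "http://couch-b.example:5984",
    "http://couch-c.example:5984",
]

def shard_for_key(key: str, shards: Sequence[str]) -> str:
    """Return the single authoritative write server for a document key."""
    # Hash the key so that every node computes the same answer independently.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Reduce the first 8 bytes of the digest to an index into the shard list.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# Every signup for login "foo" is routed to the same server, so a second
# attempt to create "account-foo" conflicts at write time, not at replication.
target = shard_for_key("account-foo", SHARDS)
```

With plain modulo hashing, changing the number of shards remaps most keys, so the list has to stay stable or be migrated deliberately.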

> 4. Use a central specialized server to check uniqueness and generate an
> opaque userid token that you would subsequently use as a key (you shouldn't
> use the username as a key). An ldap server or something like it. Equivalent
> to the option above, but the single server only needs to deal with the
> particular operations requiring uniqueness. It's still a single point of
> failure, but I don't think you can get around that if you want synchronous
> global uniqueness testing.
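
For comparison, option 4 can be sketched as a tiny in-process registry. This is only an illustration under stated assumptions: the class name `UsernameRegistry` is hypothetical, state lives in memory, and a real deployment would sit behind a network service such as the LDAP server suggested above:

```python
import threading
import uuid
from typing import Dict, Optional

class UsernameRegistry:
    """Single synchronous authority for username uniqueness (illustrative)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tokens: Dict[str, str] = {}  # username -> opaque userid token

    def register(self, username: str) -> Optional[str]:
        """Return a new opaque token, or None if the username is taken."""
        # The lock makes check-and-insert atomic, which is the whole point
        # of routing these operations through one server.
        with self._lock:
            if username in self._tokens:
                return None
            token = uuid.uuid4().hex  # opaque id, used as the document key
            self._tokens[username] = token
            return token

registry = UsernameRegistry()
first = registry.register("foo")   # succeeds and returns a token
second = registry.register("foo")  # rejected at signup time
```

The token, not the username, then becomes the key of the user document, so replication between the CouchDB nodes never sees two documents competing for the same login.
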
> On 20/10/2008, at 3:17 PM, ara howard wrote:
>> i know counting objects, aka distributed auto-increment, in couch is
>> considered bad form.  but let me propose a scenario and feel out people's
>> thoughts on a specific topic, in the interest of solving what i think *must*
>> be a solvable problem when using couch for an actual, real, live distributed
>> system.
>> so let's say we want to store something
>>  login: foo
>>  password: bar
>> in a couchdb system, to authenticate users.  clearly, when given a login,
>> we want to lookup a given login by said login and validate a password.
>> so consider this a bit - we could store docs using "account-#{ login }" or
>> some other permutation of the login name - the md5.. whatever...
>> this obviously isn't great - two users signing up on two different nodes
>> will cause a collision at replication time, but not at sign up time, meaning
>> it'd be nearly impossible to actually create a system with multi-master
>> nodes that would allow something as simple as user signup without crazy
>> after the fact email resolution requiring a user to re-signup iff their
>> login was a dup.
>> okay, take two, let couch generate the uuid, and replication proceeds as
>> planned.  all is well.  that is, until you want to authenticate a user...
>> doing a search based on
>>  emit( doc.login, doc )
>> returns 14 results.  two of them have the same password.  which user *is*
>> this client logging in?
>> so this seems like a real wart: replication is *useless* without a better
>> mechanism for generating uuids.  clearly we cannot expect a user to login
>> via uuid, and clearly we cannot use the login, nor login:password combined
>> as the uuid since that would create retro-active signup failures...
>> so, in a situation like this, requiring a unique set of data across all
>> replicating systems, what would the 'couch way' be?
>> i think i'm stuck thinking inside a box and would love some insight to get
>> out of it but, for now, i feel like the distributed and replicated nature of
>> couch, while solving a host of issues, seems to open up vastly more
>> complicated ones in the process.
>> kind regards.
>> a @
>> --
>> we can deny everything, except that we have the possibility of being
>> better. simply reflect on that.
>> h.h. the 14th dalai lama
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
> There are two ways of constructing a software design: One way is to make it
> so simple that there are obviously no deficiencies, and the other way is to
> make it so complicated that there are no obvious deficiencies.
>  -- C. A. R. Hoare
