couchdb-user mailing list archives

From Antony Blakey <>
Subject Re: Can I guarantee uniqueness in a field without using _id?
Date Wed, 14 Jan 2009 01:39:18 GMT

On 14/01/2009, at 11:40 AM, ara.t.howard wrote:

> assume your system will be distributed at some point and that global  
> uniqueness is hard whether in a rdbms or docms.  even auto-increment  
> will fail here.  now take the middle path - only writes need to be  
> unique because, if all of them are, objects can distribute between  
> nodes freely (since we already know they are unique inside the  
> system).
> so just make a rule - you will require a single couchdb instance to  
> be available to check uniqueness, but any couchdb instance up to  
> perform reads or writes which do not need to be unique (like  
> comments with a timestamp or something).
> so host a couchdb at something like
> use this to check the uniqueness of fields.  for instance, to see if  
> a name is in the system try to put
>  { '_id' : 'name::yourname' }
> or something your app will use to decide the uniqueness of a name in  
> the system.  think of it as a global hash you can check for a given  
> key.
> to me this seems quite reasonable, it's a nod to the fact that  
> distributed uniqueness is very hard, while at the same time  
> accepting that most applications are difficult to build without it.   
> the constraint on your system will simply be that the uniqueness  
> server must be up for some operations to succeed, but if you design  
> for it many will be able to perform many other operations in the  
> face of a uniqueness server failure.
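
A minimal sketch of that check in Python, relying on CouchDB answering
409 Conflict when a PUT targets a document id that already exists and
no matching revision is supplied. The names try_claim and claim_url
and the db_url parameter are hypothetical, and the injectable opener
exists only so the logic can be exercised without a server.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote


def claim_url(db_url, kind, value):
    """Build the document URL for a key such as 'name::yourname'."""
    doc_id = f"{kind}::{value}"
    # Percent-encode the whole id so ':' and spaces are safe in the path.
    return f"{db_url.rstrip('/')}/{quote(doc_id, safe='')}"


def try_claim(db_url, kind, value, opener=urllib.request.urlopen):
    """Return True if the key was free and is now claimed, False if taken."""
    req = urllib.request.Request(
        claim_url(db_url, kind, value),
        data=json.dumps({}).encode("utf-8"),  # an empty document is enough
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    try:
        with opener(req) as resp:
            # 201 Created (or 202 Accepted) means the id did not exist before.
            return resp.status in (201, 202)
    except urllib.error.HTTPError as err:
        if err.code == 409:
            # Conflict: a document with this id already exists.
            return False
        raise  # anything else is a real failure, not a uniqueness answer
```

Because the PUT both checks and claims the key in one request, there is
no window between "is it free?" and "take it".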

I am doing a similar thing - many read-replicas and one write replica.  
My domain tolerates unavailability of write-access, but not read- 
access. The application is distributed with each couch instance, and  
as soon as a user logs on with write access, they are switched to a  
unique write server.

I switch the user on logon to the write server, for all access,  
because otherwise they don't get any locally consistent view. Hence I  
don't distinguish between data-needing-uniqueness or not.

For each domain object, I determine the name used to identify it  
within its parent context, e.g. a user is an email address, a page in  
a CMS is a parent id + name. I then use that identity, either with a  
prefix (e.g. user:antony.blakey@...) or in a separate db, as the  
document id. As soon as I use a prefix on any data in a given db, I  
use a prefix on all data in that db that uses domain keys on ids.
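
As a rough sketch of how such ids might be built (the function names,
the "page:" prefix and the "/" separator are assumptions for
illustration, not part of my actual code):

```python
SEP = "/"  # assumed separator between a parent id and a child name


def user_id(email):
    """Document id for a user: the email address under a 'user:' prefix."""
    return "user:" + email


def root_page_id(name):
    """Document id for a top-level page (hypothetical 'page:' prefix)."""
    return "page:" + name


def page_id(parent_id, name):
    """Document id for a page: the parent's id plus the page's own name."""
    if SEP in name:
        # A separator inside the name would make two different paths
        # collapse into the same id, so refuse it.
        raise ValueError(f"page name must not contain {SEP!r}")
    return parent_id + SEP + name
```

Since every id in the db carries a prefix, a user and a page can never
collide on the same document id.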

As an aside, for hierarchic data (which covers any heterogeneous data  
with a qualified name/uniqueness constraint), if you have a single  
write point then you can assign a UUID to each node when it is  
created, and presuming that field is immutable, you can then qualify  
the child node using the parent's UUID rather than the parent's id  
(which can grow in a nasty fashion).
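
A small sketch of that scheme, with made-up field names ("uuid",
"name"), an assumed "root" qualifier for top-level nodes and "/" as
the separator:

```python
import uuid


def new_node(name, parent=None):
    """Create a node whose _id is qualified by the parent's UUID.

    The UUID is assigned once at creation and is assumed never to change.
    """
    qualifier = parent["uuid"] if parent is not None else "root"
    return {
        "_id": f"{qualifier}/{name}",  # unique among siblings of one parent
        "uuid": uuid.uuid4().hex,      # 32 hex characters, fixed length
        "name": name,
    }
```

However deep the tree gets, an id is only ever one UUID plus one name
long, and two siblings with the same name still map to the same _id,
so the write server rejects the second one.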

Antony Blakey
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

The greatest challenge to any thinker is stating the problem in a way  
that will allow a solution
   -- Bertrand Russell
