couchdb-dev mailing list archives

From "Kevin R. Coombes" <>
Subject Re: Unique instance IDs?
Date Thu, 02 Feb 2012 15:43:32 GMT
I don't think it's such a slam-dunk, but maybe I'm guilty of a 
goal-tending violation.

It depends on whether you intend to keep replicating the two copies to 
keep them synchronized.  If you want that (and most of the time, I do), 
then I think it's an advantage to keep the same "UUID".  The (matching) 
internal UUIDs indicate that you think of these two things as the same 
abstract entity.  The full URL distinguishes the "physical" copies.

If, on the other hand, you expect the copies to evolve into something 
different over time, then I can see how you might want different 
UUIDs.  But again, the full, easy-to-construct URL distinguishes them.
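
To make that concrete, here is a minimal Python sketch of the idea. The 
hostnames, the database name "essays", the UUID value, and the helper 
doc_url are all made up for illustration; only the URL layout 
(http://hostname:5984/database/UUID) comes from the earlier message.

```python
from urllib.parse import quote


def doc_url(hostname: str, database: str, doc_id: str, port: int = 5984) -> str:
    """Build the URL of one physical copy of a document (hypothetical helper)."""
    # Percent-encode the path segments so unusual names stay valid in a URL.
    db = quote(database, safe="")
    doc = quote(doc_id, safe="")
    return f"http://{hostname}:{port}/{db}/{doc}"


# Made-up UUID standing in for a document's _id.
doc_id = "6e1295ed6c29495e54cc05947f18c8af"

# Two replicated copies on two hypothetical hosts.
original = doc_url("alpha.example.com", "essays", doc_id)
replica = doc_url("beta.example.com", "essays", doc_id)

# The trailing UUID matches, so both are the same abstract entity...
same_entity = original.rsplit("/", 1)[-1] == replica.rsplit("/", 1)[-1]
# ...while the full URLs differ, so the physical copies stay distinguishable.
distinct_copies = original != replica
```

The UUID alone answers "is this the same document?", and the full URL 
answers "which copy am I looking at?".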

As far as I can tell, the existing system still lets you have it both ways.

On 2/2/2012 8:45 AM, Robert Newson wrote:
> ... until you copy the database (and its uuid) and have two databases
> with the same uuid. This has always been the slam-dunk argument
> against database uuid's.
> B.
> On 2 February 2012 09:41, Kevin R. Coombes<>  wrote:
>> For CouchDB, I think UUIDs are clearly the way to go.  After all, given the
>> UUID, database,  and hostname, you can construct the desired URL directly by
>> forming
>>     http://hostname:5984/database/UUID
>> As Noah points out, if you used this entire URL as the identifier (by which
>> I assume he means the _id field), then you would lose the ability to copy
>> the document elsewhere.  This would, of course, break replication
>> completely.
>> Keeping the UUIDs as they are gives the best of both worlds.  Easy
>> replication, and (as long as the database is hosted at the same place) an
>> easy way for humans and programs to construct stable URIs or URLs that point
>> to each document.
>>     -- Kevin
>> On 1/22/2012 12:44 PM, Noah Slater wrote:
>>> Sorry to bump this old thread, but just going through my backlog.
>>> With regard to URLs, I think there is some confusion about the purpose of a
>>> URL here.
>>> If I write a cool essay, say, and I stick that up at
>>>, then I can link to it from other places on the
>>> web using that address. I might also want to put my cool essay on Dropbox,
>>> or post it to Tumblr, or send it in an email. Now my cool essay has lots of
>>> URLs. Each one of them is perfectly valid. I don't have to go and edit the
>>> original copy at, because I am making copies of
>>> it. My cool essay is completely unaware of the URLs that are being used to
>>> point to it. And it doesn't care that many URLs point to it.
>>> Yes, URLs can be used as identifiers. But when you do this, you tie the
>>> thing you're naming to the place you're hosting it. Sometimes that is
>>> useful, other times it will cripple you. There is nothing about URLs that
>>> requires you to do this. I would hazard a guess that 99% of URLs are
>>> de-coupled from the things they point to. WebArch is much more robust when
>>> the identity of the object is de-coupled from the URL. Look at Atom: the ID
>>> element is supposed to be a URL, but they recommend a non-dereferenceable
>>> format, precisely to decouple posts from the location you happen to be
>>> hosting them this month.
>>> Hey, if we're gonna use URLs, maybe we want to go down the same route?
>>> At this point, I'm not sure what they buy us over UUIDs.
>>> Thoughts?
>>> Thanks,
>>> N
