couchdb-dev mailing list archives

From Adam Kocoloski <>
Subject Re: Unique instance IDs?
Date Tue, 13 Dec 2011 03:12:25 GMT
On Dec 12, 2011, at 8:25 PM, Jason Smith wrote:

> On Tue, Dec 13, 2011 at 8:03 AM, Paul Davis <> wrote:
>> Having a UUID for every database created is the ideal
>> harmonious-to-theory manifestation of "what is a db?" but we have to
>> deal with reality when people may copy a file which makes things a bit
>> weird when there are two instances of a UUID db.
> You didn't say "harsh reality," but to list some legitimate situations
> where people might copy .couch files:
> * Restoring from backups
> * Cloning a VMWare image
> * Booting an EC2 AMI
> * NAS storage clusters
> * Couchbase mobile bootstrapping
>>> There's actually no problem with moving DBs around today, except that
>>> replication starts over (unless you change host names to match).
>> The "except that replication starts over" is a very significant caveat
>> that I would say contradicts the entire "no problem" description.
> Nobody has shown that "replication starts over" is bad. The implicit
> assumption is that starting over is costly. At present, yes, that is
> true, but that's mostly a bunch of "no-op" round-trips diffing the
> revs.
> If there were a hypothetical single query which let the receiver
> assess its exact relationship to an arbitrary sender's data, I don't
> think "starts over" would sound as awful.
> -- 
> Iris Couch

Starting over is quite painful for a large target database.  Streaming _changes from the source
is cheap, but each _missing_revs / _revs_diff API call involves a bunch of random id_tree lookups
on the target.  If you've got spinning rust and IDs that don't follow the sequence numbers,
these "no-op" checks basically top out around 100 IDs / sec / spindle.  Assume 50 MM (50 million)
documents in the target DB and you're looking at roughly 500,000 seconds, close to a week, of
no-ops before the real replication starts up again.  Not pretty.
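
The back-of-the-envelope estimate above can be reproduced with a short sketch. The function name and the `spindles` parameter are hypothetical, and linear scaling across spindles is an assumption suggested only by the "per spindle" wording, not a measured result.

```python
SECONDS_PER_DAY = 86_400


def noop_check_days(doc_count, ids_per_sec_per_spindle=100, spindles=1):
    """Estimate days spent on no-op revision checks before real replication resumes.

    Assumes one random id_tree lookup per document on the target and
    (hypothetically) linear scaling with the number of spindles.
    """
    seconds = doc_count / (ids_per_sec_per_spindle * spindles)
    return seconds / SECONDS_PER_DAY


# 50 million documents at ~100 IDs / sec on a single spindle
print(f"{noop_check_days(50_000_000):.1f} days")  # prints "5.8 days"
```

At 100 IDs / sec, 50 million lookups take 500,000 seconds, which is a little under six days, in line with the "week of no-ops" figure.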

