couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Randall Leeds <randall.le...@gmail.com>
Subject Re: Unique instance IDs?
Date Wed, 14 Dec 2011 08:55:39 GMT
On Tue, Dec 13, 2011 at 20:08, Alex Besogonov <alex.besogonov@gmail.com> wrote:
> On Mon, Dec 12, 2011 at 10:26 PM, Paul Davis
> <paul.joseph.davis@gmail.com> wrote:
>>> * Merkle trees are great for two-way synchronization, but it's not immediately
clear to me how you'd use them to bootstrap a single source -> target replication.  I
might just be missing a straightforward extension of the tech here.
>> This is the point that's important with checksums and so on. Merkle
>> trees are great when you want to mirror structured data but CouchDB
>> replication is a mirror operation. Think, N db's replicating to a
>> central DB. you have a mixture of things which breaks checksums (or at
>> least any obvious application I can think of given our internal
>> structures)
> Uhm. What are the things that break checksums? Right now revision IDs
> are _almost_
> deterministic and it's not that hard to make them completely
> deterministic. And for
> replication purposes nothing else matters.
>
> To be exact: the only entity used for ID generation is '[Deleted,
> OldStart, OldRev, Body, Atts2]'
> tuple and only 'Atts2' field can be non-deterministic. And that can be
> fixed (with other minor
> forward-looking features like explicit versioning).
>
> Then it's easy to devise a protocol to replicate based on hash trees.
> I'm thinking about
> this protocol:
> 1) The current state of replicated database is identified by a hash.
> Suppose that we
> have unidirectional replication A->B.
>
> Let's denote state of the initial database A as A1 and B's as B1.
>
> We store the ancestry as a list of hashes outside database (so it
> doesn't influence the
> hash of the database).
>
> 2) As the first step B sends its list of replication ancestry.
>
> It's actually not even required to send the whole hashes each time,
> just send the first
> 4 bytes of each hash. That way even 1 million records of replication
> history would take
> only 4Mb. The 'A' server then replies with its own set of hashes with
> the matching
> initial bytes. If there are none, then the client falls back to the
> usual replication.
>
> So at this step 'B' knows the most recent common ancestor and requests
> the changes
> that have happened since that point of time. Each changeset, naturally, has its
> own hash.
>
> 3) After these changes are applied and merged, B's state is the A1
> state plus all the
> B's changes that might have happened ever since. Then B stores the hashes
> of the changesets that have been applied.
>
> That's it. Should work, as far as I see (it's 3am local time, so I
> might miss something).
>
> Overhead: 16 bytes for the hash information for each changeset.

I think you miss the point that was made above about mirrors, still,
unless I misunderstand. B may have other changes interleaves with
those received from A, whether from interactive updates or other
replications, making its hashes different.

Mime
View raw message