incubator-couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: How fast do CouchDB propagate changes to other nodes?
Date Sun, 19 Dec 2010 01:27:29 GMT
On Sat, Dec 18, 2010 at 8:23 PM, Randall Leeds <randall.leeds@gmail.com> wrote:
> On Sat, Dec 18, 2010 at 16:41, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> Right, I probably jumped a couple steps there:
>>
>> The unique datums we have to work with here are the _id/_rev pairs.
>> Theoretically (if we ignore rev_stemming) the ordering with which
>> these come to us is roughly unimportant.
>>
>> So the issue with our history (append only) is that there's no real
>> way to order it such that we can efficiently seek through it to see
>> what we have in common (that I can think of). I.e., replication still
>> needs a way to say "I only need to send these bits". Right now it's the
>> src/dst/seq triple that lets us zip through only new edits.
>>
>> Well, theoretically, we could keep a merkle tree of all edits we've
>> ever seen and go that way, but that'd require keeping a history of
>> every edit ever seen which could never be removed.
>>
>> Granted this is just quick thinking. I could definitely be missing
>> something clever.
>>
>
> We're on the same page. I don't have anything clever yet either.
> The only other thing that's crossed my mind is some way to exchange
> information about checkpoints each participant has with a third party.
> You'd have to somehow verify that the checkpoint being presented to
> you is actually one created by the third party, which involves trust
> or verification. I like the verification route because I'd still love
> to decouple the endpoint from its hostname, the idea that I was
> stabbing quite horribly at when I prematurely proposed a couple
> patches to give databases uuids. But back to the point, something like
> "you got a bunch of edits since last we spoke, but I got these edits
> from this other endpoint, are they the same ones?" Even then, I'm not
> sure how this works without the merkle.
>

Your last bit there was exactly my idea about the Bloom filter. Just
populate the filter with hashes of the uniquely identifying bits of an
edit (the _id/_rev pair) and then send the filter around.
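
To make the Bloom filter idea concrete, here is a rough sketch in Python. It is not CouchDB code: the BloomFilter class, edit_key, edits_to_send, and the filter sizes are all made up for illustration, and it assumes an edit is identified by its _id/_rev pair as discussed above. One side builds a filter over the edits it has seen and sends it over; the other side sends only the edits the filter does not claim to contain.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes are illustrative assumptions, not tuned values."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def edit_key(doc_id, rev):
    # Hypothetical encoding of the uniquely identifying bits of an edit.
    return f"{doc_id}\x00{rev}"


def edits_to_send(local_edits, remote_filter):
    # Keep only the edits the remote filter does not claim to have seen.
    return [(d, r) for d, r in local_edits if edit_key(d, r) not in remote_filter]


# The remote endpoint builds a filter over the edits it already has.
remote_edits = [("doc-a", "1-abc"), ("doc-b", "2-def")]
remote_filter = BloomFilter()
for doc_id, rev in remote_edits:
    remote_filter.add(edit_key(doc_id, rev))

# The local endpoint has one extra edit and consults the filter before sending.
local_edits = remote_edits + [("doc-c", "1-123")]
pending = edits_to_send(local_edits, remote_filter)
```

The catch is that a Bloom filter can report false positives, so an edit the other side lacks could occasionally be skipped. It never reports false negatives, so an edit the other side already has is never resent. Any real scheme along these lines would need some fallback for the skipped edits.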
