incubator-couchdb-user mailing list archives

From Paul Davis <>
Subject Re: How fast do CouchDB propagate changes to other nodes?
Date Sun, 19 Dec 2010 00:14:25 GMT
On Sat, Dec 18, 2010 at 4:46 PM, Randall Leeds <> wrote:
> On Sat, Dec 18, 2010 at 04:00, Robert Dionne
> <> wrote:
>> On Dec 17, 2010, at 6:07 PM, Randall Leeds wrote:
>>> keeping cluster information and database metadata up to date around
>>> the cluster, but that information tends to be small and changes
>>> infrequently.
>>> However, to me this sounds like a lot of work for something that might
>>> be better solved using technologies like zeromq, particularly if
>>> logging all messages is optional.
>>> Anyway, I'm happy to talk about all of this further since I think it's
>>> kind of fascinating. I've been thinking a lot recently about how flood
>> I'm curious, is flood replication what the name implies? Broadcasting?
> I'll throw this at dev@, too.
> Yes, broadcasting.
> I've been thinking about alternative checkpoint schemes that take the
> source and destination host out of the equation and figure out some
> other way to verify common history. I imagine it's going to have to
> involve a hash tree.
> With the ability to resolve common history without having *directly*
> exchanged checkpoints, hosts could receive incremental update batches
> from different hosts if the replication graph changes over time.
> Anyway, it's just a little infant of a thought, but I think it's a
> good one to have in our collective conscious.
> Randall

Random off-the-top-of-my-head response:

I don't see anything immediately following from what you describe.
Even if you had a way of saying "I already have this revision" there's
no real way to figure out where to start once you get rid of the
src/dst/seq triplet (that I can think of).

Though an interesting observation is that replication never really
deletes anything in a history. As a quick optimization that could
lead to where you're wanting to go, you might try storing a bloom
filter for the database that holds a hash of the docid/rev pair for
all incoming edits. Then the replicator could use that to speed up
replication when it's already got edits from the source db.
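
A rough sketch of that idea in Python, purely for illustration. Nothing
here is CouchDB code: RevBloomFilter, partition_changes, and the sizing
parameters are all made-up names and values. The filter records a hash
of each docid/rev pair; a replicator could then treat "definitely not
seen" pairs as safe to transfer and fall back to its normal revision
check for the "maybe seen" ones, since a bloom filter can report false
positives but never false negatives.

```python
import hashlib


class RevBloomFilter:
    """Bloom filter over (doc_id, rev) pairs. Hypothetical, not a CouchDB API."""

    def __init__(self, num_bits=8192, num_hashes=4):
        # Sizes are arbitrary assumptions; tune them for the expected edit count.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, doc_id, rev):
        # Derive num_hashes bit positions by salting a SHA-256 hash of the pair.
        key = f"{doc_id}\x00{rev}".encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, doc_id, rev):
        # Record an incoming edit. Bits are only ever set, never cleared.
        for pos in self._positions(doc_id, rev):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, doc_id, rev):
        # False means "definitely never added"; True means "possibly added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(doc_id, rev)
        )


def partition_changes(source_changes, target_filter):
    """Split source edits into (definitely_missing, maybe_present) lists."""
    missing, maybe = [], []
    for doc_id, rev in source_changes:
        if target_filter.might_contain(doc_id, rev):
            maybe.append((doc_id, rev))    # still needs a real revision check
        else:
            missing.append((doc_id, rev))  # safe to send without asking
    return missing, maybe


if __name__ == "__main__":
    seen = RevBloomFilter()
    seen.add("doc1", "1-abc")
    changes = [("doc1", "1-abc"), ("doc2", "1-def")]
    missing, maybe = partition_changes(changes, seen)
    print("send directly:", missing)
    print("verify first:", maybe)
```

Because the filter only ever gains bits, it fits the observation that
replication never removes anything from a history.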

Assuming you update that filter in real time and can update
in-progress replications, you should be able to get interesting patterns
of edits moving through a cluster.

Or something to that effect.

