couchdb-user mailing list archives

From Paul Davis <>
Subject Re: massive replication?
Date Mon, 26 Oct 2009 15:57:39 GMT

On the one hand, it sounds like you could solve this with XMPP and
use CouchDB as a backing store. People have already connected RabbitMQ
and CouchDB, I can't imagine that connecting ejabberd and CouchDB
would be much harder. The pubsub extensions could very much be what
you're wanting.

Though, that doesn't let you write awesome distributed routing
protocols. So, really, what fun could that be?

I'm basically thinking something along the lines of:

1. Initial peer discovery is seeded or auto discovered from multicast
messages. Or some combination thereof depending on your network
2. Nodes gossip information they've found through replication with
nodes they know about. Something like: create a database that contains
network connections and replicate it alongside your data. A
view on this database would calculate each subscription endpoint's
highest known update_seq and which node has it.
3. Each subscription endpoint is a database that can be replicated.
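The gossip reduction in step 2 could be sketched roughly like this in Python (the doc shape with `endpoint`, `update_seq`, and `node` fields is made up for illustration; in practice this would be a CouchDB map/reduce view over the gossip database):

```python
def highest_seqs(gossip_docs):
    """Reduce gossip docs to {endpoint: (highest update_seq, node that has it)}.

    Mirrors what the proposed view would compute: for each subscription
    endpoint, the highest update_seq anyone has reported, and who has it.
    Field names are illustrative, not an actual CouchDB schema.
    """
    best = {}
    for doc in gossip_docs:
        ep, seq, node = doc["endpoint"], doc["update_seq"], doc["node"]
        if ep not in best or seq > best[ep][0]:
            best[ep] = (seq, node)
    return best
```

A node would then pick replication sources by comparing its own update_seq per endpoint against this table.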

There are at least two hard parts I can spot right now:

1. Replication currently has no hook for detection. We don't
expose a list of _local/docs to anything, so detecting when a node is
replicating to us would be harder. Adding a custom patch to alert a
db_update_notification process on _local/doc updates would be fairly
straightforward, though.

2. The fun part is the replication strategy: trying to maximize network
throughput, avoid stampeding a single updater, and so on. There's enough
literature on this to get a general idea of where to go, but it'd
still probably take some tuning and experimentation. With plenty
of monitoring and some foresight, giving yourself the ability to
tweak system parameters and measure the effects would be quite a lot
of fun. ... Holy crap, I'm contemplating writing the basic routing
logic in Python/Ruby and replicating it out to the network; then, as
each node gets a copy, the router replaces its logic dynamically. Now
that'd be just plain awesome.
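A toy sketch of that "router logic shipped as a document" idea, with an entirely made-up doc shape (a `source` field holding the routing function's code). Obviously exec-ing replicated code raises trust questions this sketch ignores:

```python
class Router:
    """Swaps its live routing function whenever a new replicated
    logic document arrives. Purely illustrative."""

    def __init__(self):
        self.route = lambda msg: []  # default: route nowhere

    def load_from_doc(self, doc):
        # Compile the doc's 'source' field and install the `route`
        # function it defines as the active routing logic.
        namespace = {}
        exec(doc["source"], namespace)
        self.route = namespace["route"]
```

A change-notification hook (as in hard part 1) would call `load_from_doc` each time the logic document replicates in.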

Note that this sort of protocol doesn't attempt to reduce bandwidth
usage by multicasting documents, which may be a non-starter depending
on your topology and characteristics. Instead, the idea is to figure
out how to play with your network topology: react to and match your
physical link network, yet with enough redundancy that it doesn't break
as physical links come and go.

Granted, that's all off the top of my head, so until I sit down and
implement that on a couple hundred nodes it may well be a bunch of hot
air. :)

Paul Davis

On Mon, Oct 26, 2009 at 11:33 AM, Miles Fidelman
<> wrote:
> Adam Kocoloski wrote:
>> On Oct 26, 2009, at 10:45 AM, Miles Fidelman wrote:
>>> The environment we're looking at is more of a mesh where connectivity is
>>> coming up and down - think mobile ad hoc networks.
>>> I like the idea of a replication bus, perhaps using something like spread
>>> ( or spines ( as a multi-cast fabric.
>>> I'm thinking something like continuous replication - but where the
>>> updates are pushed to a multi-cast port rather than to a specific node, with
>>> each node subscribing to update feeds.
>>> Anybody have any thoughts on how that would play with the current
>>> replication and conflict resolution schemes?
>>> Miles Fidelman
>> Hi Miles, this sounds like really cool stuff.  Caveat: I have no
>> experience using Spread/Spines and very little experience with IP
>> multicasting, which I guess is what those tools try to reproduce in
>> internet-like environments.  So bear with me if I ask stupid questions.
>> 1) Would the CouchDB servers be responsible for error detection and
>> correction?  I imagine that complicates matters considerably, but it
>> wouldn't be impossible.
> Good question.  I hadn't quite thought that far ahead.  I think the basic
> answer is no (assume reliable multicast), but... some kind of healing
> mechanism would probably be required (see below).
>> 2) When these CouchDB servers drop off for an extended period and then
>> rejoin, how do they subscribe to the update feed from the replication bus at
>> a particular sequence?  This is really the key element of the setup.  When I
>> think of multicasting I think of video feeds and such, where if you drop off
>> and rejoin you don't care about the old stuff you missed.  That's not the
>> case here.  Does the bus store all this old feed data?
> Think of something like RSS, but with distributed infrastructure.
> A node would publish an update to a specific address (e.g., like publishing
> an RSS feed).
> All nodes would subscribe to the feed, and receive new messages in sequence.
>  When picking up updates, you ask for everything after a particular sequence
> number.  The update service maintains the data.
>> 3) Which steps of the replication do you envision using the replication
>> bus?  Just the _changes feed (essentially a list of docid:rev pairs) or the
>> actual documents themselves?
> Any change to a local copy of the database (i.e., everything).
>> The conflict resolution model shouldn't care about whether replication is
>> p2p or uses this bus.  Best,
> Thanks,
> Miles
> --
> In theory, there is no difference between theory and practice.
> In practice, there is.   .... Yogi Berra
