couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Questions about couchDB algorithms
Date Wed, 04 Feb 2009 14:38:06 GMT
On Wed, Feb 4, 2009 at 9:18 AM, Alessio Pace <alessio.pace@gmail.com> wrote:
> Hi,
>
> On Wed, Feb 4, 2009 at 2:59 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>
>> On Wed, Feb 4, 2009 at 4:49 AM, Alessio Pace <alessio.pace@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > I have just discovered CouchDB and I'm very interested in knowing more
>> > about its internal details, because apart from reading that it is
>> > multi-master and that the system is eventually consistent, I don't see
>> > much information about various other key design points (I apologize if
>> > I wasn't able to find them), like:
>> >
>> > - update propagation through gossiping: based on any known algorithm?
>> >
>>
>> The current state of affairs in terms of keeping multiple nodes in
>> sync uses the baked in replication mechanism. At the moment it is up
>> to the user to ensure that nodes are kept in sync via these
>> facilities.
>
>
> Could you elaborate a bit more on this?
>

Hmm. There doesn't appear to be much in the way of documentation on
replication beyond this:

http://wiki.apache.org/couchdb/Frequently_asked_questions#how_replication

Replication is an asynchronous mechanism, triggered on request, that
ensures all updates on node A also reach node B (assuming replication
from A -> B). It's incremental, so replicating repeatedly only sends
the changes that are new since the previous run.
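
To make "incremental" concrete, here is a minimal toy model in Python.
It is not CouchDB's actual implementation: the names Node, replicate
and checkpoints are made up for illustration, and revisions and
conflicts are ignored entirely. It only shows the idea of remembering
how far a previous A -> B replication got and sending just the newer
changes.

```python
class Node:
    """Toy in-memory database with a monotonically increasing update sequence."""

    def __init__(self, name):
        self.name = name
        self.update_seq = 0
        self.docs = {}      # doc id -> current value
        self.changes = []   # (seq, doc_id, value) tuples, in update order

    def put(self, doc_id, value):
        """Store a document and record the update in the change log."""
        self.update_seq += 1
        self.docs[doc_id] = value
        self.changes.append((self.update_seq, doc_id, value))


def replicate(source, target, checkpoints):
    """Copy every change on source newer than the last checkpoint to target.

    checkpoints maps (source name, target name) to the last source
    sequence that was replicated. Returns the number of changes sent.
    """
    key = (source.name, target.name)
    since = checkpoints.get(key, 0)
    sent = 0
    for seq, doc_id, value in source.changes:
        if seq > since:
            # Toy simplification: overwrite directly, no conflict handling.
            target.docs[doc_id] = value
            sent += 1
    checkpoints[key] = source.update_seq
    return sent


a, b = Node("A"), Node("B")
checkpoints = {}
a.put("doc1", "v1")
a.put("doc2", "v1")
first = replicate(a, b, checkpoints)    # sends both changes
second = replicate(a, b, checkpoints)   # nothing new, sends nothing
a.put("doc1", "v2")
third = replicate(a, b, checkpoints)    # sends only the one new change
```

A node that was offline for a while is just the case where the gap
between its checkpoint and the source's current sequence is large.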

>
>> There is quite a bit of active development on this front
>> so it's best to stay tuned and see what comes out.
>>
>> > - group membership among the various sites: how is it done, through
>> > gossiping? If so, based on any known algorithm?
>> >
>>
>> Not sure what you mean here.
>
>
> I mean: how does it deal with dynamic networks, where nodes join and leave,
> and you have to know which nodes you can push to or pull from? I am
> obviously talking about cases in which you can't list the few cluster
> machines a priori in a text file and copy it to all the machines.
>

Replication is asynchronous and triggered. There are no persistent
connections or anything of that nature. If every node used pull
replication, there would be no issues, other than that a node
rejoining the network may need a bit of time to catch up with the
current state of things.

>
>>
>>
>> > - are sites required to accept incoming connections (-> can I
>> > replicate on nodes behind NAT?)
>> >
>>
>> Replication is both push and pull. So you just need to be able to have
>> your nodes behind NAT know when to trigger replication.
>
>
> You mean that if a public node generates an update, it can't be replicated
> to a natted target node unless it pulls the replication itself?
>

Either that or you'll need to set up port forwarding, as with all things NAT.
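
As a rough sketch of the pull case: the node behind NAT asks its own
local CouchDB to replicate from the public node, so only an outgoing
connection is needed. The example below builds a POST to the
/_replicate resource with a "source" and "target" in the JSON body.
The host names, the database name "mydb" and the helper function names
are assumptions made up for illustration; adjust them to your setup.

```python
import json
import urllib.request


def build_pull_request(local_couch, remote_db_url, local_db):
    """Build a POST to the local node's /_replicate resource.

    The source is the remote (public) database URL and the target is a
    database on the local node, so the local node does the pulling.
    """
    body = json.dumps({"source": remote_db_url, "target": local_db})
    return urllib.request.Request(
        local_couch.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_pull(local_couch, remote_db_url, local_db):
    """Send the request and return the decoded JSON response."""
    request = build_pull_request(local_couch, remote_db_url, local_db)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical addresses: run this on the node behind NAT.
request = build_pull_request(
    "http://127.0.0.1:5984",
    "http://public.example.com:5984/mydb",
    "mydb",
)
```

Calling trigger_pull with the same arguments from a cron job or
similar is one way for the NATed node to decide when replication
happens.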

>
> Thank you.
>
> Best regards,
> --
> Alessio Pace
>

HTH,
Paul Davis
