couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Questions about couchDB algorithms
Date Wed, 04 Feb 2009 15:02:43 GMT
Don't forget the link that Ulises posted:

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

There's a nice description of Replication there too.

On Wed, Feb 4, 2009 at 9:46 AM, Alessio Pace <alessio.pace@gmail.com> wrote:
> Hi,
>
> On Wed, Feb 4, 2009 at 3:38 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>
>> On Wed, Feb 4, 2009 at 9:18 AM, Alessio Pace <alessio.pace@gmail.com>
>> wrote:
>> > Hi,
>> >
>> > On Wed, Feb 4, 2009 at 2:59 PM, Paul Davis <paul.joseph.davis@gmail.com>
>> > wrote:
>> >
>> >> On Wed, Feb 4, 2009 at 4:49 AM, Alessio Pace <alessio.pace@gmail.com>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > I have just discovered CouchDB and I'm very interested in knowing more
>> >> > about its internal details. Apart from reading that it is multi-master
>> >> > and that the system is eventually consistent, I don't see much
>> >> > information about other key design points (I apologize if I wasn't able
>> >> > to find it), such as:
>> >> >
>> >> > - update propagation through gossiping: based on any known algorithm?
>> >> >
>> >>
>> >> Currently, keeping multiple nodes in sync relies on the built-in
>> >> replication mechanism. At the moment it is up to the user to ensure
>> >> that nodes are kept in sync via these facilities.
>> >
>> >
>> > Could you elaborate a bit on this?
>> >
>>
>> Hmm. There doesn't appear to be much in the way of documentation on
>> replication beyond this:
>>
>> http://wiki.apache.org/couchdb/Frequently_asked_questions#how_replication
>
>
> Yes, I saw that; unfortunately there is not much. I would have liked some
> more detailed information on how replication is done.
>
>
>>
>>
>> Replication is basically a triggered async mechanism for ensuring that
>> all updates on node A are on node B (assuming replication from A -> B
>> obviously). It's incremental in operation, so repeatedly replicating
>> will only send new changes etc.
>>
>> >
>> >> There is quite a bit of active development on this front
>> >> so it's best to stay tuned and see what comes out.
>> >>
>> >> > - group membership among the various sites: how is it done, through
>> >> > gossiping? If so, based on any known algorithm?
>> >> >
>> >>
>> >> Not sure what you mean here.
>> >
>> >
>> > I mean: how does it deal with dynamic networks, where nodes join and
>> > leave, and you have to know which nodes you can push to or pull from?
>> > I am obviously talking about cases in which you can't list the few
>> > cluster machines a priori in a text file and copy it to all the machines.
>> >
>>
>> Replication is asynchronous and triggered. There are no constant
>> connections or anything of that nature. If you did pull replication
>> for everything then there'd be no issues other than on rejoining the
>> network a node may require a bit of time to catch up with the current
>> state of things.
>>
>> >
>> >>
>> >>
>> >> > - are sites required to accept incoming connections (-> can I
>> >> > replicate on nodes behind NAT?)
>> >> >
>> >>
>> >> Replication is both push and pull. So you just need to be able to have
>> >> your nodes behind NAT know when to trigger replication.
>> >
>> >
>> > You mean that if a public node generates an update, it can't be
>> > replicated to a NATed target node unless that node pulls the
>> > replication itself?
>> >
>>
>> Either that or you'll need to set up port forwarding, as with all things NAT.
>
>
>
> Yes, unless other less straightforward techniques are employed.
>
> Thanks.
> Regards,
> --
> Alessio Pace
>
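
To make the push/pull distinction from the thread concrete, here is a minimal sketch of how a node behind NAT might trigger replication itself. It assumes CouchDB's HTTP replication endpoint (a POST to /_replicate with a JSON body naming "source" and "target"). The host names, the database name "mydb", and the helper function names are hypothetical and used only for illustration. The requests are built but not sent, so nothing here needs a running server.

```python
import json
import urllib.request


def build_replication_request(couch_url, source, target):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = json.dumps({"source": source, "target": target}).encode("utf-8")
    return urllib.request.Request(
        couch_url.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_replication(couch_url, source, target, timeout=30):
    """Send the request and return CouchDB's decoded JSON response."""
    request = build_replication_request(couch_url, source, target)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Hypothetical addresses: a local node behind NAT and a publicly reachable node.
LOCAL = "http://127.0.0.1:5984"
PUBLIC_DB = "http://public.example.org:5984/mydb"

# Pull: the NATed node asks its own CouchDB to fetch changes from the public node.
pull = build_replication_request(LOCAL, source=PUBLIC_DB, target="mydb")

# Push: the same node sends its local changes out to the public node.
push = build_replication_request(LOCAL, source="mydb", target=PUBLIC_DB)
```

In both cases the connection is opened from the NATed side, so the public node never has to reach into the private network. Because replication is triggered rather than continuous, something on the NATed node (a cron job, for example) would have to call trigger_replication periodically.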
