couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Miles Fidelman <mfidel...@meetinghouse.net>
Subject Re: massive replication?
Date Mon, 26 Oct 2009 18:21:58 GMT
Paul,

I'm actually suggesting using NNTP infrastructure, or something like it, 
to propagate updates - rather than trying to reinvent the protocol and 
supporting infrastructure in CouchDB. 

Something like: 

change -> local CouchDB instance -<continuous replication (output)> -> 
local NNTP daemon (specific newsgroup) -> "the net"

"the net" -> local NNTP daemon (specific newsgroup) -> <continuous 
replication (input) -> local CouchDB instance

All messages eventually get to all Couch instances - but the order and 
delay can vary considerably.

Miles


Paul Davis wrote:
> Miles,
>
> This sounds like what I was trying to propose. More concretely:
>
>   
>> I keep coming back to NNTP (USENET) as a model for many-to-many messaging:
>>
>> - push a message into a newsgroup on any NNTP node subscribing to that
>> newsgroup
>>     
>
> A message group is persisted in a DB. In the future, a single db with
> filtered replication could work, but a db per group will probably be
> easier. Not all nodes have all db's.
>
>   
>> - nodes exchange "I-have"/"You-have" on a regular basis
>>     
>
> Replication of the "network status db" that holds what nodes have what
> update_seq/host pairs. The pairing is important. The rate of
> replication obviously affects delays and what not.
>
>   
>> - message propagate to all subscribing nodes by essentially a flooding or
>> epidemic routing mechanism
>>     
>
> This is the fun part. Given your local copy of a group, how do you
> pick a peer to pull from. Or push to if you're feeling proactive. Do
> we sync in both directions, etc etc. Depending on the behavior this
> could manifest in multiple ways. If you have a single authority per
> 'subscription source' then you'd want to keep
> "authority/update_seq/located_at" triples. This way you know that if
> the largest update_seq seen is N, then you can replicate from any node
> that has that sequence.
>
>   
>> - pretty quickly, a message propagates to all nodes subscribing to the
>> newsgroup
>>     
>
> Message delivery via replication. Judging the quickly part would
> require a more formal definition of quickly and then building the
> thing. Depending on your definition of quickly there are definitely
> different types of design decisions to be made.
>
>   
>> - lack of connectivity simply delays message propagation
>>     
>
> Or reroutes through an alternate node. If we know that four nodes have
> a copy of the db, we can replicate from any that are alive.
> Replication is incremental, a link can disappear in the middle of a
> replication and it'll resume from the previous check point.
>
>   
>> - the whole system scales massively, and is very robust in the face of
>> connectivity outages, node failures, etc. (messages can flow across multiple
>> routes)
>>     
>
> It scales on my brain debugger assuming the propagation algorithm
> isn't extremely naïve. But as I said, I haven't built it yet so who
> knows.
>
>   
>> In some sense, what I'm thinking of would look a lot like:
>>
>> - a group of CouchDB nodes all subscribe to a newsgroup
>>     
>
> I'm confused by the term subscription here. Generally I'd assume that
> means that a foreign host knows that the local node is interested in
> something. For fully distributed awesome, I think it'd be better
> phrased as "nodes can declare what they want" and the algorithm will
> attempt to maintain their local state somewhere close to the global
> state. Kind of like the difference between newspaper delivery and
> buying a newspaper at any one of the many shops in town.
>
>   
>> - each node publishes changes as messages to that newsgroup
>>     
>
> $ curl -X PUT -d '{"message": "First post!!1!1"}'
> http://127.0.0.1:5984/newsgroup/memesrus
>
> Part of the replication routing could include a proactive step in
> pushing messages to nodes it  knows care about a message.
>
>   
>> - NNTP takes care of getting messages everywhere, eventually
>>     
>
> The 'network state in a db' means that regardless of who's interested
> in what, if we make sure that the first step in node communication is
> a 'i know about these endpoints in these states' then we'll push
> information to the people that care. Eventually.
>
>   
>> - each node looks for incoming messages and applies them as changes
>>     
>
> Replication FTW!
>
>   
>> - use a shared key to secure things (note: some implementations of NNTP
>> already support secure messaging)
>>     
>
> This is slightly harder. Read only means you'd need something infront
> of couch to prevent readers. But OAuth or distributing SSL certs or
> similar wouldn't be out of the question.
>
>   
>> A similar approach could be taken using:
>> - a distributed hash table as a message que (that's what spread and splines
>> seem to do)
>>     
>
> I haven't looked to hard at these, but "DHT as message queue" seems
> contradictory to the idea that most DHT's (all?) that I know of don't
> allow range queries.
>
>   
>> - the DIS or HLA protocols (used for distributed simulation - keeping
>> multiple copies of a "world" synchronized)
>>     
>
> I couldn't get past the first wiki page Google gave me, so, I dunno.
>
> HTH,
> Paul Davis
>   


-- 
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra



Mime
View raw message