couchdb-dev mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: documentation of replication protocol?
Date Wed, 21 Apr 2010 13:46:06 GMT
On Apr 21, 2010, at 12:05 AM, J Chris Anderson wrote:

> 
> On Apr 20, 2010, at 7:29 PM, Miles Fidelman wrote:
> 
>> Hi Folks,
>> 
>> I've been looking, but can't seem to find any good documentation of the inter-node
>> protocol used for replication.
>> 
>> I've been thinking of playing with a multi-cast alternative to the current pair-wise
>> replication model - but, of course, that's hard to do without visibility into the format of
>> the messages exchanged during replication.
>> 
>> Miles Fidelman
> 
> As far as I know, the best source for documentation is the code, right now.
> 
> My reservation about the replication protocol is that it is more brittle than JSON (it
> requires some exact string matches in the source). With an event-based JSON parser, we could
> accept any valid JSON instead of hard coding the output of replicators.
> 
> One thing that strikes me is that if we had a browser-based test for the replicator protocol,
> we could clean this up substantially. This test suite would be a great contribution from anyone
> out there wanting to learn the replicator really well, but you might need to collaborate with
> someone to help get the tests to pass, in places.
> 
> This is the hard coding (in Ruby) I had to add, to use the CouchDB replicator to pull
> from the Booth server:
> 
> http://github.com/jchris/booth/commit/2deff74e03838a6e7ef95b725c4342a08239a2b8#commitcomment-68685
> 
> This is fine if we're just trying to replicate between CouchDB instances, but a challenge
> for people building interoperable replicators.
> 
> Chris

Hi Chris, I need a little clarification here.  Was the hack on line 57 the specific placement
of newlines, the ordering of fields in the JSON Object, or something else?

The CouchDB replicator does use a regular expression to split the _changes feed into individual
events.  If you're talking about the need for newlines in between events, yes, that was a
silly oversight on our part, and a simple bugfix.
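
To illustrate what splitting events without depending on newlines could look like, here
is a minimal Python sketch. It is not the replicator's actual code (which is Erlang and
regex-based); the function name iter_events and the sample feed are made up for
illustration, and it assumes the feed is a sequence of JSON objects, one per event.

```python
import json

def iter_events(buffer):
    """Yield each JSON value in `buffer`, whatever whitespace separates them.

    Hypothetical helper: it lets the JSON decoder find the end of each event
    instead of assuming that a newline sits between events.
    """
    decoder = json.JSONDecoder()
    pos, end = 0, len(buffer)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and buffer[pos].isspace():
            pos += 1
        if pos >= end:
            break
        event, pos = decoder.raw_decode(buffer, pos)
        yield event

# Made-up sample: the first two events have no newline between them.
feed = (
    '{"seq":1,"id":"a","changes":[{"rev":"1-x"}]}'
    '{"seq":2,"id":"b","changes":[{"rev":"1-y"}]}\n'
    '{"seq":3,"id":"c","changes":[{"rev":"2-z"}]}'
)
events = list(iter_events(feed))
```

A producer that omits or adds whitespace between events is handled the same way, since
only the JSON grammar decides where one event ends.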

The requirement for "last_seq" to appear after "results" in the object is also a simple thing
to fix.  There's no good reason for the replication protocol to be more brittle than JSON.
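
As a rough sketch of what "no more brittle than JSON" means for field ordering: a
consumer that parses the whole response body as JSON sees the same data whether
"last_seq" comes before or after "results". The helper name read_changes and both
sample bodies are hypothetical, and this is Python rather than the replicator's Erlang.

```python
import json

def read_changes(body):
    """Return (results, last_seq) from a _changes-style response body.

    Hypothetical helper: keys are looked up by name, so their order and the
    whitespace around them do not matter.
    """
    doc = json.loads(body)
    return doc.get("results", []), doc.get("last_seq")

# Made-up bodies carrying the same data with different key order and layout.
results_first = '{"results":[{"seq":1,"id":"a","changes":[{"rev":"1-x"}]}],"last_seq":1}'
last_seq_first = '{"last_seq": 1,\n "results": [ {"id":"a","seq":1,"changes":[{"rev":"1-x"}]} ]}'

same = read_changes(results_first) == read_changes(last_seq_first)
```

Parsing the full body trades streaming for simplicity; an event-based parser, as Chris
suggests, would give the same order independence without buffering the whole response.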

Adam