couchdb-dev mailing list archives

From Jens Alfke <j...@couchbase.com>
Subject Re: [REVIEW] CouchDB Replication Protocol
Date Tue, 12 Nov 2013 01:05:56 GMT
Thanks, Alexander.

How would you like us to edit the source? I’ve never worked with collaborative editing using
gists. Does it work like source editing, where I fork it and send you a pull request? Overall,
the document needs a lot of proofreading for English grammar.

In general I have a feeling this whole process is a case of “the best being the enemy of
the good”. Something short like my original document would have been fine, IMHO. It’s
nice that you put so much effort into this, but it’s now a very long document that’s going
to take a long time to review.


If you add section numbers, it would be easier to discuss the document since we could refer
to a specific section unambiguously.

Diagrams: Will these be redone as graphics? The ASCII art is kind of hard to read.

Sample REST request/responses: These have some very long lines that aren’t wrapped in the
HTML (at least not in my browser, Safari 7). That makes them hard to read.

In the “Verify Peers” section:
You’re making the assumption that the local database is accessible via HTTP using the CouchDB-compatible
REST API, which isn’t necessarily true. It’s only true of Couchbase Lite if optional components
are installed. I don’t know if it’s true of PouchDB. In general, an embedded database
will be using a more direct native API to provide access to local data.

In the “Get Peers Information” section:
It’s not explained why the “instance_start_time” field is needed or what it’s used
for. (I am still not sure why it’s needed; I just found by trial and error that CouchDB
expected it.)
The description of what “update_seq” is for is vague; I couldn’t figure out what you
meant. Also, its value will not necessarily be a number — in BigCouch or Cloudant it’s
an opaque string, I think.
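
To illustrate the point about “update_seq”, here is a minimal Python sketch of a client that stores the value exactly as received instead of coercing it to an integer. The function name `parse_update_seq` and the sample string value are hypothetical, chosen only for illustration; the sketch assumes only that the database information response is a JSON object with an “update_seq” field.

```python
import json
from typing import Union

# A sequence value may be a number or an opaque string, depending on the
# server. This alias is an assumption of the sketch, not part of any spec.
SequenceId = Union[int, str]


def parse_update_seq(db_info_json: str) -> SequenceId:
    """Return update_seq exactly as the server sent it, without coercion."""
    info = json.loads(db_info_json)
    return info["update_seq"]


# A numeric sequence and a (made-up) opaque string sequence both pass through.
numeric_seq = parse_update_seq('{"update_seq": 42}')
opaque_seq = parse_update_seq('{"update_seq": "42-g1AAAA"}')
```

A client written this way never does arithmetic or ordering comparisons on the value; it only hands it back to the server that produced it.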

In the “Retrieve Replication Logs” section:
The format of these logs is implementation-specific, since the only thing that ever reads
a log is the replicator that created it. You’ve described the format CouchDB uses, but for
instance the data stored by Couchbase Lite is entirely different.
I think it’s still useful to show what CouchDB stores, but you don’t need to specify it
in such detail, and it should be prefaced as being only one possible way to store the data.

“Listen Changes Feed”:
• The ‘feed’ parameter does not have to be ‘continuous’. This parameter has nothing
to do with continuous replication; it’s just a similar name. For instance, Couchbase Lite
only uses ‘longpoll’ since the continuous feed doesn’t work reliably with telco proxy
servers.
• Does the CouchDB replicator really use a heartbeat of 10 seconds?! That seems really short,
and will generate a lot of traffic for a server with a large number of replication clients.
(10000 mobile clients connected = 1000 heartbeats sent per second. Yikes!)
• The ‘since’ parameter shouldn’t be set to 0. If you don’t have a value, just omit
it. Again, not all servers use numeric values.
• “Reading whole feed with single shot may be not resource optimal solution and it is RECOMMENDED
to process feed by chunks.” This is an implementation specific detail; it might be true
of CouchDB but isn’t of other implementations. I don’t think it’s appropriate for this
document. If you’re talking about chunks for the _revs_diff request that follows, that’s
independent of the changes feed.
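
The points above about ‘feed’, ‘since’, and the heartbeat can be sketched in Python as follows. This is only an illustration under stated assumptions: `build_changes_query` and `heartbeats_per_second` are hypothetical helper names, the heartbeat is assumed to be given in milliseconds as CouchDB's _changes API expects, and the ‘since’ value is passed through as an opaque string rather than a number.

```python
from typing import Optional, Union
from urllib.parse import urlencode


def build_changes_query(
    feed: str = "longpoll",
    since: Optional[Union[int, str]] = None,
    heartbeat_ms: Optional[int] = None,
) -> str:
    """Build a query string for a _changes request (hypothetical helper)."""
    params = {"feed": feed}
    # With no checkpoint, omit 'since' entirely instead of sending 0.
    if since is not None:
        # Pass the value through unchanged; it may be an opaque string.
        params["since"] = str(since)
    if heartbeat_ms is not None:
        params["heartbeat"] = str(heartbeat_ms)
    return urlencode(params)


def heartbeats_per_second(clients: int, interval_seconds: float) -> float:
    """Aggregate heartbeat rate a server sends across all connected clients."""
    return clients / interval_seconds


first_sync = build_changes_query()                # no 'since' parameter at all
resumed = build_changes_query(since="12-abc")     # made-up opaque checkpoint
load = heartbeats_per_second(10000, 10)           # the example from the email
```

With 10000 clients and a 10 second interval, `heartbeats_per_second` gives the 1000 heartbeats per second mentioned above.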

…ok, that’s as much as I have time for today!

—Jens
