couchdb-user mailing list archives

From Randall Leeds <randall.le...@gmail.com>
Subject Re: Replication Chatter / Recovery
Date Thu, 24 Jun 2010 15:43:49 GMT
Hey Cory,
Sounds like a really interesting project. This is the kind of creative use
case couch should be well suited to.

Replication makes frequent checkpoints, so if the connection drops it picks
up from the last checkpoint rather than starting over, and there should be
little duplicate data transfer.
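
For example, something like this (a rough, untested sketch; the URLs and
database names are placeholders) kicks off a pull replication through
couch's /_replicate endpoint, and re-posting the same source/target pair
after a dropped connection resumes from the last checkpoint:

    import json
    import urllib.request

    def replicate(source, target, couch="http://localhost:5984"):
        # POST to the /_replicate endpoint; source and target can be
        # local database names or full URLs to a remote couch.
        payload = json.dumps({"source": source, "target": target}).encode()
        req = urllib.request.Request(
            couch + "/_replicate",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    # Re-running the same call after a failure picks up at the last
    # checkpoint instead of re-sending everything.
    replicate("http://remote.example.org:5984/health", "health")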

As far as overhead goes, there are only fixed per-document sources, so it
should be close to linear in the number of documents rather than the number
of bytes. The exact constant depends on whether you use a push or a pull
replication.
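
If it helps with budgeting, a back-of-envelope estimate would look
something like the sketch below. The per-document overhead there is a
made-up placeholder you'd want to measure and replace, not a number I
have handy:

    PER_DOC_OVERHEAD = 500  # bytes per document (assumed; measure this)

    def estimate_wire_bytes(total_doc_bytes, num_docs):
        # Wire traffic ~= payload plus a fixed cost per document.
        return total_doc_bytes + num_docs * PER_DOC_OVERHEAD

    # e.g. 1000 docs averaging 2 KB each:
    print(estimate_wire_bytes(1000 * 2048, 1000))  # -> 2548000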

A proxy on each end that (de)compresses the HTTP bodies should be simple to
set up. I don't think couch supports compression for replication itself
yet, but it's a reasonable request if you'd like to file a ticket for it.
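
As a very rough sketch of the compressing side (untested, GET-only, and
assuming a couch on localhost:5984; none of this is built into couch), with
a matching proxy on the other end gunzipping bodies before they reach its
couch:

    import gzip
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    UPSTREAM = "http://localhost:5984"  # assumed local couch

    class CompressingProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            # Fetch from the local couch, then gzip the body before
            # it crosses the slow link.
            with urllib.request.urlopen(UPSTREAM + self.path) as upstream:
                body = gzip.compress(upstream.read())
                status = upstream.status
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Encoding", "gzip")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), CompressingProxy).serve_forever()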

If you'd like to get couch set up, I'm sure folks on this list would be
happy to help you capture and measure the replication overhead, if no one
has done so already.

Regards,
Randall

On Jun 24, 2010 9:05 AM, "Cory Zue" <czue@dimagi.com> wrote:

Hi there,

My team is designing a distributed health data capture system to be
used in rural Africa, and we are planning to use CouchDB as a back end
for its excellent replication features.

One concern I had was how the replication would perform over a very
unreliable internet connection.  Is replication done in pieces, or does
it require a large amount of data to make it through in a single
attempt?  If the connection goes down in the middle of replication, do
you have to start over from the beginning, or is it smart enough to
recover what has already made it across the wire?

Also, are there any numbers I can get on how chatty replication is?
Our system will likely be deployed with post-paid SIM cards and GSM
modems providing the internet connection in many sites, so I would
like to be able to get a rough estimate of data usage.  Is there any
formula I could use, such as "syncing X bytes of data in couch causes
K * X bytes to go over the wire (where K is some overhead amount)".
Seeing how JSON probably compresses quite well, is there any way to do
compressed synchronization?

thanks in advance,
Cory
