couchdb-user mailing list archives

From Chris Stockton <chrisstockto...@gmail.com>
Subject Thoughts on server wide replication
Date Wed, 25 May 2011 19:23:29 GMT
I was thinking that if there were server wide replication, we could
support many more users. Currently we are at a few thousand, and we are
starting to feel the expense of all of the TCP connections and
replication tasks. The calls to status to monitor that they are
running, etc., are getting very expensive and noticeable.

It would seem to me that an API for server wide replication would
greatly benefit our use patterns, and I'm sure those of anyone else who
scales through many databases (one database is one customer).

Here are a few ideas for such a feature. I'm throwing this out here just
to see if it sparks interest.

We will call this API _replicate_server for example purposes; the name
could be subject to discussion.

To begin server wide replication:
  curl -vX POST http://localhost:5984/_replicate_server -d \
    '{"source":"example-database","target":"http://example.org/example-database"}'
    -> {"ok": true, <... other details>}

To begin server wide replication with a filtering function: here maybe
the function can return FALSE to not replicate, TRUE to replicate, or
an array of filters to replicate through those filtering functions?
This could be simple or very robust.
  function(dbName, req) {
    // Replicate only databases whose name starts with the prefix.
    return dbName.indexOf("my_interesting_dbs_prefix") === 0;
  }

  curl -vX POST http://localhost:5984/_replicate_server -d \
    '{"source":"example-database","target":"http://example.org/example-database",
      "filter": "filters/server_filter"}'
    -> {"ok": true, <... other details>}
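To make the proposed return values concrete, here is a minimal sketch of how a server might interpret such a filter. Nothing here exists in couchdb: planReplication and the shape of its result are hypothetical names made up for illustration, and treating any falsy return as "skip" is an assumption.

```javascript
// Hypothetical server-level filter, as proposed above: replicate only
// databases whose name starts with the prefix.
function serverFilter(dbName, req) {
  return dbName.indexOf("my_interesting_dbs_prefix") === 0;
}

// Hypothetical helper (not a couchdb API): apply the filter to every
// database name and interpret the result.
//   falsy -> skip the database
//   true  -> replicate the whole database
//   array -> replicate through the named document filters
function planReplication(dbNames, filterFn, req) {
  const plan = [];
  for (const dbName of dbNames) {
    const result = filterFn(dbName, req);
    if (!result) continue;
    plan.push({
      db: dbName,
      filters: Array.isArray(result) ? result : []
    });
  }
  return plan;
}
```

Calling planReplication with a list of database names and serverFilter would keep only the prefixed databases, each with an empty filter list.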

To begin server wide replication for an array of dbs:
  curl -vX POST http://localhost:5984/_replicate_server -d \
    '{"source":"example-database","target":"http://example.org/example-database",
      "database_names": ["db_1", "db_2", ..., "db_3050"]}'
    -> {"ok": true, <... other details>}
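For comparison, this is roughly what the app tier has to do today for the same list of databases: build one request body for the existing _replicate endpoint per database. This is only a sketch. buildReplicationRequests and the targetBase parameter are hypothetical names, and it assumes every database keeps the same name on the target server.

```javascript
// Hypothetical helper: build one _replicate request body per database.
// Each body would be sent as its own POST to /_replicate, which is where
// the per-database connections and replication tasks come from.
function buildReplicationRequests(dbNames, targetBase) {
  // Drop trailing slashes so the joined URL has exactly one separator.
  const base = targetBase.replace(/\/+$/, "");
  return dbNames.map(function (dbName) {
    return {
      source: dbName,
      // Assumption: the target database has the same name as the source.
      target: base + "/" + encodeURIComponent(dbName),
      continuous: true
    };
  });
}
```

With a few thousand databases this produces a few thousand request bodies, each of which also has to be monitored separately.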

Other params for the request:
  "persistent": true|false - should this replication job persist
through a couchdb restart? Maybe this adds an entry to the config file
or something?
  "continuous": true|false - do a one time pass of all dbs or not.
Defaulting to true makes sense, but is inconsistent with _replicate.
Maybe just not support one time passes? My specific use cases don't
require it, but I don't want to speak only for myself.
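As a rough sketch of how those two params could be defaulted on the server side: normalizeOptions is a hypothetical name, the "continuous" default of true follows the suggestion above, and the "persistent" default of false is purely my assumption.

```javascript
// Hypothetical helper: fill in defaults for the proposed request params.
function normalizeOptions(body) {
  return {
    // Assumption: jobs do not survive a restart unless asked to.
    persistent: body.persistent === undefined ? false : Boolean(body.persistent),
    // Suggested default; note that _replicate does one time passes
    // unless continuous is requested.
    continuous: body.continuous === undefined ? true : Boolean(body.continuous)
  };
}
```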

Just some thoughts from my last 1-2 years or so of experience with
couchdb and my use patterns. If we could trim down and improve
replication usability a bit, I think couchdb could greatly benefit as a
project. Right now, having to tell replication to start, having to make
sure it runs on restart (I know changes of some sort are
coming/implemented for this), and monitoring your databases to make
sure they are up to date is just a bit too much for the app tier to do,
and I think it scares DBAs away from embracing the technology.

Overall I love couchdb. I find it to be a great product, and it has fit
our needs very well.

-Chris
