couchdb-dev mailing list archives

From Filipe David Manana <fdman...@gmail.com>
Subject _replicator DB
Date Wed, 19 May 2010 09:31:04 GMT
Dear all,

I've been working on the _replicator DB along with Chris. Some of you have
already heard about this DB on the mailing list, on IRC, or elsewhere. Its
purpose:

- replications can be started by adding a replication document to the
replicator DB _replicator (its name can be configured in the .ini files)

- replication documents are basically the same JSON structures that we
currently use when POSTing to _replicate/ (and we can give them an
arbitrary id)

- to cancel a replication, we simply delete the replication document

- after the replication is started, the replicator adds the field "state" to
the replication document with value "triggered"

- when the replication finishes (for non-continuous replications), the
replicator sets the doc's "state" field to "completed"

- if an error occurs during a replication, the corresponding replication
document will have the "state" field set to "error"

- after an error is detected, the replication is restarted after some time
(10s for now, but maybe it should be configurable)

- after a server restart/crash, CouchDB will remember replications and will
restart them (this is especially useful for continuous replications)

- in the replication document we can define a "user_ctx" property, which
defines the user name and/or role(s) under which the replication will
execute
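
To make the list above concrete, here is a minimal Python sketch of a
replication document and of the "state" values the replicator writes into it.
The "source", "target" and "continuous" fields are the ones already used when
POSTing to _replicate/. The helper names, the event names, and the exact shape
of "user_ctx" ("name" and "roles") are assumptions made for illustration; the
branch itself is the authority.

```python
import json


def build_replication_doc(source, target, continuous=False, user_ctx=None):
    """Build the JSON body of a replication document (hypothetical helper)."""
    doc = {"source": source, "target": target}
    if continuous:
        doc["continuous"] = True
    if user_ctx is not None:
        # Assumed shape: {"name": ..., "roles": [...]}
        doc["user_ctx"] = user_ctx
    return doc


# State changes described in the list above. The event names are invented
# for this sketch; only the state values come from the description.
TRANSITIONS = {
    (None, "started"): "triggered",
    ("triggered", "finished"): "completed",  # non-continuous replications only
    ("triggered", "failed"): "error",
    ("error", "restarted"): "triggered",     # after the retry delay (10s for now)
}


def next_state(current, event):
    """Return the new value of the "state" field, or raise on an invalid step."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError("no transition from %r on %r" % (current, event))


if __name__ == "__main__":
    doc = build_replication_doc(
        "http://example.org/db_a", "db_b",
        continuous=True,
        user_ctx={"name": "joe", "roles": ["writer"]},
    )
    # This body would be saved as a document in the _replicator DB.
    print(json.dumps(doc, indent=2))
```

Saving such a document in the _replicator DB starts the replication, and
deleting the document cancels it.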



Some restrictions regarding the _replicator DB:

- only server admins can add and delete replication documents

- only the replicator itself can update replication documents - this is to
avoid having race conditions between the replicator and server admins trying
to update replication documents

- the above point implies that to change a replication you have to add a new
replication document

All these restrictions are implemented in the replicator DB design doc:
http://github.com/fdmanana/couchdb/blob/replicator_db/src/couchdb/couch_def_js_funs.hrl<http://github.com/fdmanana/couchdb/blob/_replicator_db/src/couchdb/couch_def_js_funs.hrl>
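
For readers who do not want to dig through the design doc, the three
restrictions can be paraphrased in Python roughly as follows. This is only a
sketch of the rules as listed in this message, not a translation of the actual
validation function. The is_replicator flag is a hypothetical stand-in for
however the replicator identifies its own writes, and the check for the
"_admin" role assumes the usual way CouchDB marks server admins.

```python
class Forbidden(Exception):
    """Raised when a change to a replication document is not allowed."""


def validate_replication_doc_change(new_doc, old_doc, user_ctx,
                                    is_replicator=False):
    """Apply the _replicator DB restrictions to a proposed document change.

    new_doc  -- the document being written (has "_deleted": True on deletion)
    old_doc  -- the current revision, or None when the document is new
    user_ctx -- dict with the roles of the user making the request
    """
    if is_replicator:
        # Only the replicator itself may update replication documents.
        return

    if "_admin" not in user_ctx.get("roles", []):
        raise Forbidden(
            "only server admins can add and delete replication documents")

    deleting = bool(new_doc.get("_deleted"))
    if old_doc is not None and not deleting:
        # To change a replication, an admin has to add a new document.
        raise Forbidden(
            "only the replicator can update replication documents")
```

With these rules an admin can create or delete a document but never edit one
in place, which is what removes the race with the replicator's own updates.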


The code is fully working and is located at:
http://github.com/fdmanana/couchdb/tree/replicator_db

It includes a comprehensive JavaScript test case.

Feel free to try it and give your feedback. There are still some TODOs as
comments in the code, so it's still subject to changes.


For people more involved with CouchDB internals and development:

That branch breaks the stats.js test and, occasionally, the
delayed_commits.js tests.

It breaks stats.js because:

- internally CouchDB uses the _changes API to be aware of the
addition/update/deletion of replication documents to/from the _replicator
DB. The _changes implementation constantly opens and closes the DB (opens
are triggered by a gen_event). This affects the stats open_databases and
open_os_files.

It breaks delayed_commits.js occasionally because:

- listening to _replicator DB changes uses an extra file descriptor, which
counts against the max_open_dbs config parameter. This parameter is meant to
limit the number of DBs opened by users. This causes the error {error,
all_dbs_active} (from couch_server.erl) during the execution of
delayed_commits.js (as well as stats.js).

I also have another branch that fixes these issues in a "dirty" way:
http://github.com/fdmanana/couchdb/tree/_replicator_db  (has a big comment
in couch_server.erl explaining the hack)

Basically it doesn't increment stats for the _replicator DB, bypasses
max_open_dbs when opening the _replicator DB, and doesn't allow it to be
closed in favour of a user requested DB (as if it had assigned a +infinite LRU
time to this DB).
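
The idea behind the hack can be illustrated with a small Python model of an
open-DB table. It is not the Erlang code from couch_server.erl: the class and
method names are invented for this sketch, and the "busy" flag is a simplified
stand-in for a DB that still has clients and therefore cannot be closed.

```python
from collections import OrderedDict


class AllDbsActive(Exception):
    """Stands in for the {error, all_dbs_active} error mentioned above."""


class OpenDbTable:
    def __init__(self, max_open_dbs, pinned=("_replicator",)):
        self.max_open_dbs = max_open_dbs
        self.pinned = set(pinned)       # DBs exempt from the limit
        self.open_pinned = set()        # pinned DBs that are currently open
        self.user_dbs = OrderedDict()   # name -> busy flag, oldest first
        self.open_databases = 0         # stat counter, user DBs only

    def open(self, name, busy=True):
        if name in self.pinned:
            # Bypass max_open_dbs and the stats; never a candidate for closing.
            self.open_pinned.add(name)
            return
        if name in self.user_dbs:
            self.user_dbs[name] = busy
            self.user_dbs.move_to_end(name)  # mark as most recently used
            return
        if len(self.user_dbs) >= self.max_open_dbs:
            self._close_least_recently_used_idle()
        self.user_dbs[name] = busy
        self.open_databases += 1

    def release(self, name):
        """Mark a user DB as idle so that it may be closed later."""
        if name in self.user_dbs:
            self.user_dbs[name] = False

    def _close_least_recently_used_idle(self):
        for name, busy in self.user_dbs.items():
            if not busy:
                del self.user_dbs[name]
                self.open_databases -= 1
                return
        raise AllDbsActive("all_dbs_active")
```

In this model the pinned DB never takes a slot away from user DBs and never
shows up in the open_databases stat, which is why both tests pass on that
branch.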

Sometimes (although very rarely) I also get the all_dbs_active error when
the authentication handlers are executing (because they open the _users DB).
This is not caused by my _replicator DB code at all, since I get it with
trunk as well.

I would also like to collect feedback about what to do regarding these two
issues, especially max_open_dbs. Somehow I feel that no matter how many user
DBs are open, it should always be possible to open the _replicator DB
internally (and the _users DB).


cheers


-- 
Filipe David Manana,
fdmanana@gmail.com

"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."
