From Adam Kocoloski
Subject Re: silent view index file corruption
Date Tue, 13 Apr 2010 03:12:09 GMT
On Apr 7, 2010, at 2:05 PM, Robert Newson wrote:

> Being able to start a consistency checking task (that shows up in
> _active_tasks, etc) would be useful. I don't think it's something that
> ought to happen automatically, though. Possibly in the case where
> corruption is actually detected?
> I'd like;
> a) checksums on *everything* (don't care for crc32 vs fnv1a vs md5 vs
> sha1 debate, anything is massively better than nothing).
> b) ability to launch end-to-end verification of all checksums (with
> progress if possible).
> c) ability to store/retrieve verification checkpoints.
> Further to c), it would be useful to have API access to replication
> checkpoints. Specifically so that we can interrogate one couchdb
> instance and get some idea for how up to date it is with respect to
> its replicas (which may be unreachable or offline).

Yes, we certainly need more visibility into the replication checkpoint system.  I've been
thinking that the _replicator DB Chris has graciously volunteered to work on should automatically
add a checkpoint_id field or some such thing to documents that are saved in there.  That way
you can have all manner of nice views on the _replicator DB that identify the _local doc you
need to pull to inspect the current status of a replication.
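
A rough sketch of that lookup, in Python purely for illustration. The checkpoint_id field is the
hypothetical name floated above, not something the _replicator DB is known to write, and
checkpoint_doc_paths is a made-up helper. It assumes the field holds the id of the _local doc.

```python
from urllib.parse import quote


def checkpoint_doc_paths(replicator_docs):
    """Map each replication doc _id to the _local doc path for its checkpoint.

    Assumes every doc may carry a hypothetical "checkpoint_id" field;
    docs without it are skipped.
    """
    paths = {}
    for doc in replicator_docs:
        checkpoint_id = doc.get("checkpoint_id")  # hypothetical field name
        if checkpoint_id is None:
            continue
        # Percent-encode the id so it is safe to use as a URL path segment.
        paths[doc["_id"]] = "_local/" + quote(checkpoint_id, safe="")
    return paths
```

A view over the _replicator DB could emit the same (doc id, _local path) pairs server-side; the
function above only shows the shape of the mapping.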

Back to the corruption: after spending some time reviewing CouchDB's fsync settings, I'm
wondering why we don't fsync the view index file before writing a header.  I understand why
we don't do it after the header write -- we can always just reindex the last few updates --
but a pre-sync would preserve some write ordering (index data on disk before the header that
refers to it) and possibly prevent some sources of corruption.
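
To make the ordering concrete, here is a minimal sketch in Python (not CouchDB's Erlang code and
not its on-disk format). HEADER_MAGIC, the length+crc32 header layout, and append_with_presync
are all assumptions made up for the example. The only point is where the fsync sits.

```python
import os
import struct
import zlib

HEADER_MAGIC = b"HDR1"  # hypothetical marker, not CouchDB's real header format


def _write_all(fd, buf):
    """Write the whole buffer, looping in case os.write is partial."""
    view = memoryview(buf)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def append_with_presync(path, data, header_payload):
    """Append index data, fsync, then append a header describing it.

    Returns the total number of bytes appended.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        _write_all(fd, data)
        # Pre-sync: force the data to disk before any header can refer to it.
        os.fsync(fd)
        header = (
            HEADER_MAGIC
            + struct.pack(">II", len(header_payload), zlib.crc32(header_payload))
            + header_payload
        )
        _write_all(fd, header)
        # Deliberately no fsync here: if the header is lost in a crash,
        # the last few updates can simply be reindexed.
    finally:
        os.close(fd)
    return len(data) + len(header)
```

Without the fsync between the two writes, the OS is free to flush the header before the data, so
a crash could leave a valid-looking header pointing at blocks that never reached disk.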


