couchdb-dev mailing list archives

From Roger Binns <>
Subject Re: [jira] Commented: (COUCHDB-623) File format for views is space and time inefficient - use a better one
Date Wed, 13 Jan 2010 22:54:29 GMT

Chris Anderson wrote:
> see: for Banking
> without MVCC views, there's no way to query accurately at all when
> inserts are underway (short of blocking reads during writes).

I am afraid I do not understand what you are saying.  Sure, the scheme listed
in the book makes sense, but only if a transaction maps exactly to one
document (which I guess is the point).  Even then I still don't see the
relevance.  Things would only break down if the view returned partial
information (e.g. if a single document caused two view rows to be emitted but
only one of those was returned).  BTW views do not return the update_seq, so
as an end user you still do not know how up to date it is.

The file format does not need to protect each view row, but does need to do
so for the main database where the unit is a document.

For example, the view file format could use an atomic unit of the view output
of 10,000 documents, or some number of megabytes.  That unit can still be
regenerated if something bad happens (a rare circumstance such as an untimely
power failure).
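
To make the idea concrete, here is a rough sketch in Python.  It is not
CouchDB's view file format, just an illustration under my own assumptions:
rows are JSON, each unit carries a length and a CRC32, and all the names
(write_batches, read_batches, BATCH_DOCS) are made up for this example.

```python
import io
import json
import struct
import zlib

BATCH_DOCS = 10_000            # documents per atomic unit (figure from above)
HEADER = struct.Struct(">II")  # payload length, CRC32 of payload


def _write_unit(out, rows):
    """Write one unit: a fixed-size header followed by the JSON payload."""
    payload = json.dumps(rows).encode("utf-8")
    out.write(HEADER.pack(len(payload), zlib.crc32(payload)))
    out.write(payload)


def write_batches(docs, map_fn, out, batch_docs=BATCH_DOCS):
    """Write view rows in units, each covering batch_docs documents."""
    rows, count = [], 0
    for doc in docs:
        rows.extend(map_fn(doc))  # a document may emit zero or more rows
        count += 1
        if count == batch_docs:
            _write_unit(out, rows)
            rows, count = [], 0
    if count:
        _write_unit(out, rows)    # final, partially filled unit


def read_batches(inp):
    """Yield (index, rows); rows is None when a unit fails its check."""
    index = 0
    while True:
        header = inp.read(HEADER.size)
        if not header:
            return
        if len(header) < HEADER.size:
            yield index, None     # truncated header at end of file
            return
        length, crc = HEADER.unpack(header)
        payload = inp.read(length)
        if len(payload) < length or zlib.crc32(payload) != crc:
            yield index, None     # damaged unit: regenerate from documents
        else:
            yield index, json.loads(payload)
        index += 1


# Usage: 25 hypothetical documents, 10 per unit, one row emitted per document.
docs = [{"_id": str(i), "value": i} for i in range(25)]
buf = io.BytesIO()
write_batches(docs, lambda d: [[d["value"], d["_id"]]], buf, batch_docs=10)
buf.seek(0)
units = list(read_batches(buf))
```

A unit that fails its check comes back as None, and only the documents it
covered need to be run through the map function again.  The sketch does not
cope with a corrupted length field, which would throw off every later unit;
a real format would need some way to resynchronise.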

> If you need something with less consistency, you are encouraged to
> wrap your own indexing system around couchdb's map reduce runtime, or
> even build your own runtime.

I am becoming very tempted to just dump CouchDB for SQLite with a trivial
REST front end, since it appears that CouchDB is just not capable of
handling 10 million documents / 2GB of data in anything resembling a sensible
amount of disk space or compute time for the foreseeable future.
