couchdb-dev mailing list archives

From Jan Lehnardt <>
Subject Re: 1.0.0 wishlist/roadmap
Date Wed, 03 Dec 2008 11:53:40 GMT

On 3 Dec 2008, at 12:47, Volker Mische wrote:

> An additional feature would be that you can return any arbitrary JSON
> to the view, which will be attached to the resulting document. An
> example would be returning the distance between a point specified in
> the query and a geometry in a document.

as opposed to the "rank" the protocol uses now which is "limited" to  


> Damien Katz wrote:
>> Here is some stuff I'd like to see in a 1.0.0 release. Everything is
>> open for discussion.
>> - Built-in reduce functions to avoid unnecessary JS overhead -
>> Count, Sum, Avg, Min, Max, Std dev. Others?
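The built-in reductions listed above can be thought of as simple folds over the emitted values. A rough Python sketch of that idea (function names and the stats layout are illustrative, not CouchDB's actual API):

```python
import math

def reduce_count(values):
    return len(values)

def reduce_sum(values):
    return sum(values)

def reduce_stats(values):
    # One pass over the values gathers everything needed for
    # avg/min/max/stddev; assumes a non-empty list.
    n = len(values)
    s = sum(values)
    sq = sum(v * v for v in values)
    return {
        "count": n,
        "sum": s,
        "min": min(values),
        "max": max(values),
        "avg": s / n,
        # population standard deviation from sum and sum of squares
        "stddev": math.sqrt(sq / n - (s / n) ** 2),
    }
```

Doing these folds natively avoids round-tripping every value through the JavaScript view server just to add numbers.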
>> - Restrict database read access -
>> Right now any user can read any database; we need to be able to
>> restrict that, at least at the whole-database level.
>> - Replication performance enhancements -
>> Adam Kocoloski has some replication patches that greatly improve
>> replication performance.
>> - Revision stemming: It should be possible to limit the number of
>> revisions tracked -
>> By default each document edit produces a revision id that is tracked
>> indefinitely. This guarantees that conflicts versus subsequent edits
>> can always be distinguished in ad-hoc replication; however, the
>> ever-growing list of revisions isn't always desirable. This can be
>> addressed by limiting the number tracked and purging the oldest
>> revisions. The downside is that if the revision tracking limit is N,
>> then anyone who hasn't replicated a document since its last N edits
>> will see a spurious edit conflict.
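The stemming trade-off above can be shown with a toy model (function names are made up for the example): once the shared ancestor is purged, two histories no longer intersect and the edit looks like a conflict.

```python
def stem_revisions(rev_history, limit):
    """rev_history is oldest-first; keep only the newest `limit` ids,
    purging the oldest."""
    return rev_history[-limit:]

def shares_ancestor(history_a, history_b):
    """True if the two replicas still have a revision in common, i.e.
    one edit can be recognized as a descendant of the other rather
    than a spurious conflict."""
    return bool(set(history_a) & set(history_b))
```

A replica that last synced more than `limit` edits ago holds only purged revision ids, so `shares_ancestor` fails and the next replication reports a conflict even though the edits were sequential.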
>> - Lucene/Full-text indexing integration -
>> We have this working in side patches; it needs to be integrated into
>> trunk and with the view engine.
>> - Incremental document replication -
>> We need, at the minimum, the ability to incrementally replicate only
>> the attachments that have changed in a document. This will save lots
>> of network IO, and CouchDB can be a version control system with
>> document diffs added as attachments.
>> This can work for document fields too, but the overhead may not be
>> worth it.
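One way to replicate only changed attachments is to exchange a manifest of content hashes and fetch just the names that differ; a minimal sketch, with the manifest format invented for the example:

```python
import hashlib

def manifest(attachments):
    """Map attachment name -> content hash for a document's
    attachments (name -> bytes)."""
    return {name: hashlib.sha1(data).hexdigest()
            for name, data in attachments.items()}

def changed_attachments(source_attachments, target_manifest):
    """Names the target must fetch: attachments that are new on the
    source or whose content hash differs."""
    src = manifest(source_attachments)
    return [name for name, digest in src.items()
            if target_manifest.get(name) != digest]
```

Only the manifest crosses the wire up front; unchanged attachment bodies are never re-sent.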
>> - Built-in authentication module(s) -
>> The ability to host a CouchDB database used for HTTP authentication
>> schemes. If storing passwords, they would need to be stored
>> encrypted, decrypted on demand by the authentication process.
>> - View server enhancements (stale/partial index option) -
>> Chris Anderson has a side branch for this; we need to finish it and
>> put it into trunk.
>> - View index compaction -
>> View indexes grow forever and need to be compacted in a similar way
>> to how the storage files are compacted. This work will tie into the
>> View Server enhancements.
>> - Document integrity/deterministic revid -
>> For the sake of end-to-end document integrity, we need a way to hash
>> a document's contents, and since we already have revision ids, I
>> think the revision ids should be the hashes. The hashed document
>> should be a canonical JSON representation, and it should have the
>> _id and _rev fields in it. The _rev will be the PREVIOUS revision
>> id/hash the edit is based on, or blank if a new edit. Then the _rev
>> is replaced with the new hash value.
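The chained-hash scheme above can be sketched in a few lines. The canonicalization here (sorted keys, minimal separators, SHA-1) is an assumption for illustration, not CouchDB's actual format:

```python
import hashlib
import json

def next_rev(doc, prev_rev=""):
    """Produce a new revision of `doc` whose _rev is the hash of the
    canonical JSON body with the PREVIOUS revision id in _rev."""
    body = dict(doc)
    body["_rev"] = prev_rev          # hash covers the previous rev id
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    new_rev = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    body["_rev"] = new_rev           # then _rev becomes the new hash
    return body
```

Because the hash is deterministic, two nodes making the identical edit on the identical base revision derive the identical revision id.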
>> - Fully tail append writes -
>> CouchDB uses zero-overwrite storage, but not fully tail-append
>> storage. Document JSON bodies are stored in internal buffers,
>> written consecutively, one after another, until the buffer is
>> completely full; then another buffer is created at the end of the
>> file for more documents. File attachments are written to similar
>> buffers as well. Btree updates are always tail-append: each update
>> to a btree, even if it's a deletion, causes new writes to the end of
>> the file. Once the documents, attachments and indexes are committed
>> (fsync), the header is then written and flushed to disk, and that is
>> always stored right at the beginning of the file (requiring another
>> seek).
>> Document updates to CouchDB require 2 fsyncs with ~3 seeks for full
>> committal and index consistency. This is true whether you write 1 or
>> 1000 documents in a single transaction (bulk update); you still need
>> ~3 seeks. Using conventional transaction journalling, it's possible
>> to get the committal down to a single seek and fsync, and worry
>> about ensuring file and index consistency asynchronously, often in
>> batch mode with other committed updates. This can perform very well,
>> but has downsides like extra complexity and increased memory usage
>> as data is cached waiting to be flushed to disk, and it must do
>> special consistency checks and fix-ups on startup if there is a
>> crash.
>> If CouchDB used tail-append storage for everything, then all
>> document updates could be completely flushed with full file
>> consistency with a single seek and, depending on the file system, a
>> single fsync. All the disk updates (documents, file attachments,
>> indexes and the file header) occur as appends to the end of the
>> file.
>> The biggest changes will be in how file attachments and the headers
>> are written and read, and in the performance characteristics of view
>> indexing, as documents will no longer be packed into contiguous
>> buffers.
>> File attachments will be written in chunks, with the last chunk
>> being an index to the other chunks.
>> Headers will be specially signed blocks written to the end of the
>> file. Reading the header on database open will require scanning the
>> file from the end, since the file might have partial updates that
>> didn't complete since the last update.
>> The performance of the views will be impacted, as the documents are
>> more likely to be fragmented across the storage file. But they will
>> still be in the order they will be accessed for indexing, so the
>> read seeks are always moving forward. Also, the act of compacting
>> the storage file will result in the documents being tightly packed
>> again.
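The tail-append header scheme above can be modeled in miniature: headers are magic-prefixed blocks appended (and fsynced) at the end of the file, and on open we scan backwards past any partial trailing write for the most recent intact one. The magic prefix and newline framing here are invented for the example:

```python
import os

MAGIC = b"HDR1:"

def append_header(f, payload):
    """Append a signed header block to the end of the file and force
    it to disk; no seek back to the file's beginning is needed."""
    f.seek(0, os.SEEK_END)
    f.write(MAGIC + payload + b"\n")
    f.flush()
    os.fsync(f.fileno())

def last_header(raw):
    """Scan the file contents from the end for the most recent intact
    header, skipping any partial update that didn't complete."""
    for line in reversed(raw.split(b"\n")):
        if line.startswith(MAGIC):
            return line[len(MAGIC):]
    return None
```

A crash mid-append leaves garbage at the tail, but the backward scan simply skips it and recovers the previous committed header.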
>> - Streaming document updates with attachment writes -
>> Using MIME multipart encoding, it should be possible to send all
>> parts of a document in a single HTTP request, with the JSON and
>> binary attachments sent as different MIME parts. Attachments can be
>> streamed to disk as bytes are received, keeping total memory
>> overhead to a minimum.
>> Attachments can also be written to disk in compressed format and
>> served over HTTP by default in that compressed format, using 0% CPU
>> for compression at read time, but this will require decompression if
>> the client doesn't support the compression format.
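The shape of such a request can be sketched by assembling the JSON body and each attachment as separate parts of one payload; the boundary string and part headers below are made up for illustration, not a wire format CouchDB defines:

```python
import json

def multipart_doc(doc, attachments, boundary="doc-boundary"):
    """Assemble one multipart payload: the document's JSON body first,
    then each attachment (name -> bytes) as its own part."""
    parts = [("--%s\r\nContent-Type: application/json\r\n\r\n"
              % boundary).encode() + json.dumps(doc).encode()]
    for name, data in attachments.items():
        head = ("--%s\r\n"
                "Content-Disposition: attachment; filename=\"%s\"\r\n"
                "\r\n" % (boundary, name)).encode()
        parts.append(head + data)
    parts.append(("--%s--\r\n" % boundary).encode())  # closing boundary
    return b"\r\n".join(parts)
```

On the server side, each part can be spooled to disk as it arrives, so memory use stays bounded regardless of attachment size.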
>> - Partitioning/Clustering Support -
>> Clustering for failover and load balancing is a priority. Large
>> database support via partitioning may not make 1.0.
