couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: CouchDB 1.0 work
Date Wed, 30 Apr 2008 20:19:28 GMT
Heya Ted,
we definitely want to do a 0.8.0 before going 1.0.
See http://mail-archives.apache.org/mod_mbox/incubator-couchdb-dev/200804.mbox/%3c3651E215-81DC-44A1-93A2-8DF6BD9378A1@gmail.com%3e
ff. for details.

Summary: Wait for cmlenz to get back home :)

Cheers
Jan
--

On Apr 30, 2008, at 22:11, Ted Leung wrote:
> What about trying to make a 0.8 release from the ASF repository?  Or  
> would you rather do this starting at 1.0?
>
> Ted
>
> On Apr 28, 2008, at 9:27 AM, Damien Katz wrote:
>
>> Here are my thoughts on what we need before we can get to
>> CouchDB 1.0. Feedback please.
>>
>> Must have:
>>
>> Incremental reduce: Maybe the single biggest outstanding work item.
>> Probably 2 weeks of development to get to a testable state
>>
>> Security/Document validation: We need a way to control who can  
>> update what documents and to validate the updates are correct. This  
>> is absolutely necessary for offline replication, where replicated  
>> updates to the database do not come through the application layer.
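
A minimal sketch of the validation idea above, in Python rather than Erlang for brevity. Everything here is hypothetical (the `owner` field, `validate_update`, `apply_update`, and the dict standing in for a database); it only illustrates checking every write inside the database layer, so replicated updates are validated too.

```python
# Hypothetical sketch: none of these names come from CouchDB itself.
class ValidationError(Exception):
    """Raised when an update must be rejected."""


def validate_update(new_doc, old_doc, user):
    """Reject an update unless it is well-formed and made by the owner."""
    if "owner" not in new_doc:
        raise ValidationError("document must have an owner")
    if old_doc is not None:
        if old_doc["owner"] != user:
            raise ValidationError("only the owner may update this document")
        if new_doc["owner"] != old_doc["owner"]:
            raise ValidationError("owner cannot be changed")


def apply_update(db, doc_id, new_doc, user):
    """Single write path: local and replicated updates both pass through here."""
    validate_update(new_doc, db.get(doc_id), user)
    db[doc_id] = new_doc
```

Because the check sits in the write path rather than in the application, an update arriving via replication is rejected the same way as a local one.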
>>
>> View index compaction/management: View indexes currently just grow,  
>> need a compaction similar to storage compaction. Also, there is no  
>> way to purge old unused indexes, except via the OS.
>>
>> File sync problem: file:sync(), a call that flushes all uncommitted  
>> writes to disk before returning, doesn't work fully, or at all, on
>> some platforms (usually we just lack the flags to tell the OS
>> to write to disk). Should be fixable by either patching the  
>> existing Erlang driver source, or using a replacement file driver.
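
To illustrate the kind of platform-specific flag involved, here is a sketch in Python (not the Erlang driver). The function names `full_sync` and `durable_append` are made up for this example. It assumes the macOS case, where a plain fsync may not force the drive to flush its own cache and F_FULLFSYNC is needed; whether that is the exact problem described above is an assumption.

```python
import os
import sys

try:
    import fcntl  # not available on Windows
except ImportError:
    fcntl = None


def full_sync(fd):
    """Flush a file descriptor to stable storage as strongly as the platform allows."""
    if sys.platform == "darwin" and fcntl is not None and hasattr(fcntl, "F_FULLFSYNC"):
        # On macOS, ask the drive itself to flush its write cache.
        fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
    else:
        os.fsync(fd)


def durable_append(path, data):
    """Append bytes and return only after the sync call has completed."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()            # push Python's buffer to the OS
        full_sync(f.fileno())
```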
>>
>> Optimizations. Right now HTTP overhead is huge, with HTTP latency/
>> overhead at about 80% of our document read time when loaded from
>> a local client (same machine). Once we can get this down to below
>> 50%, we'll focus on optimizing the database and other components.
>> Most core database operations (document reads, updates and view
>> indexing) are completely unoptimized so far, with update speed
>> being the biggest complaint.
>>
>> Testing: We need lots more tests. By the time we ship 1.0, we  
>> should have far more test suite code than production code. And we  
>> need to do load testing. Can the current browser-based test suite
>> scale for this kind of heavy testing?
>>
>> Nice to have:
>>
>> Plug-ins: Erlang module plug-in architecture, to make adding new
>> server side code easy. Right now the code that maps special urls  
>> (_view, _compact, _search, etc) to the appropriate Erlang call is  
>> messy and convoluted, and getting worse as we go. We need a  
>> standard way to map the special urls to the appropriate Erlang call.
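
One possible shape for such a mapping, sketched in Python rather than Erlang. The registry, the decorator, and the handler signatures are all hypothetical; the point is only that each special URL registers itself in one table and the dispatcher stays a few lines long.

```python
# Hypothetical registry: special URL segment -> handler function.
HANDLERS = {}


def special_url(name):
    """Decorator that registers a handler for a special URL such as _view."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@special_url("_view")
def handle_view(db, rest):
    return ("view", db, rest)


@special_url("_compact")
def handle_compact(db, rest):
    return ("compact", db, rest)


def dispatch(path):
    """Route /db/_special/... to its handler; anything else is a document request."""
    parts = path.strip("/").split("/")
    db, rest = parts[0], parts[1:]
    if rest and rest[0] in HANDLERS:
        return HANDLERS[rest[0]](db, rest[1:])
    return ("document", db, rest)
```

Adding a new special URL then means writing one decorated function, with no change to `dispatch`.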
>>
>> Tail committed database headers: To optimize database updates by
>> reducing the number and length of seeks required, the
>> file header should be written to the end of the file, rather than
>> the beginning. Depending on platform this can remove a full
>> head seek and in the best case scenario a document insert/update can
>> require zero head seeks (if the head is already positioned at the
>> end of the file). But this can slow file opening speed as it may
>> need to do a search in the file for the most recent valid header.
>> In the event of a crash, the header scan/search cost at database
>> open can be linear or logarithmic, depending on the exact
>> implementation.
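
A rough Python sketch of the linear variant of that header search. The layout is entirely assumed for illustration: headers are written only at 4096-byte block boundaries and consist of a magic marker, a root offset, and a CRC32. None of this reflects an actual CouchDB file format, and a real design would also need to stop document data from imitating a header.

```python
import struct
import zlib

BLOCK = 4096                      # assumed header alignment
MAGIC = b"HDR1"                   # assumed marker
HEADER = struct.Struct(">4sQI")   # magic, root offset, CRC32 of magic + root


def append_header(f, root_offset):
    """Pad to the next block boundary, then append a checksummed header."""
    f.seek(0, 2)
    f.write(b"\0" * (-f.tell() % BLOCK))
    body = struct.pack(">4sQ", MAGIC, root_offset)
    f.write(body + struct.pack(">I", zlib.crc32(body)))


def find_latest_header(f):
    """Scan block boundaries backward from the end for the last valid header."""
    f.seek(0, 2)
    pos = (f.tell() // BLOCK) * BLOCK
    while pos >= 0:
        f.seek(pos)
        raw = f.read(HEADER.size)
        if len(raw) == HEADER.size:
            magic, root, crc = HEADER.unpack(raw)
            if magic == MAGIC and crc == zlib.crc32(raw[:12]):
                return root
        pos -= BLOCK              # torn or missing header: try the previous block
    return None
```

After a clean shutdown the first block checked usually holds the header; after a crash the scan walks back past any partially written tail, which is the linear cost mentioned above.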
>>
>> Clustering: The ability to cluster CouchDB servers, to increase  
>> both reliability (failover-clustering) and client scalability (more  
>> servers to handle more concurrent user load). Clustering does not
>> increase data scalability (that's partitioning/sharding).
>>
>> Selective document purging/compaction: Deletion stubs are kept  
>> around for replication purposes. Need a way to purge the records of
>> documents that are old or deleted.
>>
>> Revision rev path pruning: Each document keeps a list of all  
>> previous revisions. We need a way to prune the oldest records of  
>> document revisions and remerge pruned lists during replication.
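
A small Python sketch of pruning and remerging, under strong simplifying assumptions: a revision path is a non-empty list of revision ids ordered oldest to newest, with no branches, and two paths can only be merged if they still share a revision after pruning. The function names are hypothetical.

```python
def prune(revs, limit):
    """Keep only the newest `limit` revision ids (limit must be at least 1)."""
    return revs[-limit:]


def merge_paths(local, remote):
    """Merge two possibly pruned paths; return None if they cannot be joined."""
    if local[-1] in remote:
        # Remote is at or ahead of local: append whatever comes after our tip.
        return local + remote[remote.index(local[-1]) + 1:]
    if remote[-1] in local:
        # Local already contains everything remote knows about.
        return local
    # No shared tip: the paths diverged or were pruned too far. Treat as a conflict.
    return None
```

The last case shows the trade-off: the more aggressively paths are pruned, the more often replication loses the common revision and has to fall back to conflict handling.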
>>
>> Don't Need:
>>
>> Authentication. We can go to 1.0 without authentication, relying  
>> instead on local proxies to provide authentication.
>>
>> Partitioning. Partitioning is a big project with lots of
>> considerations. It's best to move this post 1.0.
>
>

