From: Jan Lehnardt
To: couchdb-dev@incubator.apache.org
Subject: Re: CouchDB 1.0 work
Date: Wed, 30 Apr 2008 17:47:19 +0200
Message-Id: <46A7D0DE-ABE2-4533-8D52-61CFB84E088E@apache.org>
In-Reply-To: <6A14F004-1449-4FC9-A2EE-47BC1CAF9FED@yahoo.com>

Additional thoughts:

Must have:

Refactoring of attachment API as per earlier discussions and
proposal by Christopher.

Nice to have:

In the course of speed and load tests, and ultimately when running a
live system, it would be nice to have more introspection: basically
counters and stats to evaluate the state of a CouchDB node.

Cheers
Jan
--

On Apr 28, 2008, at 18:27, Damien Katz wrote:

> Here are my thoughts on what we need before we can get to
> CouchDB 1.0. Feedback please.
>
> Must have:
>
> Incremental reduce: Maybe the single biggest outstanding work item.
> Probably 2 weeks of development to get to a testable state.
>
> Security/Document validation: We need a way to control who can
> update what documents and to validate that the updates are correct.
> This is absolutely necessary for offline replication, where
> replicated updates to the database do not come through the
> application layer.
>
> View index compaction/management: View indexes currently just grow
> and need a compaction similar to storage compaction. Also, there is
> no way to purge old unused indexes, except via the OS.
>
> File sync problem: file:sync(), a call that flushes all uncommitted
> writes to disk before returning, doesn't work fully or at all on
> some platforms (usually we just lack the flags to tell the OS to
> write to disk). Should be fixable by either patching the existing
> Erlang driver source, or using a replacement file driver.
>
> Optimizations. Right now HTTP overhead is huge, with HTTP latency/
> overhead at about 80% of our document read time when loaded from a
> local client (same machine). Once we can get this down to below 50%,
> we'll focus on optimizing the database and other components. Most
> core database operations (document reads, updates and view indexing)
> are completely unoptimized so far, with update speed being the
> biggest complaint.
>
> Testing: We need lots more tests. By the time we ship 1.0, we should
> have far more test suite code than production code.
> And we need to do load testing. Can the current browser-based test
> suite scale for this kind of heavy testing?
>
> Nice to have:
>
> Plug-ins: Erlang module plug-in architecture, to make adding new
> server side code easy. Right now the code that maps special urls
> (_view, _compact, _search, etc) to the appropriate Erlang call is
> messy and convoluted, and getting worse as we go. We need a standard
> way to map the special urls to the appropriate Erlang call.
>
> Tail committed database headers: To optimize the updating of the
> database by reducing the number and length of seeks required, the
> file header should be written to the end of the file, rather than
> the beginning. Depending on platform this can remove a full head
> seek, and in the best case scenario a document insert/update can
> require zero head seeks (if the head is already positioned at the
> end of the file). But this can slow file opening speed, as it may
> need to do a search in the file for the most recent valid header. In
> the event of a crash, the header scan/search cost at database open
> can be linear or logarithmic, depending on the exact implementation.
>
> Clustering: The ability to cluster CouchDB servers, to increase both
> reliability (failover-clustering) and client scalability (more
> servers to handle more concurrent user load). Clustering does not
> increase data scalability (that's partitioning/sharding).
>
> Selective document purging/compaction: Deletion stubs are kept
> around for replication purposes. Need a way to purge the records of
> documents that are old or deleted.
>
> Revision rev path pruning: Each document keeps a list of all
> previous revisions. We need a way to prune the oldest records of
> document revisions and remerge pruned lists during replication.
>
> Don't Need:
>
> Authentication. We can go to 1.0 without authentication, relying
> instead on local proxies to provide authentication.
>
> Partitioning.
> Partitioning is a big project with lots of
> considerations. It's best to move this post 1.0.
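
To make the "tail committed database headers" item quoted above more concrete, here is a minimal Python sketch of the open-time scan it describes. This is not CouchDB's file format or code. The 4096-byte alignment, the `HDR1` marker, the CRC32 check and all function names are assumptions made up for illustration. It shows the linear variant: walk block boundaries backwards from the end of the file until a header with a valid checksum turns up.

```python
import os
import struct
import tempfile
import zlib

BLOCK = 4096                      # hypothetical alignment for header records
MAGIC = b"HDR1"                   # hypothetical marker, not a real CouchDB value
PREFIX = struct.Struct(">4sII")   # magic, payload length, CRC32 of payload


def append_data(f, data: bytes) -> None:
    """Append document data at the current end of the file."""
    f.seek(0, os.SEEK_END)
    f.write(data)


def append_header(f, payload: bytes) -> None:
    """Pad to the next block boundary, then append a checksummed header."""
    f.seek(0, os.SEEK_END)
    f.write(b"\0" * (-f.tell() % BLOCK))
    f.write(PREFIX.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload)
    f.flush()
    os.fsync(f.fileno())          # ask the OS to push the commit to disk


def find_latest_header(f):
    """Scan block boundaries from the end; return the newest valid payload or None."""
    f.seek(0, os.SEEK_END)
    pos = (f.tell() - 1) // BLOCK * BLOCK   # last boundary inside the file
    while pos >= 0:
        f.seek(pos)
        prefix = f.read(PREFIX.size)
        if len(prefix) == PREFIX.size:
            magic, length, crc = PREFIX.unpack(prefix)
            if magic == MAGIC:
                payload = f.read(length)
                # A torn or garbage header fails the length or CRC check.
                if len(payload) == length and zlib.crc32(payload) == crc:
                    return payload
        pos -= BLOCK
    return None


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        append_data(f, b"doc1")
        append_header(f, b"root=1")
        append_data(f, b"doc2")
        append_header(f, b"root=2")
        append_data(f, b"doc3 written, but no header committed yet")
        print(find_latest_header(f))
```

After a clean shutdown the scan stops at the first boundary it checks. After a crash it has to step back past whatever was written since the last committed header, which is the linear cost mentioned in the quoted text. A logarithmic variant would need extra structure in the file that this sketch does not attempt.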