From: "Ben Browning" <ben324@gmail.com>
To: couchdb-user@incubator.apache.org
Date: Tue, 25 Nov 2008 07:03:51 -0500
Subject: Re: another CouchDB evaluation Q

I also have a system that will be doing a large number of small writes. I'll try to answer your questions as best I can based on my experience so far.

On Mon, Nov 24, 2008 at 11:06 PM, Liam Staskawicz wrote:
> - is there anything intrinsic to the CouchDB design that makes it less than
> ideal for frequent small writes? I've read a lot about how CouchDB is good
> for read-mostly applications, but not much about what kinds of tradeoffs or
> compromises might be involved in write-heavy use.

From my initial testing, write speed is not CouchDB's strong point.
With constant small writes your data files will grow very quickly and need frequent compaction. If you are always under heavy write load, compaction will take much longer and may require you to pause or throttle writes so it can finish.

To combat these problems I plan to have multiple write slaves. Writes will be load-balanced across the write slaves and then replicated from the slaves to my read master. If a delay between data being written and when it can be read is acceptable (and from your description it sounds like that's fine), this is the best approach I see. You can add more write slaves as needed, and they act as a sort of buffer. Set up a mechanism to periodically replicate the writes from the slaves to the master, then perform your analysis on the data in the master. Replication should be much faster than many individual small writes and shouldn't require compaction on the master as often.

The slaves themselves won't need compaction: create a new database on a particular slave, add that new database to the load-balancer pool, remove the slave's existing database from the pool, let that database finish replicating its changes, and then delete it.

My application isn't in production yet, but I hope to have it running by the end of the year or very early next year. So take this advice with a grain of salt until I've actually seen the real-world performance. It sounds good on paper, though.

> - is the notion of a variety of sources (threads) of write data an issue?
> I've gathered that it's not from my investigations so far, but would like
> to make sure I've understood everything properly.

There are no limitations here that I know of. I've got a heavily concurrent web front-end inserting data and haven't seen any issues.

- Ben
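P.S. To make the "periodically replicate from the slaves to the master" step concrete, here's a rough sketch in Python of what I have in mind. The host names, database name, polling setup, and the use of the requests library are just placeholder assumptions for illustration; POST /_replicate and POST /<db>/_compact are the standard CouchDB endpoints.

import requests

MASTER = "http://master:5984"          # single read master (placeholder host)
WRITE_SLAVES = ["http://slave1:5984",  # write slaves behind the load balancer
                "http://slave2:5984"]
DB = "events"                          # placeholder database name

def replicate_slaves_to_master():
    # Pull each write slave's database into the read master.
    for slave in WRITE_SLAVES:
        resp = requests.post(MASTER + "/_replicate",
                             json={"source": "%s/%s" % (slave, DB),
                                   "target": "%s/%s" % (MASTER, DB)})
        resp.raise_for_status()

def compact_master():
    # Compaction runs in the background; this POST returns immediately.
    resp = requests.post("%s/%s/_compact" % (MASTER, DB),
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()

if __name__ == "__main__":
    replicate_slaves_to_master()
    compact_master()

You'd run something like this from cron (or a small loop) at whatever interval your freshness requirements allow.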