From: Paul Davis
Date: Tue, 21 Dec 2010 10:57:30 -0500
Subject: Re: CouchDB becoming unusable as Database/Views increase in size.
To: user@couchdb.apache.org

On Tue, Dec 21, 2010 at 10:39 AM, Adam Kocoloski wrote:
> On Dec 21, 2010, at 4:55 AM, Bob Clary wrote:
>
>> Large Initial View sizes: Several of my views are initially created with sizes which are 10-20 times the size of the compacted view. For example, I have one view which when initially created can take 95G but when compacted uses less than 5G. This has caused several out of disk space conditions when I've had to regenerate views for the database. I know commodity disks are relatively cheap these days, but due to my current hosting environment I am using relatively expensive networked storage. Asking for sufficient storage for my expected database size was difficult enough, but asking for 10 or more times that amount just to deal with temporary explosive view sizes is probably a non-starter.
>
> This one is being worked on in https://issues.apache.org/jira/browse/COUCHDB-700 . Guaranteeing a minimum batch size results in a smaller index file and also speeds up indexing in many circumstances.
>
>> CouchDB 1.0.x: My experience with attempting to use the 1.0.x branch was a failure due to the crashing immediately upon view compaction completion which caused the views to begin indexing from scratch.
>
> I agree with Paul that the timeout dropping a ref counter at the end of view compaction is a significant bug. I'm guessing it depends on the particular deployment and size of the file being deleted. There have been multiple attempts [1,2] to rewrite the reference counting system; one of those should probably be merged for 1.2.0. We might be able to have some stopgap fix for 1.0.x and 1.1.x.
>
> I also have to agree with Mike and Paul that BigCouch would help you a lot here. Even if you use it in a single-node setup the ability to split a large monolithic database into an arbitrary number of shards can help tremendously when trying to build and compact indexes. Regards,
>

I should've mentioned this in my earlier email as well, but I'll underscore the point that using BigCouch to shard your db on a single node would still help in splitting the unit of work for a single database.

> Adam
>
> [1]: https://github.com/tilgovi/couchdb/tree/ets_ref_count
> [2]: https://github.com/cloudant/bigcouch/blob/master/apps/couch/src/couch_file.erl#L483