From: Paul Davis
Date: Tue, 21 Dec 2010 08:55:13 -0500
Subject: Re: CouchDB becoming unusable as Database/Views increase in size.
To: user@couchdb.apache.org
In-Reply-To: <4D107979.5030805@bclary.com>

On Tue, Dec 21, 2010 at 4:55 AM, Bob Clary wrote:
> Hi all,
>
> I've been using CouchDB to track the results of testing Firefox and have
> found that as the database and view sizes have increased CouchDB is becoming
> less and less viable as a solution going forward. I don't wish to switch to
> a different database at this time but may not have a choice.
>
> Let me say that I have looked at Jira and found others with similar issues
> although issues have mostly been resolved as invalid or already fixed. I do
> admit that I have a hard time navigating Jira, so it is entirely possible
> I've missed already filed issues. I am not sending this email in a
> threatening fashion that I've seen many times in bugzilla where a user says
> "Fix this or I'm leaving!", but in a plea for some help in finding, filing
> or fixing the appropriate Jira issues which need attention.
>
> My database currently has a compacted size of about 37G and contains a bit
> over 9 million documents. You can see examples of the view documents in the
> error log I attached to .
>

The most immediate thing you could do is switch to BigCouch. Even if
you run multiple BigCouch nodes on a single machine, it should still
help with initial file sizes and view indexing times.

> I am currently using CouchDB 1.0.1 on Centos5 64bit vm with 2CPU and 4G RAM
> running Erlang R14B and configured to use the 64bit js-devel libraries. I
> temporarily tried to use CouchDB 1.0.x to pick up the fix for
> which was causing me
> problems but had to revert to 1.0.1 due to crashes upon view compaction
> completion.
>
> Currently, my main issues are:
>
> Slow View generation: Recreating views from scratch is very slow. It can
> take me over 24 hours to recreate some of the larger views. Combined with
> the need to immediately compact them (see Large Initial View sizes)
> recreating views can take my application offline for users for more than a
> day. Trying to switch to 1.0.x and back and having to regenerate views after
> out of space conditions has led to my application being unavailable for most
> of a week.
>

View generation is definitely slower than I'd like. Again, in the short
term, a switch to BigCouch will help you here because you can rebuild
parts of a view independently, which helps with both time and disk
space.

> Large Initial View sizes: Several of my views are initially created with
> sizes which are 10-20 times the size of the compacted view. For example, I
> have one view which when initially created can take 95G but when compacted
> uses less than 5G. This has caused several out of disk space conditions when
> I've had to regenerate views for the database. I know commodity disks are
> relatively cheap these days, but due to my current hosting environment I am
> using relatively expensive networked storage. Asking for sufficient storage
> for my expected database size was difficult enough, but asking for 10 or
> more times that amount just to deal with temporary explosive view sizes is
> probably a non-starter.
>

How do you have your views laid out? Remember that all the views in a
design document are indexed together in a single file, so it's possible
you could get speedups and smaller files by splitting them across
multiple design docs.
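
To make the "split them across multiple design docs" idea concrete,
here is a minimal sketch in Python. It only rearranges the JSON; it does
not talk to a server. The design doc name, the view names, and the
doc.branch / doc.date fields are made up for illustration, since I don't
know what your real views look like.

```python
import copy


def split_design_doc(ddoc):
    """Return one design document per view found in `ddoc`.

    The new ids are derived from the original id plus the view name;
    that naming scheme is an arbitrary choice for this sketch.
    """
    base_id = ddoc["_id"]
    language = ddoc.get("language", "javascript")
    pieces = []
    for name, view in sorted(ddoc.get("views", {}).items()):
        pieces.append({
            "_id": f"{base_id}_{name}",
            "language": language,
            # Deep copy so the input document is left untouched.
            "views": {name: copy.deepcopy(view)},
        })
    return pieces


# Hypothetical combined design doc with two unrelated views.
combined = {
    "_id": "_design/results",
    "views": {
        "by_branch": {"map": "function(doc) { emit(doc.branch, null); }"},
        "by_date": {"map": "function(doc) { emit(doc.date, null); }"},
    },
}

for piece in split_design_doc(combined):
    print(piece["_id"], list(piece["views"]))
```

Each resulting document would get its own index file, so one view can be
rebuilt or compacted without touching the others. The trade-off is that
every design doc makes its own pass over the database, so views that are
always rebuilt together may be better left in one document.
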
Also, in 1.0.1 you should be able to build a view before using it.
I.e., you create the _design doc with a random id, build its views,
then rename it to its final destination.

Also, depending on your reductions, it's best to use the built-in
reductions if you can.

> CouchDB 1.0.x: My experience with attempting to use the 1.0.x branch was a
> failure due to the crashing immediately upon view compaction completion
> which caused the views to begin indexing from scratch.
>

This is a serious unreported bug. Please add any crash logs to Jira so
we can figure out what's going on here.

> I would appreciate it if you would let me know if some of these are known
> issues which have already been filed in Jira or if it would be helpful to
> file new issues and what additional information I can provide to help get
> these issues resolved.
>
> I can also help in making newer releases of SpiderMonkey 1.7 available and
> to help get SpiderMonkey 1.8 and later released if that will help the
> JavaScript performance issues CouchDB may be facing.
>

I think you'll definitely notice a change with that upgrade. The more
complicated your views are, the more of an impact it should have.

> bc
>
>

HTH,
Paul Davis
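
For reference, the build-then-rename and built-in reduce suggestions
above could look roughly like the following Python sketch. It only
constructs the design document and the list of HTTP requests; nothing is
sent. The helper names, the "_staging" suffix, the database name, and
the doc.branch field are all hypothetical, and using the HTTP COPY
method for the rename step is my assumption about how to do it.

```python
# Reduce functions that CouchDB implements natively.
BUILTIN_REDUCES = {"_sum", "_count", "_stats"}


def design_doc(doc_id, view_name, map_src, reduce=None):
    """Build a single-view design doc, allowing only built-in reduces."""
    if reduce is not None and reduce not in BUILTIN_REDUCES:
        raise ValueError(f"not a built-in reduce: {reduce!r}")
    view = {"map": map_src}
    if reduce is not None:
        view["reduce"] = reduce
    return {"_id": doc_id, "language": "javascript",
            "views": {view_name: view}}


def staged_rollout(db, final_name, view_name, suffix="_staging"):
    """List the requests for building a view under a temporary id first.

    Returns (method, path, headers) tuples; the caller would send them
    with whatever HTTP client it prefers.
    """
    staging_id = f"_design/{final_name}{suffix}"
    return [
        # 1. Upload the design doc under the temporary id.
        ("PUT", f"/{db}/{staging_id}", {}),
        # 2. Query the view once so the index gets built.
        ("GET", f"/{db}/{staging_id}/_view/{view_name}?limit=1", {}),
        # 3. Copy it to the final id. If the destination already
        #    exists, its current revision must be appended as ?rev=...
        ("COPY", f"/{db}/{staging_id}",
         {"Destination": f"_design/{final_name}"}),
    ]


ddoc = design_doc("_design/results_staging", "count_by_branch",
                  "function(doc) { emit(doc.branch, 1); }", reduce="_count")
for method, path, headers in staged_rollout("testdb", "results",
                                            "count_by_branch"):
    print(method, path, headers)
```

The point of the staging id is that users keep querying the old design
doc while the new index builds, and the swap at the end is a single
small request.
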