From: "Jeffrey M. Barber" <zengeneral@gmail.com>
To: user@couchdb.apache.org
Date: Sun, 9 Jan 2011 16:13:12 -0600
Subject: Re: operational file size

Thank you both.

On Sun, Jan 9, 2011 at 4:05 PM, Randall Leeds wrote:
> On Sun, Jan 9, 2011 at 12:44, Bob Clary wrote:
> > Jeffrey,
> >
> > Randall makes several good points and covers many of the issues you
> > will need to handle; however, I'd like to chime in with some of the
> > lessons I have learned from my own experience.
> >
> > The estimate that your maximum database size should be less than 1/2
> > of your free disk space is a good starting point, but you also need
> > to consider the disk space consumed by your views. They, too, will
> > require up to twice their size to compact. If your view sizes are on
> > the same order as your database size, then you can expect your
> > maximum database size to be 1/4 of your free disk space. This
> > doesn't take into account the current issue in CouchDB where some
> > initial view sizes may be 10-20 times their final compacted size.
> >
> > Regularly compacting your database *and* views is critical to
> > limiting your maximum disk usage.
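(Aside: Bob's arithmetic above — database compaction needs up to 2x the database size, and view compaction up to 2x the view size — can be sketched in a few lines of Python. The function name and the view-to-database ratio parameter are illustrative, not something from the thread.)

```python
def max_safe_db_size(free_bytes, view_to_db_ratio=1.0):
    """Largest compacted database that still leaves room to compact.

    Database compaction can need up to 2x the database size, and view
    compaction up to 2x the total view size, so the headroom required
    is 2 * (db + views) = 2 * db * (1 + view_to_db_ratio).
    """
    return free_bytes / (2.0 * (1.0 + view_to_db_ratio))

# With 100 GB free and views roughly as large as the database itself,
# the safe maximum works out to about 25 GB -- Bob's 1/4 rule.
print(max_safe_db_size(100e9, view_to_db_ratio=1.0))  # 25 GB
```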
> > Until the issue where compaction leaves file handles open for
> > deleted old copies of files is resolved, you will also need to
> > periodically restart your CouchDB server in order to free the space
> > held by the old versions of the files. Monitoring not only the
> > database and view sizes but also the actual free space reported by
> > the system is important. If you see the free space continuing to
> > decrease to a dangerous level after repeated compactions, you need
> > to restart the database or risk running out of space on the entire
> > machine.
>
> The issue you refer to is here [1], and it has been fixed for the
> upcoming 1.0.2 and 1.1 releases.
>
> > The strategy of replicating to bigger machines will work up to a
> > point (see below), as long as the load on your database is not too
> > great and the database and views do not need to be compacted too
> > often. However, replicating a large database with millions of
> > documents will take a long time, and you may not have sufficient
> > time to move to a larger machine before you run out of space if the
> > database and views need to be compacted several times during the
> > replication.
> >
> > Finally, once your database views grow large enough, you will run
> > into the issue where CouchDB crashes after compacting your views,
> > resulting in the view being deleted and having to be recreated from
> > the beginning. This view creation-compaction-crash-creation cycle
> > can take more than a day with a large database, will leave any
> > parts of your application that depend on these views unusable, and
> > won't be resolved by replicating to a machine with a larger disk.
>
> That's a more disturbing issue, and it looks like no one has
> addressed it yet. I'll comment on the JIRA ticket and see if we can
> get some movement on it. I know it hasn't been around forever, since
> older releases did not exhibit this behavior. I bet we can track it
> down.
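(Aside: the free-space monitoring Bob recommends can be sketched like this. The data-directory path and the 10% danger threshold are placeholder assumptions; the thread doesn't specify either.)

```python
import shutil

def free_space_ratio(path):
    """Fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total

def restart_recommended(ratio, danger=0.10):
    # If free space stays below the danger threshold even after
    # repeated compactions, deleted-but-still-open compaction files
    # are likely holding the space; per the thread, restarting
    # CouchDB releases it.
    return ratio < danger

# Example (path is a placeholder for your CouchDB data directory):
# restart_recommended(free_space_ratio("/var/lib/couchdb"))
```

Run from cron alongside your compaction schedule, something like this catches the leaked-space condition before the whole machine fills up.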
> > In summary, I think the initial free disk space should be 4 times
> > the expected size of your database and that, depending on your
> > views, there is currently an absolute limit beyond which CouchDB
> > will become unusable. In my case it was a compacted database of
> > about 40 GB holding roughly 10 million documents.
> >
> > bc
> >
> > On 1/8/11 12:31 PM, Randall Leeds wrote:
> >>
> >> It's hard to estimate how big the compacted database will be given
> >> the size of the original. In the worst case (when your database is
> >> already compacted), compacting it again will double your usage,
> >> since it creates a whole new, optimized copy of the database file.
> >>
> >> More likely, the original is not compact, and so the new file will
> >> be much smaller.
> >>
> >> Clearly, then, if you want to be ultra-safe, no single database
> >> should exceed 50% of your capacity. However, it is safe to have
> >> many small databases such that the total disk consumption is much
> >> higher.
> >>
> >> The best approach is to regularly compact your databases and track
> >> the usage and size differences so you get a good sense of how fast
> >> you're growing. And remember: if you find yourself in a sticky
> >> situation where you can't compact, you probably still have plenty
> >> of time to replicate to a bigger machine or to a hosted cluster
> >> such as the one offered by Cloudant. Good monitoring is the best
> >> way to avoid disaster.
> >>
> >> On Sat, Jan 8, 2011 at 10:39, Jeffrey M. Barber wrote:
> >>>
> >>> If I'm running CouchDB with 100GB of disk space, what is the
> >>> maximum CouchDB database size such that I'm still able to
> >>> optimize?
> >>>
> >>> I remember running out of room on a Rackspace machine, and I got
> >>> the strangest error codes when trying to run CouchDB.
> >>>
> >>> -J
>
> [1] https://issues.apache.org/jira/browse/COUCHDB-926