From: Damien Katz
To: user@couchdb.apache.org
Reply-To: user@couchdb.apache.org
In-Reply-To: <20090429221138.GA22995@translab.its.uci.edu>
References: <20090429221138.GA22995@translab.its.uci.edu>
Subject: Re: how much stuff can one stuff into a CouchDB?
Date: Wed, 29 Apr 2009 15:22:36 -0700

I think the total size of the data isn't a problem, but each access will
require loading the whole document, 25 megs of data, into memory just to
read or update a small bit of data, which can be quite inefficient. Views
will also be slow to update on each document change, since the whole
document has to be loaded into memory and serialized out to the view
engine, etc.

If the documents can be broken up into smaller updatable units, things
will generally work more smoothly, though building views will still be
somewhat slow. If the data can be stored as binary attachments, with just
some metadata about the files kept in the JSON, views will be much more
efficient.
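Roughly, the attachment layout looks something like this over the plain
HTTP API (an untested sketch in Python using the requests library; the
database name, document id, file name, and metadata fields are just
placeholders, not anything CouchDB requires):

import json
import requests

couch = "http://localhost:5984"
db = "traffic"                            # placeholder database name
doc_id = "detector-2009-04-29"            # placeholder document id
src = "detector-2009-04-29.gz"            # placeholder source file

requests.put(f"{couch}/{db}")             # create the database if it doesn't exist yet

# 1. Store only a small metadata document as JSON.
meta = {"source_file": src, "format": "gzip", "rows": 123456}
rev = requests.put(f"{couch}/{db}/{doc_id}", data=json.dumps(meta)).json()["rev"]

# 2. Attach the raw (still gzipped) file to that document.  Views only
#    ever have to process the small JSON body above, never the big blob.
with open(src, "rb") as f:
    requests.put(f"{couch}/{db}/{doc_id}/data.gz",
                 params={"rev": rev},
                 data=f,
                 headers={"Content-Type": "application/gzip"})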
-Damien

On Apr 29, 2009, at 3:11 PM, James Marca wrote:

> Hi All,
>
> On the Wiki, the FAQ says:
> Q: How Much Stuff can I Store in CouchDB?
> A: With node partitioning, virtually unlimited. For a single database
> instance, the practical scaling limits aren't yet known.
>
> Is there some more recent guidance on this? I read the wiki pages
> "Configuring distributed systems", "Partitioning proposal", and "HTTP
> Bulk Document API", but as far as I can tell, node partitioning isn't
> implemented yet (right? things are moving really fast around here!).
> I have about 70G of gzipped files (about 3,000 files) that I need to
> unzip, convert to JSON, and store. Unzipping explodes each file by
> about a factor of 7. I expect that adding the JSON structure will
> increase the data size even more. I read in an earlier posting that
> compacting the database will compress it back down significantly, but
> still, that's a big database file.
>
> I also have the option to break the data up into 9 logical chunks, but
> if I don't have to do it, I'd rather not.
>
> Anybody have any advice or experience with really big databases?
>
> Regards,
> James
> --
> James E. Marca
> Researcher
> Institute of Transportation Studies
> AIRB Suite 4000
> University of California
> Irvine, CA 92697-3600
> jmarca@translab.its.uci.edu