Date: Mon, 3 Nov 2008 00:23:20 -0500
From: "Paul Davis"
To: couchdb-user@incubator.apache.org
Subject: Re: Largest CouchDB dbs?

I can't say that I've seen numbers for data sizes this big. The biggest
number you've got is the rate of turnover in the database. CouchDB makes
quite a few design decisions that optimize for read-heavy workloads, so a
data flow that is lopsided toward writes might be harder to work with.

That being said, my recommendation is to check what kind of throughput
you can get with compaction and replication. I haven't seen numbers on
either of those, and I think both would be essential for this scenario.

HTH,
Paul Davis

On Sun, Nov 2, 2008 at 11:53 PM, Jonathan Ginter wrote:
> I have a similar issue. I am interested in using CouchDB to host a 200+
> GB database that will receive well over 200 million documents per day.
> Moreover, the data must roll out - i.e., constant background purging -
> and also support UI queries. And this is just a starting point to match
> the abilities of the relational database we are already running. I will
> want the DB to scale up from there.
>
> If there is no hope of CouchDB being able to handle all of that -
> regardless of how many machines we deploy - I would like to know that
> now before I look any further into this project.
>
> Does anyone have a reasonable idea about whether CouchDB will be capable
> of such massive scalability or how many machines it would take to scale
> that large?
>
> I would appreciate any feedback that anyone might have on this.
>
> Jonathan
>
> -----Original Message-----
> From: Paul Davis [mailto:paul.joseph.davis@gmail.com]
> Sent: Sunday, November 02, 2008 5:30 PM
> To: couchdb-user@incubator.apache.org
> Subject: Re: Largest CouchDB dbs?
>
> Largest one I know of:
>
> http://www.lixo.org/archives/2008/11/02/announcing-lotsofwordscom/
>
> On Sun, Nov 2, 2008 at 5:23 PM, Ask Bjørn Hansen wrote:
>> What are the largest known production DBs in CouchDB?
>>
>> I'm loading ~3M documents into a database that'll be probably around 10GB
>> and then grow from there. So not very much data, but inserting and updating
>> views is much slower than I expected (or thought they were in tests on
>> earlier versions, is that possible?)
>>
>>
>>
>> - ask
>>
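
For a sense of scale, the figures quoted in the thread (200 million documents per day, a database of roughly 200 GB) can be turned into a sustained write rate with a rough calculation. This is only a back-of-the-envelope sketch, not a benchmark: the function names are made up for illustration, a GB is taken as 10^9 bytes, and the one-day retention window is a purely hypothetical assumption, since the thread does not say how long data is kept before it rolls out.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_writes_per_second(docs_per_day: int) -> float:
    """Average insert rate needed to absorb docs_per_day, spread evenly."""
    return docs_per_day / SECONDS_PER_DAY


def implied_avg_doc_bytes(db_bytes: int, docs_per_day: int,
                          retention_days: float) -> float:
    """Average document size implied by a steady-state database size.

    retention_days is an assumption; the thread does not state it.
    """
    return db_bytes / (docs_per_day * retention_days)


if __name__ == "__main__":
    docs_per_day = 200_000_000          # "well over 200 million" in the thread
    db_bytes = 200 * 10**9              # "200+ GB", taking 1 GB = 10^9 bytes

    rate = required_writes_per_second(docs_per_day)
    print(f"sustained inserts/sec: {rate:.0f}")

    # Hypothetical one-day retention window, for illustration only.
    size = implied_avg_doc_bytes(db_bytes, docs_per_day, retention_days=1)
    print(f"implied average document size: {size:.0f} bytes")
```

With constant purging at steady state, deletions have to keep pace with inserts, so the same rate applies to the purge side as well. Any compaction or replication throughput measured on a test setup would need to be compared against that figure.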