Subject: Re: Building IFI View for Text Queries
From: Chris Anderson
To: user@couchdb.apache.org
Date: Wed, 6 Jan 2010 13:01:42 -0800

On Wed, Jan 6, 2010 at 12:57 PM, Nic Pottier wrote:
> On Wed, Jan 6, 2010 at 12:39 PM, Chris Anderson wrote:
>>> Any way to get an insight as to how big the index is?  I can see how
>>> big my database is (78M with ~11k docs) but I'd be curious to know how
>>> big that view is stored in memory.
>>
>> The view is stored on disk. Look in the CouchDB data directory
>> /usr/local/var/lib/couchdb for the view directory.
>
> I only see the primary database file here, so I guess I get a feeling
> for the total size, but not what portion of that size is from the
> view.  I suppose I could delete it, look at the size, then rebuild,
> comparing the growth?

there is actually a separate index directory called
.my_db_name_design/ inside that directory. within it is 1 index file
per design document. that file size is the actual index size.

>
>> Our reduce is not key-bounded, so [id array] would end up being the
>> list of unique ids in the entire database for full-reduce.
>
> Ok, that's kind of what I suspected.  Are there any plans to offer
> multiple levels of mapping?  It seems like it would still fit into
> pattern of individual updates and tree aggregation and could allow for
> fast recreation of these kind of indexes.  Just a random question /
> idea..

we're definitely into alternate query engines. Lucene is pretty
popular with CouchDB, and the way it is kept up to date is the same
architecture as the "built-in" map reduce. This is also how you'd hook
up SQLite or Neo4j.

>
>> The storage inefficiency you describe is likely what would force you
>> from a pure Couch to a Lucene FTI solution first, as your data begins
>> to scale.
>
> Understood.. I'll take another look at the Lucene integration again..
> how many people are using that?
>

-- 
Chris Anderson
http://jchrisa.net
http://couch.io