Mailing-List: contact user-help@couchdb.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@couchdb.apache.org
MIME-Version: 1.0
Sender: jchris@gmail.com
In-Reply-To: <f334ade01001061257m7d67846lf71fc10da8f7e866@mail.gmail.com>
References: <f334ade01001061010ka48f05fh9f0cda30122b4a4c@mail.gmail.com>
	 <e282921e1001061048j60a23781h740c9078d423a050@mail.gmail.com>
	 <f334ade01001061110q55460c0cq1fcc6949d7ed355@mail.gmail.com>
	 <e282921e1001061239n74f270f7p7e259510fc5a5b7f@mail.gmail.com>
	 <f334ade01001061257m7d67846lf71fc10da8f7e866@mail.gmail.com>
Date: Wed, 6 Jan 2010 13:01:42 -0800
Message-ID: <e282921e1001061301m530231ddna043982d356adce8@mail.gmail.com>
Subject: Re: Building IFI View for Text Queries
From: Chris Anderson <jchris@apache.org>
To: user@couchdb.apache.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On Wed, Jan 6, 2010 at 12:57 PM, Nic Pottier <nicpottier@gmail.com> wrote:
> On Wed, Jan 6, 2010 at 12:39 PM, Chris Anderson <jchris@apache.org> wrote:
>>> Any way to get an insight as to how big the index is? I can see how
>>> big my database is (78M with ~11k docs) but I'd be curious to know how
>>> big that view is stored in memory.
>>
>> The view is stored on disk. Look in the CouchDB data directory
>> /usr/local/var/lib/couchdb for the view directory.
>
> I only see the primary database file here, so I guess I get a feeling
> for the total size, but not what portion of that size is from the
> view. I suppose I could delete it, look at the size, then rebuild,
> comparing the growth?

There is actually a separate index directory called

.my_db_name_design/

inside that data directory. It holds one index file per design document;
the size of that file is the actual index size.
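
To make the size check concrete, here is a minimal sketch that lists the
index files for one database along with their sizes in bytes. The function
name index_sizes is made up for illustration, and the sketch assumes the
index directory follows the ".<db_name>_design" pattern described above.

```python
import os


def index_sizes(data_dir, db_name):
    """Return {index_file_name: size_in_bytes} for one database's view indexes.

    Assumes the index directory is named ".<db_name>_design" and sits
    directly inside the CouchDB data directory.
    """
    index_dir = os.path.join(data_dir, "." + db_name + "_design")
    sizes = {}
    for entry in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, entry)
        # Only regular files count; skip any subdirectories.
        if os.path.isfile(path):
            sizes[entry] = os.path.getsize(path)
    return sizes
```

For example, index_sizes("/usr/local/var/lib/couchdb", "my_db_name") would
report one entry per design document, where "my_db_name" stands in for your
own database name.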


>
>> Our reduce is not key-bounded, so [id array] would end up being the
>> list of unique ids in the entire database for full-reduce.
>
> Ok, that's kind of what I suspected. Are there any plans to offer
> multiple levels of mapping? It seems like it would still fit into the
> pattern of individual updates and tree aggregation and could allow for
> fast recreation of these kinds of indexes. Just a random question /
> idea..

We're definitely into alternate query engines. Lucene is pretty
popular with CouchDB, and it is kept up to date using the same
architecture as the "built-in" map reduce. This is also how you'd hook
up SQLite or Neo4j.
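
As a rough illustration of that update pattern, the sketch below keeps a
small in-memory inverted index current by applying only the changes it has
not seen yet. It is not how the Lucene integration is actually written. The
class name, the (seq, doc_id, text) change tuples, and the whitespace
tokenizer are all assumptions made for the example. It also assumes the
external index remembers the last update sequence it processed.

```python
from collections import defaultdict


class IncrementalTextIndex:
    """Toy inverted index that is updated from an ordered stream of changes."""

    def __init__(self):
        self.last_seq = 0                 # highest sequence number already indexed
        self.postings = defaultdict(set)  # word -> set of doc ids
        self.doc_words = {}               # doc id -> words currently indexed for it

    def apply_changes(self, changes):
        """Apply (seq, doc_id, text) tuples in increasing seq order.

        text is None for a deleted document (an assumption of this sketch).
        """
        for seq, doc_id, text in changes:
            if seq <= self.last_seq:
                continue  # already indexed, nothing to do
            # Drop the postings from the previous revision of this document.
            for word in self.doc_words.pop(doc_id, set()):
                self.postings[word].discard(doc_id)
                if not self.postings[word]:
                    del self.postings[word]
            # Index the new revision unless the document was deleted.
            if text is not None:
                words = set(text.lower().split())
                self.doc_words[doc_id] = words
                for word in words:
                    self.postings[word].add(doc_id)
            self.last_seq = seq

    def search(self, word):
        """Return the sorted ids of documents containing the word."""
        return sorted(self.postings.get(word.lower(), ()))
```

Each call to apply_changes only does work proportional to the documents
that changed since the previous call, which is the property that matters
here; the full index is never rebuilt from scratch.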

>
>> The storage inefficiency you describe is likely what would force you
>> from a pure Couch to a Lucene FTI solution first, as your data begins
>> to scale.
>
> Understood.. I'll take another look at the Lucene integration again..
> how many people are using that?
>


-- 
Chris Anderson
http://jchrisa.net
http://couch.io