incubator-couchdb-user mailing list archives

From He Shiming <heshim...@gmail.com>
Subject Re: Mapping multiple entries in an array field? (like tags)
Date Wed, 25 May 2011 03:32:09 GMT
10 choose 2 is actually 45, but I miscalculated 10 choose 3, which
should be 120. The peak for a 10-tag doc is choosing 5 tags, which
comes to 252 combinations. That does look scary.
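
For anyone who wants to check the arithmetic, here is a quick sketch in Python (not CouchDB code; the variable names are only illustrative). It counts the k-tag combinations of a 10-tag doc for every k:

```python
from math import comb  # available in Python 3.8+

n_tags = 10  # number of tags on one document, as in the discussion

# Number of distinct k-tag combinations for each possible k.
rows_per_k = {k: comb(n_tags, k) for k in range(1, n_tags + 1)}

# The k that produces the most combinations, and the total over all k.
peak_k = max(rows_per_k, key=rows_per_k.get)
total_rows = sum(rows_per_k.values())

for k, rows in rows_per_k.items():
    print(f"{n_tags} choose {k} = {rows}")
print(f"peak at k={peak_k}, total over all k = {total_rows}")
```

The total over all k is 2^10 - 1, since every non-empty subset of the tags is counted once.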

Thanks for the help. I now understand that a CouchDB view has to
index all of these combinations in a B+ tree if they are all emitted.
I'll have to remember that.
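
To make "all emitted" concrete, here is a rough simulation in Python of what such a map step would produce for one doc. A real CouchDB map function is written in JavaScript and calls emit(); the function name emitted_keys, the sample tags, and the 1,000,000-doc figure below are assumptions for illustration only.

```python
from itertools import combinations
from math import comb

def emitted_keys(tags, sizes):
    """Yield one key per tag combination, with tags in sorted order.

    Simulates the keys a view would store for a single document
    if it emitted every combination of the given sizes.
    """
    sorted_tags = sorted(set(tags))
    for k in sizes:
        for combo in combinations(sorted_tags, k):
            yield list(combo)

# Hypothetical document with 4 tags, emitting 1-, 2- and 3-tag keys.
doc_tags = ["couchdb", "view", "tags", "index"]
keys = list(emitted_keys(doc_tags, sizes=(1, 2, 3)))
print(len(keys), "rows for one 4-tag doc")

# Scaling up: 3-tag keys only, 10 tags per doc, an assumed 1,000,000 docs.
rows_in_view = 1_000_000 * comb(10, 3)
print(rows_in_view, "rows in the view")
```

Each of those rows is a separate entry in the view's index, which is where the build time and disk space go.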

On Tue, May 24, 2011 at 8:08 AM, Patrick Barnes <mrtrick@gmail.com> wrote:
> On 23/05/2011 6:08 PM, He Shiming wrote:
>>
>
> How do you get 45? '10 choose 3' is 120.
>
> Also, you don't expect to have to match a smaller number of tags? If you
> only emit 3-tag combinations, how would you match a 2-tag intersection? With
> use of startkey and endkey, you could slightly reduce the required records -
> but this would only work for the tags that always get sorted to the front.
> (eg intersection of ['a' and 'b'], not ['b' and 'f'])
>
> I'm just saying that if you have millions of documents, you'll have a view
> with more than a hundred million rows in it - so it will take a long time to
> generate, and take up a lot more disk space than the equivalent lucene view.
> At an estimate, I'd say generation speed would be much slower than lucene.
> Retrieval speed might be a little bit faster, but queries like intersections
> are something lucene is built for.
>
> -Patrick
>
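
Patrick's point about startkey and endkey can be seen in a small Python sketch. This is only a simulation of a sorted index using tuples, not a real CouchDB query; prefix_range and contains_all are hypothetical helper names. A range scan can only find keys that start with the wanted tags, so ['a', 'b'] works and ['b', 'f'] does not.

```python
from bisect import bisect_left
from itertools import combinations

# All 3-tag keys for one doc tagged a, b, c, f, in sorted (index) order.
keys = sorted(combinations(["a", "b", "c", "f"], 3))

def prefix_range(sorted_keys, prefix):
    """Contiguous range scan: keys that begin with `prefix`."""
    start = bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end][:len(prefix)] == prefix:
        end += 1
    return sorted_keys[start:end]

def contains_all(all_keys, wanted):
    """Full scan: keys that contain every wanted tag anywhere."""
    return [k for k in all_keys if set(wanted) <= set(k)]

ab_range = prefix_range(keys, ("a", "b"))  # finds every key with a and b
bf_range = prefix_range(keys, ("b", "f"))  # finds nothing
bf_all = contains_all(keys, ("b", "f"))    # the keys a range scan missed

print(ab_range)
print(bf_range)
print(bf_all)
```

The keys containing both b and f are not adjacent in sorted order, so no single key range covers them.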

-- 
Best regards,
He Shiming
