10 choose 2 is actually 45, but I miscalculated 10 choose 3, which
should be 120. The peak for a 10-tag doc is 10 choose 5, which comes
to 252. That does look scary.
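
For anyone who wants to double-check the arithmetic, here is a minimal
sketch in Python (plain arithmetic, not CouchDB code). It assumes
Python 3.8+ for math.comb, and the names N_TAGS and counts are only
for illustration.

```python
import math

N_TAGS = 10  # tags per document, as in the example above

# Number of k-tag combinations for each k from 1 to N_TAGS.
counts = {k: math.comb(N_TAGS, k) for k in range(1, N_TAGS + 1)}

for k, c in counts.items():
    print(f"{N_TAGS} choose {k} = {c}")

# Rows per document if every non-empty combination were emitted.
total = sum(counts.values())
print("total per document:", total)
```

The total is 2^10 - 1 = 1023, since every non-empty subset of the 10
tags is counted exactly once.
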
Thanks for the help. I now understand that a CouchDB view has to
index all of these combinations in a B+ tree if they are all emitted.
I'll have to remember that.
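
To make the index growth concrete, here is a rough Python sketch of
the keys that would end up in the view for one document if every
sorted tag combination were emitted. This only simulates the idea:
a real CouchDB map function would be JavaScript calling emit() once
per key, and the function name emitted_keys, the max_size parameter
and the sample tags are all hypothetical.

```python
from itertools import combinations

def emitted_keys(tags, max_size=None):
    """Return every sorted tag combination of 1..max_size tags."""
    ordered = sorted(set(tags))  # sort so each key has a stable order
    limit = len(ordered) if max_size is None else min(max_size, len(ordered))
    keys = []
    for size in range(1, limit + 1):
        keys.extend(list(c) for c in combinations(ordered, size))
    return keys

doc_tags = ["c", "a", "b", "d"]   # sample document with 4 tags
keys = emitted_keys(doc_tags)
print(len(keys))                  # one view row per key
```

Each key would be one row in the view, so capping the combination
size (max_size here) is the only thing that keeps the row count per
document from approaching 2^n.
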
On Tue, May 24, 2011 at 8:08 AM, Patrick Barnes <mrtrick@gmail.com> wrote:
> On 23/05/2011 6:08 PM, He Shiming wrote:
>>
>
> How do you get 45? '10 choose 3' is 120.
>
> Also, you don't expect to have to match a smaller number of tags? If you
> only emit 3-tag combinations, how would you match a 2-tag intersection? With
> use of startkey and endkey, you could slightly reduce the required records,
> but this would only work for the tags that always get sorted to the front.
> (e.g. intersection of ['a' and 'b'], not ['b' and 'f'])
>
> I'm just saying that if you have millions of documents, you'll have a view
> with more than a hundred million rows in it, so it will take a long time to
> generate and take up a lot more disk space than the equivalent Lucene view.
> At an estimate, I'd say generation speed would be much slower than Lucene.
> Retrieval speed might be a little bit faster, but queries like intersections
> are something Lucene is built for.
>
> Patrick
>

Best regards,
He Shiming
