lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Aggregating category hits
Date Mon, 15 May 2006 22:45:01 GMT

On May 15, 2006, at 5:07 PM, Marvin Humphrey wrote:
> If you needed to know not just the total number of hits, but the  
> number of hits in each "category", how would you handle that?
>
> For instance, a search for "egg" would have to produce the 20 most  
> relevant documents for "egg", but also a list like this:
>
>     Holiday & Seasonal / Easter     75
>     Books / Cooking                 52
>     Miscellaneous                   44
>     Kitchen Collectibles            43
>     Hobbies / Crafts                17
>     [...]
>
> It seems to me that you'd have to retrieve each hit's stored fields  
> and examine the contents of a "category" field.  That's a lot of  
> overhead.  Is there another way?

My first implementation of faceted browsing uses BitSets that get
pre-loaded for each category value (each unique term in a "category"
field, for example).  To intersect those with an actual Query, the
Query is run through a QueryFilter to get its BitSet, which is then
ANDed with each of the category BitSets; the number of bits left set
in each result is the hit count for that category.  That sounds like
a lot of work, but for my applications there are not many of these
BitSets and the performance has been outstanding.  Now that I'm doing
more with Solr, I'm beginning to leverage its amazing caching
infrastructure and replacing the BitSets with DocSets.
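
The counting step described above can be sketched roughly as follows.
This is only an illustration, not the actual implementation: it uses
plain java.util.BitSet with no Lucene dependency, the query's BitSet is
supplied directly (standing in for what a QueryFilter would produce),
and the class name, method names, and document numbers are all made up
for the example.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class CategoryCounts {

    /**
     * For each category, counts how many of the query's matching documents
     * fall in that category. Bit i of every BitSet stands for document i.
     */
    static Map<String, Integer> countHits(BitSet queryBits,
                                          Map<String, BitSet> categoryBits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : categoryBits.entrySet()) {
            // Clone first: and() modifies its receiver, and the pre-loaded
            // category BitSet must stay intact for later queries.
            BitSet both = (BitSet) entry.getValue().clone();
            both.and(queryBits);
            counts.put(entry.getKey(), both.cardinality());
        }
        return counts;
    }

    /** Hypothetical helper: builds a BitSet from a list of document numbers. */
    static BitSet bits(int... docIds) {
        BitSet set = new BitSet();
        for (int id : docIds) {
            set.set(id);
        }
        return set;
    }

    public static void main(String[] args) {
        // Pre-loaded once per category value (invented data).
        Map<String, BitSet> categories = new LinkedHashMap<>();
        categories.put("Books / Cooking", bits(0, 2, 5));
        categories.put("Miscellaneous", bits(1, 3));
        categories.put("Kitchen Collectibles", bits(4, 6, 7));

        // Documents matching the query (invented data).
        BitSet queryBits = bits(0, 1, 2, 4, 7);

        System.out.println(countHits(queryBits, categories));
    }
}
```

The cost per query is one clone and one AND per category, which is why
this stays cheap as long as the number of distinct category values is
modest.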

	Erik

