lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ramana Jelda" <ramana.je...@ciao-group.com>
Subject RE: Aggregating category hits
Date Tue, 16 May 2006 08:30:54 GMT
But this BitSet strategy is more memory consuming mainly if you have
documents in million numbers and categories in thousands.
So I preferred in my project FieldCache strategy.

Jelda

> -----Original Message-----
> From: Kapil Chhabra [mailto:kapil.chhabra@naukri.com] 
> Sent: Tuesday, May 16, 2006 7:38 AM
> To: java-user@lucene.apache.org
> Subject: Re: Aggregating category hits
> 
> Even I am doing the same in my application.
> Once in a day, all the filters [for different categories] are 
> initialized. Each time a query is fired, the Query BitSet is 
> ANDed with the BitSet of each filter. The cardinality 
> obtained is the desired output.
> @Eric: I would like to know more about the implementation 
> with DocSet in place of Bitset.
> 
> Regards,
> kapilChhabra
> 
> 
> Erik Hatcher wrote:
> >
> > On May 15, 2006, at 5:07 PM, Marvin Humphrey wrote:
> >> If you needed to know not just the total number of hits, but the 
> >> number of hits in each "category", how would you handle that?
> >>
> >> For instance, a search for "egg" would have to produce the 20 most 
> >> relevant documents for "egg", but also a list like this:
> >>
> >> Holiday & Seasonal / Easter 75
> >> Books / Cooking 52
> >> Miscellaneous 44
> >> Kitchen Collectibles 43
> >> Hobbies / Crafts 17
> >> [...]
> >>
> >> It seems to me that you'd have to retrieve each hit's 
> stored fields 
> >> and examine the contents of a "category" field. That's a lot of 
> >> overhead. Is there another way?
> >
> > My first implementation of faceted browsing uses BitSet's that get 
> > pre-loaded for each category value (each unique term in a "category"
> > field, for example). And to intersect that with an actual Query, it 
> > gets run through the QueryFilter to get its BitSet and then AND'd 
> > together with each of the category BitSet's. Sounds like a lot, but 
> > for my applications there are not tons of these BitSet's and the 
> > performance has been outstanding. Now that I'm doing more 
> with Solr, 
> > I'm beginning to leverage its amazing caching infrastructure and 
> > replacing BitSet's with DocSet's.
> >
> > Erik
> >
> >
> > 
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message