lucene-solr-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: [jira] Created: (SOLR-44) Basic Facet Count support
Date Thu, 31 Aug 2006 02:01:39 GMT

On Aug 29, 2006, at 9:57 PM, Hoss Man (JIRA) wrote:
> First pass at basic facet support.  initial patch includes  
> utilities for use in RequestHandlers, and usage in  
> StandardRequestHandler (DisMax should use SolrParams before  
> attempting to add this)
>
> Basic idea is that:
>   * facet=true indicates facet counts are desired.
>   * facetField=inStock indicates we want a count of the matching  
> docs for each value in the field inStock
>   * facetQuery=title:ipod indicates we want the count of matching  
> docs also in the set of docs matching query title:ipod
>   * if user wants to apply a facet constraint on subsequent  
> queries, they can add an "fq" (filter query) param (support for  
> this was added to StandardRequestHandler as well)
>
> Things marked TODO...
>   * add support for per field facetLimit indicating that only the  
> top N items in each facetField should be returned
>   * add support for a per field facetZero boolean indicating that  
> there is no reason to bother returning counts of 0 for facetFields  
> (some clients may want to know the list, others don't care)
>   * potential optimization when using facetLimit to cache the terms  
> with the highest docFreq and see if they provide all the info we  
> need without doing a full TermEnum
>
> I'd like to get some feedback on the overall approach and params  
> before I proceed much further.
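
For readers following along, the parameters proposed in the quoted message could be assembled into a request roughly like this. It is only a sketch: facet, facetField, facetQuery and fq come from the message above, while the q value, the example filter inStock:true, the helper name build_facet_query and the idea of repeating facetField once per field are assumptions made for illustration.

```python
from urllib.parse import urlencode, parse_qs


def build_facet_query(q, facet_fields=(), facet_queries=(), filters=()):
    """Build a URL query string using the parameter names from the proposal.

    Hypothetical helper: it only encodes parameters and does not talk to Solr.
    """
    params = [("q", q), ("facet", "true")]  # facet=true asks for facet counts
    # One facetField entry per field whose values should be counted (assumed repeatable).
    params += [("facetField", field) for field in facet_fields]
    # One facetQuery entry per arbitrary query whose overlap should be counted.
    params += [("facetQuery", query) for query in facet_queries]
    # fq narrows later requests once the user picks a facet constraint.
    params += [("fq", flt) for flt in filters]
    return urlencode(params)


query_string = build_facet_query(
    "ipod",
    facet_fields=["inStock"],
    facet_queries=["title:ipod"],
    filters=["inStock:true"],
)
decoded = parse_qs(query_string)
print(query_string)
```

The resulting string would be appended to whatever select URL the request handler is mounted on, which the message does not specify.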

Wow, Hoss.  Very cool.  I might be able to just rip out all the  
custom work I've done and go with a pure Solr build one of these days :)

One thing that my facet code does is compute the count of all documents  
that have _no_ terms in a particular field, and return that as an  
<unspecified> count as well.  It does this by collecting every document  
found into a DocSet as it iterates through all terms for the field, and  
then .andNot'ing that set away from an all-docs query.  Not pretty, but  
it works, and it's quite fast.
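
The approach described above might look something like the following sketch. It uses plain Python sets as a stand-in for Solr's DocSet, so set difference plays the role of andNot. The function name, the term_to_docs mapping and the sample document ids are all hypothetical, and intersecting the leftover documents with the current result set is one reading of the description rather than the actual code.

```python
def facet_counts_with_unspecified(all_docs, matching_docs, term_to_docs):
    """Count matching docs per term, plus matching docs with no term in the field.

    all_docs      -- ids of every document in the index (the "all docs query")
    matching_docs -- ids of documents matching the user's query
    term_to_docs  -- hypothetical mapping: term -> ids of docs containing it
    """
    counts = {}
    seen = set()  # every document that has at least one term in the field
    for term, docs in term_to_docs.items():
        counts[term] = len(matching_docs & docs)
        seen |= docs  # accumulate as we iterate through all terms
    # "andNot" the accumulated set away from all docs: what is left has no terms.
    no_value = all_docs - seen
    unspecified = len(matching_docs & no_value)
    return counts, unspecified


# Tiny made-up index: docs 4 and 5 have no value in the inStock field.
all_docs = {0, 1, 2, 3, 4, 5}
in_stock_terms = {"true": {0, 1, 2}, "false": {3}}
matching = {1, 2, 3, 4}

counts, unspecified = facet_counts_with_unspecified(all_docs, matching, in_stock_terms)
print(counts, unspecified)
```

The extra cost is one union per term plus a single difference at the end, which fits the observation that the trick is not pretty but fast.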

Do you think a catch-all facet count could be added into your  
implementation somehow?

	Erik


