lucene-dev mailing list archives

From "Martijn van Groningen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3097) Post grouping faceting
Date Sat, 04 Jun 2011 11:05:48 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044269#comment-13044269
] 

Martijn van Groningen commented on LUCENE-3097:
-----------------------------------------------

bq. Also, this patch won't properly count facets if the field ever has multiple values within one group
That is true. If facet values differ within a group, the current collectors in the patch won't notice.
For the case Bill is describing, facets work as expected with the current patch.

bq. But maybe that's fine for the first go.... progress not perfection.
Definitely! But to continue I think we need the facet module.

> Post grouping faceting
> ----------------------
>
>                 Key: LUCENE-3097
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3097
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>            Assignee: Martijn van Groningen
>            Priority: Minor
>             Fix For: 3.3
>
>         Attachments: LUCENE-3097.patch
>
>
> This issue focuses on implementing post grouping faceting.
> * How to handle multivalued fields. What field value to show with the facet.
> * What the facet counts should be based on
> ** Facet counts can be based on the normal documents. Ungrouped counts.
> ** Facet counts can be based on the groups. Grouped counts.
> ** Facet counts can be based on the combination of group value and facet value. Matrix counts.
> And probably more implementation options.
> The first two methods are implemented in the SOLR-236 patch. For the first option it calculates a DocSet based on the individual documents from the query result. For the second option it calculates a DocSet from the most relevant documents of each group. Once the DocSet is computed, the FacetComponent and StatsComponent use it to create facets and statistics.
> The last one is a bit more complex. I think it is best explained with an example. Let's say we search on travel offers:
> ||hotel||departure_airport||duration||
> |Hotel a|AMS|5|
> |Hotel a|DUS|10|
> |Hotel b|AMS|5|
> |Hotel b|AMS|10|
> If we group by hotel and have a facet for airport, most end users expect (in my experience, of course) the following airport facet:
> AMS: 2
> DUS: 1
> The above result can't be achieved by the first two methods. You either get counts AMS:3 and DUS:1, or 1 for both airports.
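
The three counting strategies in the description can be tried out on the four example rows. What follows is only an illustrative Python sketch, not code from the patch or from SOLR-236: the function names are hypothetical, and "most relevant document of a group" is approximated by taking the first row of each hotel in an assumed relevance order (longest duration first).

```python
from collections import Counter

# Rows from the travel-offer example: (hotel, departure_airport, duration).
OFFERS = [
    ("Hotel a", "AMS", 5),
    ("Hotel a", "DUS", 10),
    ("Hotel b", "AMS", 5),
    ("Hotel b", "AMS", 10),
]


def ungrouped_counts(offers):
    """Count every matching document once per facet value."""
    return Counter(airport for _hotel, airport, _duration in offers)


def grouped_counts(offers_by_relevance):
    """Count only the first (most relevant) document of each group.

    Assumption: the input is already sorted by relevance, best first.
    """
    head = {}
    for hotel, airport, _duration in offers_by_relevance:
        head.setdefault(hotel, airport)  # keep only the first row per hotel
    return Counter(head.values())


def matrix_counts(offers):
    """Count each distinct (group value, facet value) pair once."""
    pairs = {(hotel, airport) for hotel, airport, _duration in offers}
    return Counter(airport for _hotel, airport in pairs)


# Hypothetical relevance order: longest duration first (stable sort).
BY_RELEVANCE = sorted(OFFERS, key=lambda row: -row[2])

print(dict(ungrouped_counts(OFFERS)))
print(dict(grouped_counts(BY_RELEVANCE)))
print(dict(matrix_counts(OFFERS)))
```

With this ordering the ungrouped counts are AMS: 3 and DUS: 1, the grouped counts are 1 for both airports, and only the matrix counts give AMS: 2 and DUS: 1. The grouped result depends on which document represents each hotel, which is why it cannot be relied on to produce the expected facet.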

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


