lucene-dev mailing list archives

From "Martijn van Groningen (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3097) Post grouping faceting
Date Sat, 12 Nov 2011 08:28:51 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149006#comment-13149006 ]

Martijn van Groningen commented on LUCENE-3097:
-----------------------------------------------

Well, the code that got committed only creates facets for the most relevant document per group.
This isn't really grouped faceting. To implement that, we need to modify Solr's faceting code
/ the facet module code. So I think we can close this one and open a Solr issue to implement grouped
facets in Solr (I do have some code for this, but it isn't perfect...), and maybe also an issue
to add this to the faceting module.
                
> Post grouping faceting
> ----------------------
>
>                 Key: LUCENE-3097
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3097
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/grouping
>            Reporter: Martijn van Groningen
>            Assignee: Martijn van Groningen
>            Priority: Minor
>             Fix For: 3.5, 4.0
>
>         Attachments: LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch, LUCENE-3097.patch,
LUCENE-3097.patch, LUCENE-30971.patch
>
>
> This issue focuses on implementing post grouping faceting.
> * How to handle multivalued fields. What field value to show with the facet.
> * What the facet counts should be based on
> ** Facet counts can be based on the normal documents. Ungrouped counts.
> ** Facet counts can be based on the groups. Grouped counts.
> ** Facet counts can be based on the combination of group value and facet value. Matrix counts.
> And probably more implementation options.
> The first two methods are implemented in the SOLR-236 patch. For the first option it
> calculates a DocSet based on the individual documents from the query result. For the second
> option it calculates a DocSet for all the most relevant documents of a group. Once the DocSet
> is computed, the FacetComponent and StatsComponent use that DocSet to create facets and
> statistics.
> The last one is a bit more complex. I think it is best explained with an example. Let's
> say we search on travel offers:
> ||hotel||departure_airport||duration||
> |Hotel a|AMS|5|
> |Hotel a|DUS|10|
> |Hotel b|AMS|5|
> |Hotel b|AMS|10|
> If we group by hotel and have a facet for airport, most end users expect (in
> my experience, of course) the following airport facet:
> AMS: 2
> DUS: 1
> The above result can't be achieved by the first two methods. You either get counts AMS: 3
> and DUS: 1, or 1 for both airports.
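
The difference between ungrouped counts and matrix counts in the travel-offer example can be sketched in plain Java. This is only an illustration under stated assumptions: the class GroupedFacetSketch and its methods are hypothetical names, the documents are modelled as String arrays rather than Lucene documents, and nothing here uses the Lucene or Solr APIs or reflects the committed patch. The second method (one most relevant document per group) is left out because it depends on relevance ranking, which this sketch does not model.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Lucene or Solr.
final class GroupedFacetSketch {

    /** Ungrouped counts: every matching document contributes to its facet value. */
    static Map<String, Integer> ungroupedCounts(String[][] docs, int facetCol) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(doc[facetCol], 1, Integer::sum);
        }
        return counts;
    }

    /** Matrix counts: each distinct (group value, facet value) pair is counted once. */
    static Map<String, Integer> matrixCounts(String[][] docs, int groupCol, int facetCol) {
        Set<List<String>> seenPairs = new HashSet<>();
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            List<String> pair = List.of(doc[groupCol], doc[facetCol]);
            // Set.add returns false if this group/facet combination was already counted.
            if (seenPairs.add(pair)) {
                counts.merge(doc[facetCol], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Columns: hotel, departure_airport, duration (the table from the issue).
        String[][] offers = {
            {"Hotel a", "AMS", "5"},
            {"Hotel a", "DUS", "10"},
            {"Hotel b", "AMS", "5"},
            {"Hotel b", "AMS", "10"},
        };
        System.out.println("ungrouped: " + ungroupedCounts(offers, 1)); // {AMS=3, DUS=1}
        System.out.println("matrix:    " + matrixCounts(offers, 0, 1)); // {AMS=2, DUS=1}
    }
}
```

Grouping by hotel (column 0) and faceting on departure_airport (column 1) yields AMS: 2 and DUS: 1 for the matrix variant, because the two AMS offers of Hotel b collapse into a single hotel/airport pair.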
