lucene-solr-dev mailing list archives

From "Martijn van Groningen (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Date Sun, 22 Nov 2009 22:06:39 GMT

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781232#action_12781232 ]

Martijn van Groningen edited comment on SOLR-236 at 11/22/09 10:06 PM:
-----------------------------------------------------------------------

The search results after the first search were incorrect because the scores were not
preserved in the cache. As a result, the collapsing algorithm could not properly group the
documents into collapse groups (the most relevant document per group could not be
determined), since no score information was available when the documents were retrieved
from the cache (as a DocSet in SolrIndexSearcher).

In the attached patch I made sure that the score is also saved in the cache, so the collapsing
algorithm can do its work properly when the documents are retrieved from the cache. Because
the scores are now stored with the cached documents, the in-memory size of the filterCache
will increase.
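
To illustrate why missing scores break the grouping, here is a minimal sketch in Python. It is not
the patch's code: the `Hit` type, the `collapse` function, and the sample documents are
hypothetical, and a score of `None` stands in for a cache entry that was stored without its score.

```python
from typing import Dict, List, NamedTuple, Optional


class Hit(NamedTuple):
    doc_id: int
    group: str              # value of the collapse field (hypothetical data)
    score: Optional[float]  # None models a cached hit stored without a score


def collapse(hits: List[Hit]) -> List[Hit]:
    """Keep one representative per group: the highest-scoring hit."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.group)
        # A missing score counts as 0.0, so all hits in a group tie
        # and the first one encountered stays the representative.
        if current is None or (hit.score or 0.0) > (current.score or 0.0):
            best[hit.group] = hit
    return list(best.values())


fresh = [
    Hit(1, "example.com", 0.4),
    Hit(2, "example.com", 0.9),
    Hit(3, "apache.org", 0.7),
]
cached_without_scores = [h._replace(score=None) for h in fresh]

print([h.doc_id for h in collapse(fresh)])                  # [2, 3]
print([h.doc_id for h in collapse(cached_without_scores)])  # [1, 3]
```

With scores, document 2 represents its group. Without them, document 1 is picked only because it
happens to come first, which is the kind of incorrect result described above.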

> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.5
>
>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch,
collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch,
field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch,
field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff,
field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch,
SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given field to
a single entry in the result set. Site collapsing is a special case of this, where all results
for a given web site is collapsed into one or two entries in the result set, typically with
an associated "more documents from this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many consecutive results are allowed before collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and spelling corrections are welcome ;-)
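
As a rough illustration of the three query parameters described above, the following Python
sketch builds a request URL. Only the `collapse.*` parameter names come from the issue
description; the host, the `/solr/select` path, the query term, and the `site` field are
assumptions made for the example.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and field name; adjust to the actual Solr setup.
base_url = "http://localhost:8983/solr/select"

params = {
    "q": "lucene",
    "collapse.field": "site",   # field used to group results
    "collapse.type": "normal",  # "normal" (default) or "adjacent"
    "collapse.max": 1,          # results allowed per group before collapsing
}

query_url = base_url + "?" + urlencode(params)
print(query_url)
```

The parameters only have an effect on a Solr build that has one of the attached patches applied.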

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

