lucene-solr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "German Attanasio Ruiz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-236) Field collapsing
Date Tue, 17 Nov 2009 19:20:41 GMT

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779061#action_12779061
] 

German Attanasio Ruiz commented on SOLR-236:
--------------------------------------------

Sorting of results doesn't work properly. Next, I detail the steps I followed and the problem
I faced

I am using solr as a search engine for web pages, from which I use  a field named "site" for
collapsing and sort over scord

Steps
After downloading the last version of solr "solr-2009-11-15" and applying the patch "field-collapse-5.patch
2009-11-15 08:55 PM Martijn van Groningen 239 kB"

STEP 1 - I make a search using fieldcollapsing and the result is correct, the number with
greatest scord is 0.477
STEP 2 - I make the same search and the fieldcollapsing throws other result with scord 0.17,
the (correct) result of step 1 does not appear again

Possible problem
Step 1 stores the document in the cache for future searches
at Step 2 the search is don over the cache and does not find the previously stored document

Possible solution
I believe that the problem is in the storing of the document in the cache since if we make
step 2 again we have the same result and the document with scord of 0.17 is not removed from
the results, the only result removed is the document with scord 0.477

Conclusion
Documents are not sorted properly when using "fieldcollapsing + solrcache", that is when documents
stored in solr cache are required

> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.5
>
>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch,
collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch,
field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch,
field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff,
field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch,
SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch include a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given field to
a single entry in the result set. Site collapsing is a special case of this, where all results
for a given web site is collapsed into one or two entries in the result set, typically with
an associated "more documents from this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation add 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message