lucene-solr-dev mailing list archives

From Martijn v Groningen <martijn.is.h...@gmail.com>
Subject Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Date Mon, 07 Dec 2009 12:55:25 GMT
Hi Marc,

I'm not sure I follow you completely, but the example you gave is
not complete: a few tags are missing. Let's assume the following
response, which the latest patches produce.

<lst name="collapse_counts">
    <str name="field">cat</str>
    <lst name="results">
        <lst name="009">
            <str name="fieldValue">hard</str>
            <int name="collapseCount">1</int>
            <result name="collapsedDocs" numFound="1" start="0">
                 <doc>
                    <long name="id">008</long>
                    <str name="content">aaa aaa</str>
                    <str name="col">ccc</str>
                 </doc>
            </result>
        </lst>
        ...
    </lst>
</lst>

The results list contains the collapse groups. The name of each child
element is the id of a collapse head. Everything that falls under a
collapse head belongs to that collapse group, so adding the head
document id to the field value is unnecessary. In the above example,
the document with id 009 is the head document of the document with id
008. The document with id 009 should be displayed in the search result.
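
To make the lookup concrete, here is a minimal client-side sketch in
Python. It assumes the response has exactly the shape shown above; the
function name heads_by_collapsed_id and the id_field parameter are made
up for illustration and are not part of the patch.

```python
import xml.etree.ElementTree as ET

# Sample copied from the response above; the elided "..." groups are left out.
SAMPLE = """
<lst name="collapse_counts">
    <str name="field">cat</str>
    <lst name="results">
        <lst name="009">
            <str name="fieldValue">hard</str>
            <int name="collapseCount">1</int>
            <result name="collapsedDocs" numFound="1" start="0">
                <doc>
                    <long name="id">008</long>
                    <str name="content">aaa aaa</str>
                    <str name="col">ccc</str>
                </doc>
            </result>
        </lst>
    </lst>
</lst>
"""


def heads_by_collapsed_id(xml_text, id_field="id"):
    """Map each collapsed document id to the id of its collapse head.

    Hypothetical helper: assumes the unique key field is named `id_field`
    and that each group under "results" is named after its head id.
    """
    root = ET.fromstring(xml_text)
    mapping = {}
    # Each <lst> directly under "results" is one collapse group.
    for group in root.findall("./lst[@name='results']/lst"):
        head_id = group.get("name")
        for doc in group.findall("./result[@name='collapsedDocs']/doc"):
            for field in doc:
                if field.get("name") == id_field:
                    mapping[field.text] = head_id
    return mapping


if __name__ == "__main__":
    print(heads_by_collapsed_id(SAMPLE))
```

With the sample above, document 008 maps to head 009, so no extra
uniqueFieldId entry is needed to find the head.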

From what you have said, it seems that you configured the patch properly.

Martijn

2009/12/7 Marc Sturlese <marc.sturlese@gmail.com>:
>
> Hey there, I have been testing the latest patch and I think either I am
> missing something or the way the collapsed documents are shown with
> adjacent collapsing can sometimes be confusing.
> I am using the patch by replacing queryComponent with collapseComponent
> (not using both at the same time):
>  <searchComponent name="query"
> class="org.apache.solr.handler.component.CollapseComponent">
> What I have noticed is this. Imagine you get these results in the search:
> doc1:
>   id:001
>   collapseField:ccc
> doc2:
>   id:002
>   collapseField:aaa
> doc3:
>   id:003
>   collapseField:ccc
> doc4:
>   id:004
>   collapseField:bbb
>
> And in the collapse_counts you get:
> <int name="collapseCount">1</int>
> <str name="fieldValue">ccc</str>
> <result name="collapsedDocs" numFound="1" start="0">
> <doc>
> <long name="id">008</long>
> <str name="content">aaa aaa</str>
> <str name="col">ccc</str>
> </doc>
> </result>
>
> Now, how can I know the head document of doc 008? Both 001 and 003 could
> be... Wouldn't it make sense to connect the uniqueField with the
> collapsed documents in some way?
>
> Adding something to collapse_counts like:
> <int name="collapseCount">1</int>
> <str name="fieldValue">ccc</str>
> <str name="uniqueFieldId">003</str>
>
> I currently have hacked FieldValueCountCollapseCollectorFactory to return:
> <str name="fieldValue">ccc#003</str>
> but this response looks dirty...
>
> As I said, maybe I am misunderstanding something and this can be known in
> some way. In that case, can someone tell me how?
> Thanks in advance
>
>
>
>
>
>
> JIRA jira@apache.org wrote:
>>
>>
>>     [
>> https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783484#action_12783484
>> ]
>>
>> Martijn van Groningen edited comment on SOLR-236 at 11/29/09 9:56 PM:
>> ----------------------------------------------------------------------
>>
>> I have attached a new patch that has the following changes:
>> # Added caching for the field collapse functionality. Check the [solr
>> wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure
>> field-collapsing with caching.
>> # Removed the collapse.max parameter (collapse.threshold must be used
>> instead). It was deprecated for a long time.
>>
>>       was (Author: martijn):
>>     I have attached a new patch that has the following changes:
>> # Added caching for the field collapse functionality. Check the [solr
>> wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure the
>> field-collapsing with caching.
>> # Removed the collapse.max parameter (collapse.threshold must be used
>> instead). It was deprecated for a long time.
>>
>>> Field collapsing
>>> ----------------
>>>
>>>                 Key: SOLR-236
>>>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>>>             Project: Solr
>>>          Issue Type: New Feature
>>>          Components: search
>>>    Affects Versions: 1.3
>>>            Reporter: Emmanuel Keller
>>>             Fix For: 1.5
>>>
>>>         Attachments: collapsing-patch-to-1.3.0-dieter.patch,
>>> collapsing-patch-to-1.3.0-ivan.patch,
>>> collapsing-patch-to-1.3.0-ivan_2.patch,
>>> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch,
>>> field-collapse-4-with-solrj.patch, field-collapse-5.patch,
>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch,
>>> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch,
>>> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff,
>>> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff,
>>> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch,
>>> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
>>> solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>>>
>>>
>>> This patch includes a new feature called "Field collapsing".
>>> "Used in order to collapse a group of results with similar value for a
>>> given field to a single entry in the result set. Site collapsing is a
>>> special case of this, where all results for a given web site is collapsed
>>> into one or two entries in the result set, typically with an associated
>>> "more documents from this site" link. See also Duplicate detection."
>>> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
>>> The implementation adds 3 new query parameters (SolrParams):
>>> "collapse.field" to choose the field used to group results
>>> "collapse.type" normal (default value) or adjacent
>>> "collapse.max" to select how many continuous results are allowed before
>>> collapsing
>>> TODO (in progress):
>>> - More documentation (on source code)
>>> - Test cases
>>> Two patches:
>>> - "field_collapsing.patch" for current development version
>>> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
>>> P.S.: Feedback and misspelling correction are welcome ;-)
>>
>> --
>> This message is automatically generated by JIRA.
>> -
>> You can reply to this email to add a comment to the issue online.
>>
>>
>>
>
> --
> View this message in context: http://old.nabble.com/-jira--Created%3A-%28SOLR-236%29-Field-collapsing-tp10440315p26674651.html
> Sent from the Solr - Dev mailing list archive at Nabble.com.
>
>



-- 
Kind regards,

Martijn van Groningen
