lucene-solr-dev mailing list archives

From Marc Sturlese <marc.sturl...@gmail.com>
Subject Re: [jira] Issue Comment Edited: (SOLR-236) Field collapsing
Date Mon, 07 Dec 2009 15:12:23 GMT

><lst name="collapse_counts">
>   <str name="field">cat</str>
>    <lst name="results">
>        <lst name="009">
>            <str name="fieldValue">hard</str>
>           <int name="collapseCount">1</int>
>            <result name="collapsedDocs" numFound="1" start="0">
>                 <doc>
>                    <long name="id">008</long>
>                    <str name="content">aaa aaa</str>
>                    <str name="col">ccc</str>
>                 </doc>
>            </result>
>        </lst>
>        ...
>    </lst>
></lst>
I see, it looks like I am applying the patch incorrectly somehow.
This is the complete collapse_counts response I am getting:
<lst name="collapse_counts">
  <str name="field">col</str>
  <lst name="results">
    <lst>
      <int name="collapseCount">1</int>
      <int name="collapseCount">1</int>
      <int name="collapseCount">1</int>
      <str name="fieldValue">bbb</str>
      <str name="fieldValue">ccc</str>
      <str name="fieldValue">xxx</str>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc>
          <long name="id">2</long>
          <str name="content">aaa aaa</str>
          <str name="col">bbb</str>
        </doc>
      </result>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc>
          <long name="id">8</long>
          <str name="content">aaa aaa aaa sd</str>
          <str name="col">ccc</str>
       </doc>
      </result>
      <result name="collapsedDocs" numFound="4" start="0">
        <doc>
          <long name="id">12</long>
          <str name="content">aaa aaa aaa v</str>
          <str name="col">xxx</str>
        </doc>
      </result>
    </lst>
  </lst>
</lst>

As you can see, I am getting a <lst> tag with no name. As I understood what
you told me, I should be getting as many lst tags as there are collapsed
groups, and the name attribute of each lst should be the unique field value
of the group's head document. So, if the patch was applied correctly, the
response should look like:

<lst name="collapse_counts">
  <str name="field">col</str>
  <lst name="results">
    <lst name="354"> (the head value of the collapsed group)
      <int name="collapseCount">1</int>
      <str name="fieldValue">bbb</str>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc>
          <long name="id">2</long>
          <str name="content">aaa aaa</str>
          <str name="col">bbb</str>
        </doc>
      </result>
    </lst>
    <lst name="654">
      <int name="collapseCount">1</int>
      <str name="fieldValue">ccc</str>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc>
          <long name="id">8</long>
          <str name="content">aaa aaa aaa sd</str>
          <str name="col">ccc</str>
       </doc>
      </result>
    </lst>
    <lst name="654">
      <int name="collapseCount">1</int>
      <str name="fieldValue">xxx</str>
      <result name="collapsedDocs" numFound="4" start="0">
        <doc>
          <long name="id">12</long>
          <str name="content">aaa aaa aaa v</str>
          <str name="col">xxx</str>
        </doc>
      </result>
    </lst>
  </lst>
</lst>
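
If that is the intended layout, a client could look up the collapsed documents by head document id. Below is a rough sketch of that idea in Python, not something that ships with the patch. The function name parse_collapse_counts and the trimmed SAMPLE response are made up for illustration, and the code assumes each group <lst> carries the head id in its name attribute, as described above. It raises an error for an unnamed group like the one I am currently getting.

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the layout described above (head ids 354 and 654 are
# illustrative values, not real output from the patch).
SAMPLE = """<lst name="collapse_counts">
  <str name="field">col</str>
  <lst name="results">
    <lst name="354">
      <int name="collapseCount">1</int>
      <str name="fieldValue">bbb</str>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc><long name="id">2</long><str name="col">bbb</str></doc>
      </result>
    </lst>
    <lst name="654">
      <int name="collapseCount">1</int>
      <str name="fieldValue">ccc</str>
      <result name="collapsedDocs" numFound="1" start="0">
        <doc><long name="id">8</long><str name="col">ccc</str></doc>
      </result>
    </lst>
  </lst>
</lst>"""


def parse_collapse_counts(xml_text):
    """Map each collapse head id to its field value, count and collapsed doc ids.

    Hypothetical helper: assumes every group <lst> under "results" is named
    after the head document id.
    """
    root = ET.fromstring(xml_text)
    results = root.find("lst[@name='results']")
    groups = {}
    for group in results.findall("lst"):
        head_id = group.get("name")
        if head_id is None:
            # This is the unnamed <lst> case: no way to tell the head document.
            raise ValueError("group <lst> has no name attribute (no head id)")
        doc_ids = [
            doc.findtext("long[@name='id']")
            for doc in group.findall("result[@name='collapsedDocs']/doc")
        ]
        groups[head_id] = {
            "fieldValue": group.findtext("str[@name='fieldValue']"),
            "collapseCount": int(group.findtext("int[@name='collapseCount']")),
            "collapsedDocIds": doc_ids,
        }
    return groups


if __name__ == "__main__":
    for head_id, info in parse_collapse_counts(SAMPLE).items():
        print(head_id, info)
```

With this shape there is no need to encode the head id into fieldValue, since the group name already links the head document to its collapsed documents.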

Is this what the response looks like when you use the patch?
Thanks in advance


Martijn v Groningen wrote:
> 
> Hi Marc,
> 
> I'm not sure if I follow you completely, but the example you gave is
> not complete. I'm missing a few tags in your example. Let's assume the
> following response, which the latest patches produce.
> 
> <lst name="collapse_counts">
>     <str name="field">cat</str>
>     <lst name="results">
>         <lst name="009">
>             <str name="fieldValue">hard</str>
>             <int name="collapseCount">1</int>
>             <result name="collapsedDocs" numFound="1" start="0">
>                  <doc>
>                     <long name="id">008</long>
>                     <str name="content">aaa aaa</str>
>                     <str name="col">ccc</str>
>                  </doc>
>             </result>
>         </lst>
>         ...
>     </lst>
> </lst>
> 
> The result list contains collapse groups. The names of the child
> elements are the collapse head ids. Everything that falls under the
> collapse head belongs to that collapse group, and thus adding the document
> head id to the field value is unnecessary. In the above example, the
> document with id 009 is the document head of the document with id 008.
> The document with id 009 should be displayed in the search result.
> 
> From what you have said, it seems that you properly configured the patch.
> 
> Martijn
> 
> 2009/12/7 Marc Sturlese <marc.sturlese@gmail.com>:
>>
>> Hey there, I have been testing the latest patch and I think either I am
>> missing something or the way the collapsed documents are shown with
>> adjacent collapsing can sometimes be confusing.
>> I am using the patch by replacing queryComponent with collapseComponent
>> (not using both at the same time):
>>  <searchComponent name="query"
>> class="org.apache.solr.handler.component.CollapseComponent">
>> What I have noticed is this: imagine you get these results in the search:
>> doc1:
>>   id:001
>>   collapseField:ccc
>> doc2:
>>   id:002
>>   collapseField:aaa
>> doc3:
>>   id:003
>>   collapseField:ccc
>> doc4:
>>   id:004
>>   collapseField:bbb
>>
>> And in the collapse_counts you get:
>> <int name="collapseCount">1</int>
>> <str name="fieldValue">ccc</str>
>> <result name="collapsedDocs" numFound="1" start="0">
>> <doc>
>> <long name="id">008</long>
>> <str name="content">aaa aaa</str>
>> <str name="col">ccc</str>
>> </doc>
>> </result>
>>
>> Now, how can I know the head document of doc 008? Both 001 and 003 could
>> be... Wouldn't it make sense to connect the uniqueField with the
>> collapsed documents in some way?
>>
>> Adding something to collapse_counts like:
>> <int name="collapseCount">1</int>
>> <str name="fieldValue">ccc</str>
>> <str name="uniqueFieldId">003</str>
>>
>> I currently have hacked FieldValueCountCollapseCollectorFactory to
>> return:
>> <str name="fieldValue">ccc#003</str>
>> but this response looks dirty...
>>
>> As I said, maybe I am misunderstanding something and this can be known in
>> some way. In that case, can someone tell me how?
>> Thanks in advance
>>
>>
>>
>>
>>
>>
>> JIRA jira@apache.org wrote:
>>>
>>>
>>>     [
>>> https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783484#action_12783484
>>> ]
>>>
>>> Martijn van Groningen edited comment on SOLR-236 at 11/29/09 9:56 PM:
>>> ----------------------------------------------------------------------
>>>
>>> I have attached a new patch that has the following changes:
>>> # Added caching for the field collapse functionality. Check the [solr
>>> wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure
>>> field-collapsing with caching.
>>> # Removed the collapse.max parameter (collapse.threshold must be used
>>> instead). It was deprecated for a long time.
>>>
>>>       was (Author: martijn):
>>>     I have attached a new patch that has the following changes:
>>> # Added caching for the field collapse functionality. Check the [solr
>>> wiki|http://wiki.apache.org/solr/FieldCollapsing] for how to configure
>>> the
>>> field-collapsing with caching.
>>> # Removed the collapse.max parameter (collapse.threshold must be used
>>> instead). It was deprecated for a long time.
>>>
>>>> Field collapsing
>>>> ----------------
>>>>
>>>>                 Key: SOLR-236
>>>>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>>>>             Project: Solr
>>>>          Issue Type: New Feature
>>>>          Components: search
>>>>    Affects Versions: 1.3
>>>>            Reporter: Emmanuel Keller
>>>>             Fix For: 1.5
>>>>
>>>>         Attachments: collapsing-patch-to-1.3.0-dieter.patch,
>>>> collapsing-patch-to-1.3.0-ivan.patch,
>>>> collapsing-patch-to-1.3.0-ivan_2.patch,
>>>> collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch,
>>>> field-collapse-4-with-solrj.patch, field-collapse-5.patch,
>>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>>> field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch,
>>>> field-collapse-solr-236-2.patch, field-collapse-solr-236.patch,
>>>> field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch,
>>>> field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff,
>>>> field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff,
>>>> quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch,
>>>> SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
>>>> solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>>>>
>>>>
>>>> This patch includes a new feature called "Field collapsing".
>>>> "Used in order to collapse a group of results with similar value for a
>>>> given field to a single entry in the result set. Site collapsing is a
>>>> special case of this, where all results for a given web site is
>>>> collapsed
>>>> into one or two entries in the result set, typically with an associated
>>>> "more documents from this site" link. See also Duplicate detection."
>>>> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
>>>> The implementation adds 3 new query parameters (SolrParams):
>>>> "collapse.field" to choose the field used to group results
>>>> "collapse.type" normal (default value) or adjacent
>>>> "collapse.max" to select how many continuous results are allowed before
>>>> collapsing
>>>> TODO (in progress):
>>>> - More documentation (on source code)
>>>> - Test cases
>>>> Two patches:
>>>> - "field_collapsing.patch" for current development version
>>>> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
>>>> P.S.: Feedback and misspelling correction are welcome ;-)
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> -
>>> You can reply to this email to add a comment to the issue online.
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/-jira--Created%3A-%28SOLR-236%29-Field-collapsing-tp10440315p26674651.html
>> Sent from the Solr - Dev mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> Met vriendelijke groet,
> 
> Martijn van Groningen
> 
> 

-- 
View this message in context: http://old.nabble.com/-jira--Created%3A-%28SOLR-236%29-Field-collapsing-tp10440315p26678606.html
Sent from the Solr - Dev mailing list archive at Nabble.com.

