lucene-solr-dev mailing list archives

From "Thomas Traeger (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-236) Field collapsing
Date Wed, 02 Sep 2009 20:51:33 GMT

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750658#action_12750658 ]

Thomas Traeger commented on SOLR-236:
-------------------------------------

Hi Martijn,

I also thought about changing the response format and introducing two new parameters, "collapse.response"
and "collapse.response.fl".

What do you think of these values for "collapse.response":

- "counts": the default and the current behavior, possibly keeping the current response format
  for backward compatibility
- "docs": returns the counts and the collapsed docs inside the collapse response (essentially,
  instead of removing a doc from the result, it is moved from the result into the collapse response)

The parameter "collapse.response.fl" can be used to specify the field(s) to be returned in
the collapse response.
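
To make the two proposed parameters concrete, here is a rough client-side sketch of how a request could be assembled. Nothing here exists in any patch: the helper name `build_collapse_query` is made up for illustration, "collapse.response" and "collapse.response.fl" are only the proposal from this comment, and I am assuming "collapse.response.fl" would take a comma-separated field list the way "fl" does.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical helper, for illustration only. "collapse.response" and
# "collapse.response.fl" are the parameters proposed in this comment,
# not parameters of any released Solr version.
ALLOWED_RESPONSES = ("counts", "docs")


def build_collapse_query(field, response="counts", response_fl=None, q="*:*"):
    """Return a URL-encoded query string for a collapsed search."""
    if response not in ALLOWED_RESPONSES:
        raise ValueError("collapse.response must be one of %r" % (ALLOWED_RESPONSES,))
    params = [
        ("q", q),
        ("collapse.field", field),
        ("collapse.response", response),
    ]
    # The field list only matters when the collapsed docs are returned.
    # Assumption: a comma-separated list, mirroring the standard "fl" parameter.
    if response == "docs" and response_fl:
        params.append(("collapse.response.fl", ",".join(response_fl)))
    return urlencode(params)


query = build_collapse_query("venue", response="docs", response_fl=["id", "name"])
# Decode it again to show which parameters were sent.
print(parse_qs(query))
```

With the default "counts" value the helper sends no "collapse.response.fl" at all, which matches the idea that the default keeps today's behavior.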

So, starting with your proposal, the new collapse response format might look like this:

{code:xml}
<lst name="collapse_counts">
    <str name="field">venue</str>
    <lst name="results">
        <lst name="233238">
            <str name="fieldValue">melkweg</str>
            <int name="collapseCount">2</int>
            <lst name="collapsedDocs">
                <doc>
                    <str name="id">233239</str>
                    <str name="name">Foo Bar</str>
                    ...
                </doc>
                <doc>
                    <str name="id">233240</str>
                    <str name="name">Foo Bar 2</str>
                    ...
                </doc>
            </lst>
        </lst>
    </lst>
</lst>
{code}
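
For what it is worth, a client could read this proposed format with very little code. The following is only a sketch under the assumption that the response looks exactly like the example above; the function name `parse_collapse_response` is hypothetical, and the fields elided with "..." in the example are simply left out of the sample.

```python
import xml.etree.ElementTree as ET

# Sample taken from the proposed format above; the fields elided with "..."
# in the comment are omitted here rather than guessed.
SAMPLE = """
<lst name="collapse_counts">
    <str name="field">venue</str>
    <lst name="results">
        <lst name="233238">
            <str name="fieldValue">melkweg</str>
            <int name="collapseCount">2</int>
            <lst name="collapsedDocs">
                <doc>
                    <str name="id">233239</str>
                    <str name="name">Foo Bar</str>
                </doc>
                <doc>
                    <str name="id">233240</str>
                    <str name="name">Foo Bar 2</str>
                </doc>
            </lst>
        </lst>
    </lst>
</lst>
"""


def parse_collapse_response(xml_text):
    """Map each head document id to its field value, count and collapsed docs."""
    root = ET.fromstring(xml_text)
    groups = {}
    for group in root.findall("./lst[@name='results']/lst"):
        # With collapse.response=counts there would be no collapsedDocs list,
        # so this simply yields an empty list in that case.
        docs = [
            {field.get("name"): field.text for field in doc}
            for doc in group.findall("./lst[@name='collapsedDocs']/doc")
        ]
        groups[group.get("name")] = {
            "fieldValue": group.findtext("./str[@name='fieldValue']"),
            "collapseCount": int(group.findtext("./int[@name='collapseCount']")),
            "collapsedDocs": docs,
        }
    return groups


groups = parse_collapse_response(SAMPLE)
for head_id, info in groups.items():
    print(head_id, info["fieldValue"], info["collapseCount"],
          [doc["id"] for doc in info["collapsedDocs"]])
```

Because the collapsed docs sit under the id of the document that stayed in the result, a client can join the two lists by that id without any extra lookup.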

I think just moving the collapsed docs into the collapse response when desired gives us
the necessary flexibility and is hopefully easy to implement.

> Field collapsing
> ----------------
>
>                 Key: SOLR-236
>                 URL: https://issues.apache.org/jira/browse/SOLR-236
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 1.3
>            Reporter: Emmanuel Keller
>             Fix For: 1.5
>
>         Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch,
>                      collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch,
>                      field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch,
>                      field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch,
>                      field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff,
>                      field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch,
>                      SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch
>
>
> This patch includes a new feature called "Field collapsing".
> "Used in order to collapse a group of results with similar value for a given field to
> a single entry in the result set. Site collapsing is a special case of this, where all results
> for a given web site is collapsed into one or two entries in the result set, typically with
> an associated "more documents from this site" link. See also Duplicate detection."
> http://www.fastsearch.com/glossary.aspx?m=48&amid=299
> The implementation adds 3 new query parameters (SolrParams):
> "collapse.field" to choose the field used to group results
> "collapse.type" normal (default value) or adjacent
> "collapse.max" to select how many continuous results are allowed before collapsing
> TODO (in progress):
> - More documentation (on source code)
> - Test cases
> Two patches:
> - "field_collapsing.patch" for current development version
> - "field_collapsing_1.1.0.patch" for Solr-1.1.0
> P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

