lucene-solr-dev mailing list archives

From "Shalin Shekhar Mangar (JIRA)" <j...@apache.org>
Subject [jira] Updated: (SOLR-1682) Implement CollapseComponent
Date Wed, 30 Dec 2009 10:41:30 GMT

     [ https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-1682:
----------------------------------------

    Attachment: SOLR-236.patch

Here's an implementation based on [Yonik's suggestion|https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12792916&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12792916].

This is just a PoC and is not fit to be committed. The implementation uses one pass for collapse.threshold=1
and two passes for collapse.threshold>1, so it should be a lot faster than the previous
method, though I haven't benchmarked it yet. Memory consumption should be proportional to start+count
instead of index size.
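
As a rough illustration of what non-adjacent collapsing with a threshold means (this is not the code in the patch), here is a minimal in-memory Java sketch. It assumes the hits are already in rank order and that each hit is represented only by the value of the field being collapsed on. The names CollapseSketch, Result and collapse are hypothetical. It makes a single pass over a list rather than using collectors, so it does not reproduce the one-pass/two-pass split or the memory bound described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from SOLR-236.patch.
final class CollapseSketch {

    /** Outcome of collapsing: which rank positions survive and how many hits were dropped per value. */
    static final class Result {
        final List<Integer> keptPositions;
        final Map<String, Integer> droppedPerValue;

        Result(List<Integer> keptPositions, Map<String, Integer> droppedPerValue) {
            this.keptPositions = keptPositions;
            this.droppedPerValue = droppedPerValue;
        }
    }

    /**
     * Keeps at most `threshold` hits per collapse value, in rank order.
     * Later hits with the same value are dropped even when they are not
     * adjacent to the earlier ones (non-adjacent collapsing).
     */
    static Result collapse(List<String> rankedCollapseValues, int threshold) {
        Map<String, Integer> seen = new LinkedHashMap<>();
        Map<String, Integer> dropped = new LinkedHashMap<>();
        List<Integer> kept = new ArrayList<>();
        for (int pos = 0; pos < rankedCollapseValues.size(); pos++) {
            String value = rankedCollapseValues.get(pos);
            // Number of hits seen so far for this value, including this one.
            int count = seen.merge(value, 1, Integer::sum);
            if (count <= threshold) {
                kept.add(pos);
            } else {
                dropped.merge(value, 1, Integer::sum);
            }
        }
        return new Result(kept, dropped);
    }
}
```

For example, calling collapse on the ranked values author1, author2, author1, author1 with a threshold of 1 keeps positions 0 and 1 and records two dropped hits for author1.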

What is covered:
# Non-adjacent collapsing
# collapse.threshold
# [New response format|https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12793101&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12793101]
# Includes DocSetAwareCollector interface from SOLR-1680

What is not covered:
# Adjacent collapsing
# Aggregate functions (should be easy to add)
# Faceting (it doesn't keep/return the docsets needed for FacetComponent)
# Caching
# This implementation does not return the correct numFound

The response adds special fields only to the first document in a group. Here's a sample of
the first document in a group:
{code:xml}
<doc>
  <int name="id">1</int>
  <str name="name_s1">author1</str>
  <str name="title_s1">a tree</str>
  <date name="timestamp">2009-12-30T10:16:51.944Z</date>
  <arr name="multiDefault">
    <str>muLti-Default</str>
  </arr>
  <int name="intDefault">42</int>
  <str name="collapse.value">author1</str>
  <int name="collapse.count">1</int>
  <float name="score">0.67107505</float>
</doc>
{code}

See TestCollapseComponent.java for example usage.
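
For clients consuming this format, the following is a minimal sketch of pulling the collapse.* fields out of a single <doc> element with the JDK's built-in DOM parser. The class and method names (CollapseFieldReader, collapseFields) are hypothetical and not part of the patch. It assumes the XML shape shown in the sample above, and that a document without these fields is simply not the first in its group.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical client-side helper; not taken from SOLR-236.patch.
final class CollapseFieldReader {

    /** Returns the fields of one <doc> element whose name starts with "collapse.", keyed by name. */
    static Map<String, String> collapseFields(String docXml) {
        Element doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(docXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse <doc> XML", e);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = doc.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between fields
            }
            String name = ((Element) child).getAttribute("name");
            if (name.startsWith("collapse.")) {
                fields.put(name, child.getTextContent().trim());
            }
        }
        // An empty map means the document carries no collapse fields.
        return fields;
    }
}
```

Applied to the sample document above, this yields the entries collapse.value and collapse.count.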

> Implement CollapseComponent
> ---------------------------
>
>                 Key: SOLR-1682
>                 URL: https://issues.apache.org/jira/browse/SOLR-1682
>             Project: Solr
>          Issue Type: Sub-task
>          Components: search
>            Reporter: Martijn van Groningen
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 1.5
>
>         Attachments: field-collapsing.patch, SOLR-236.patch
>
>
> Child issue of SOLR-236. This issue is dedicated to field collapsing in general and all
> its code (CollapseComponent, DocumentCollapsers and CollapseCollectors). The main goal is
> to finalize the request parameters and response format.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

