lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] Commented: (SOLR-2068) Search Grouping: collapse by string specialization
Date Fri, 29 Oct 2010 14:24:20 GMT


Yonik Seeley commented on SOLR-2068:

Going back over my old notes on how to efficiently do a string field per-segment:

Phase 1:
 - Basically, hash based on ord (or a direct index lookup if the # of ords is small enough).
 We don't look up the value of the string at this point.
 - When a segment changes, we need to convert the ords from the old segment to the new segment
(i.e. look up each group's value in the old segment, and find the ord of that value in the new segment).
   - If the group value is not found in the new segment, then remove it from the hash.  Keep
it in the ordered map since it can still be pushed out by other insertions.
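
The segment-change steps above could be sketched roughly as follows. This is only an illustration, not the SOLR-2068 patch or Lucene's API: a segment is modeled as a sorted String[] whose array index stands in for the term ord, and the names OrdRemapSketch, Group, ordOf and remap are hypothetical. It also assumes a group keeps its resolved string once it has been looked up, so that a group dropped from the hash can be re-hashed in a later segment.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a "segment" is a sorted String[]; index == ord.
class OrdRemapSketch {

    static class Group {
        int ord;        // ord in the current segment, or -1 if absent from it
        String value;   // resolved lazily, only when a segment changes

        Group(int ord) {
            this.ord = ord;
        }
    }

    // Ord of value in the segment, or -1 if the segment does not contain it.
    static int ordOf(String[] sortedTerms, String value) {
        int idx = Arrays.binarySearch(sortedTerms, value);
        return idx >= 0 ? idx : -1;
    }

    // Rebuilds the ord-keyed hash for the new segment. orderedGroups stands
    // in for the ordered map and is never shrunk here: a group missing from
    // the new segment stays in it and can still be pushed out later.
    static Map<Integer, Group> remap(List<Group> orderedGroups,
                                     String[] oldSegment,
                                     String[] newSegment) {
        Map<Integer, Group> hash = new HashMap<>();
        for (Group g : orderedGroups) {
            if (g.value == null) {
                // First time the string is needed: read it from the old segment.
                g.value = oldSegment[g.ord];
            }
            g.ord = ordOf(newSegment, g.value);
            if (g.ord >= 0) {
                hash.put(g.ord, g);   // present in new segment: hash by new ord
            }                         // absent: simply left out of the hash
        }
        return hash;
    }
}
```

While collecting within a segment, only the int ord is hashed, so no string comparison happens per document. The string lookups are paid once per group per segment change. Note that Java's natural String order is used here for the binary search, which is an assumption of the sketch rather than a statement about how the index orders terms.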

Phase 2:
 - at the start of each segment, look up the ords for the values and hash the group based
on that ord (or leave it out of the hash if it didn't exist in that segment).
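
A minimal sketch of the Phase 2 idea, under the same modeling assumptions (a segment is a sorted String[] whose index is the ord, and docToOrd is a hypothetical per-segment array giving each document's ord). The names Phase2OrdLookupSketch, ordsForSegment and collectSegment are made up for illustration and are not Solr or Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the group values are already known, so each segment
// only needs a value -> ord lookup once, at the start of the segment.
class Phase2OrdLookupSketch {

    // Builds the ord -> group value hash for one segment. Values that do
    // not occur in this segment are left out of the hash.
    static Map<Integer, String> ordsForSegment(List<String> groupValues,
                                               String[] sortedTerms) {
        Map<Integer, String> ordToGroup = new HashMap<>();
        for (String value : groupValues) {
            int ord = Arrays.binarySearch(sortedTerms, value);
            if (ord >= 0) {
                ordToGroup.put(ord, value);
            }
        }
        return ordToGroup;
    }

    // Collects the documents of one segment into their groups using only
    // int ord lookups. docBase turns segment-local ids into global ids.
    static void collectSegment(Map<Integer, String> ordToGroup,
                               int[] docToOrd,
                               int docBase,
                               Map<String, List<Integer>> docsByGroup) {
        for (int doc = 0; doc < docToOrd.length; doc++) {
            String group = ordToGroup.get(docToOrd[doc]);
            if (group != null) {
                docsByGroup.computeIfAbsent(group, k -> new ArrayList<>())
                           .add(docBase + doc);
            }
        }
    }
}
```

When the number of ords in a segment is small, the HashMap could be swapped for an array indexed by ord, which corresponds to the direct index lookup mentioned for Phase 1.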

Martijn's optimization in SOLR-2205 probably made Phase 1 less important (except if there are
very few unique groups), so perhaps we should start with Phase 2.

> Search Grouping: collapse by string specialization
> --------------------------------------------------
>                 Key: SOLR-2068
>                 URL:
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Yonik Seeley
> Create specialized implementations for collapsing by an indexed string field.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.