lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3240) add spellcheck 'approximate collation count' mode
Date Mon, 11 Jun 2012 21:25:43 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293099#comment-13293099
] 

Robert Muir commented on SOLR-3240:
-----------------------------------

I think so! 

We could optimize it further by ensuring that we arent collecting scores or anything (e.g.
i think we should be wrapping something like a TotalHitCollector (http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollector.java?view=markup),
or just not wrapping any collector at all?

But this patch is probably a good improvement for the worst case.
                
> add spellcheck 'approximate collation count' mode
> -------------------------------------------------
>
>                 Key: SOLR-3240
>                 URL: https://issues.apache.org/jira/browse/SOLR-3240
>             Project: Solr
>          Issue Type: Improvement
>          Components: spellchecker
>            Reporter: Robert Muir
>         Attachments: SOLR-3240.patch
>
>
> SpellCheck's Collation in Solr is a way to ensure spellcheck/suggestions
> will actually net results (taking into account context like filtering).
> In order to do this (from my understanding), it generates candidate queries,
> executes them, and saves the total hit count: collation.setHits(hits).
> For a large index it seems this might be doing too much work: in particular
> I'm interested in ensuring this feature can work fast enough/well for autosuggesters.
> So I think we should offer an 'approximate' mode that uses an early-terminating
> Collector, collect()ing only N docs (e.g. n=1), and we approximate this result
> count based on docid space. 
> I'm not sure what needs to happen on the solr side (possibly support for custom collectors?),
> but I think this could help and should possibly be the default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message