lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2140) TopTermsScoringBooleanQueryRewrite minscore
Date Sat, 12 Dec 2009 15:39:18 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789754#action_12789754 ]

Michael McCandless commented on LUCENE-2140:
--------------------------------------------

minCompetitiveBoost?  minRequiredBoost?

> TopTermsScoringBooleanQueryRewrite minscore
> -------------------------------------------
>
>                 Key: LUCENE-2140
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2140
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>            Priority: Minor
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2140.patch
>
>
> When using the TopTermsScoringBooleanQueryRewrite (LUCENE-2123), it would be nice if
> MultiTermQuery could set an attribute specifying the minimum required score once the
> priority queue is filled.
> This way, FilteredTermsEnums could adjust their behavior based on the minimum score
> a term needs to actually be useful (i.e. not just pass through the priority queue).
> An example is FuzzyTermsEnum: at some point the bottom of the priority queue contains
> words with an edit distance of 1, and enumerating any further terms is simply a waste
> of time.
> This is because terms are compared by score, then by term text. So in this case
> FuzzyTermsEnum could simply seek to the exact match, then end.
> This behavior could also be generalized to all n, for a different implementation of
> fuzzy query that only looks in the term dictionary for words within edit distance n',
> where n' is the edit distance of the lowest-scoring term in the priority queue (such
> an implementation would adjust its behavior during enumeration of the terms depending
> on this attribute).
> Other FilteredTermsEnums could make use of this minimum score in their own way, to
> drive the most efficient behavior so that they do not waste time enumerating useless
> terms.
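
The early-termination idea in the description can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the attached patch and not Lucene API:
the class MinCompetitiveSketch, the method topTerms, and the use of a sorted List<String>
as a stand-in for the term dictionary are all hypothetical. It also assumes unique terms,
k >= 1, and that the score is simply the edit distance (smaller is better), with ties
going to the smaller term text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: none of these names come from Lucene or the patch.
class MinCompetitiveSketch {
    static final class Scored {
        final String term;
        final int distance;
        Scored(String term, int distance) { this.term = term; this.distance = distance; }
    }

    static final class Result {
        final List<String> terms;   // kept terms, best first
        final int scoredCount;      // how many terms had their edit distance computed
        Result(List<String> terms, int scoredCount) { this.terms = terms; this.scoredCount = scoredCount; }
    }

    // Plain Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // sortedTerms stands in for the term dictionary: unique terms in ascending order.
    static Result topTerms(List<String> sortedTerms, String target, int k, int maxEdits) {
        // Queue head = least competitive entry: largest distance, then largest term text.
        Comparator<Scored> worstFirst = (x, y) -> x.distance != y.distance
                ? Integer.compare(y.distance, x.distance)
                : y.term.compareTo(x.term);
        PriorityQueue<Scored> pq = new PriorityQueue<>(worstFirst);
        int scored = 0;
        for (int i = 0; i < sortedTerms.size(); i++) {
            if (pq.size() == k && pq.peek().distance <= 1) {
                // Queue is full and its bottom is at distance <= 1. Later terms lose ties
                // on term text, so only an exact match can still compete: seek to it, then end.
                if (pq.peek().distance == 1) {
                    List<String> rest = sortedTerms.subList(i, sortedTerms.size());
                    if (Collections.binarySearch(rest, target) >= 0) {
                        pq.poll();
                        pq.add(new Scored(target, 0));
                    }
                }
                break;
            }
            String term = sortedTerms.get(i);
            int d = editDistance(term, target);
            scored++;
            if (d > maxEdits) continue;
            if (pq.size() < k) {
                pq.add(new Scored(term, d));
            } else if (d < pq.peek().distance) {
                // Strictly better than the current bottom, so it displaces it.
                pq.poll();
                pq.add(new Scored(term, d));
            }
        }
        List<String> out = new ArrayList<>();
        while (!pq.isEmpty()) out.add(pq.poll().term);
        Collections.reverse(out);
        return new Result(out, scored);
    }

    public static void main(String[] args) {
        List<String> dict = Arrays.asList("lucane", "lucend", "lucene", "lucent", "lucine");
        Result r = topTerms(dict, "lucene", 2, 2);
        System.out.println(r.terms + " after scoring " + r.scoredCount + " of " + dict.size() + " terms");
    }
}
```

In this toy run the queue fills with two distance-1 terms, so the loop stops computing
edit distances and only checks whether the exact term is still ahead in the dictionary.
A real enum would presumably read the threshold from the proposed attribute rather than
peek into the queue directly.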

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

