[ https://issues.apache.org/jira/browse/LUCENE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919496#action_12919496 ]
Michael McCandless commented on LUCENE-2690:
--------------------------------------------
It'd be nice somehow to have MTQ.getTotalNumberOfTerms return the *unique* term count instead
of the total number of terms visited across all segments...
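
To illustrate the difference between the two counts, here is a minimal sketch in plain Java. It does not use any Lucene API: each segment is modeled as a plain list of term strings, and the names TermCountSketch, Counts and collect are made up for this example. It only shows that a hashed set can yield the unique count alongside the visited count; it is not how the patch does it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; segments are plain lists, not Lucene readers.
public class TermCountSketch {

    /** Holds both counts gathered while walking the segments. */
    static final class Counts {
        final int visited; // every term seen, counted once per segment it occurs in
        final int unique;  // distinct terms across all segments

        Counts(int visited, int unique) {
            this.visited = visited;
            this.unique = unique;
        }
    }

    /** Visits the matching terms of each segment and de-duplicates them with a hash set. */
    static Counts collect(List<List<String>> segments) {
        Set<String> seen = new HashSet<>();
        int visited = 0;
        for (List<String> segmentTerms : segments) {
            for (String term : segmentTerms) {
                visited++;      // counts duplicates across segments
                seen.add(term); // the set ignores terms already collected
            }
        }
        return new Counts(visited, seen.size());
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("lucene", "lucent"),
                Arrays.asList("lucene", "lucern"),
                Arrays.asList("lucent"));
        Counts c = collect(segments);
        System.out.println("visited=" + c.visited + " unique=" + c.unique);
    }
}
```

With the three sample segments in main, five terms are visited but only three are unique, which is the gap between what is reported today and what would be nicer to report.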
> Do MultiTermQuery boolean rewrites per segment
> ----------------------------------------------
>
> Key: LUCENE-2690
> URL: https://issues.apache.org/jira/browse/LUCENE-2690
> Project: Lucene - Java
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 4.0
>
> Attachments: LUCENE-2690.patch, LUCENE-2690.patch
>
>
> MultiTermQuery currently runs the FuzzyQuery rewrite (using TopTermsBooleanQueryRewrite), the
> auto constant rewrite method and the ScoringBQ rewrite methods through a MultiFields wrapper
> on the top-level reader. This is inefficient.
> This patch changes the rewrite modes to do the rewrites per segment and uses some additional
> data structures (hashed sets/maps) to exclude duplicate terms. All tests currently pass, but
> FuzzyQuery's tests should not, because its minimum score handling depends on the terms
> being collected in order.
> Robert will fix FuzzyQuery in this issue, too. This patch is just a start.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org