lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Updated: (LUCENE-2690) Do MultiTermQuery boolean rewrites per segment
Date Fri, 08 Oct 2010 20:52:30 GMT


Uwe Schindler updated LUCENE-2690:

    Attachment: LUCENE-2690.patch

Updated patch that also checks for duplicate terms in the fuzzy rewrite. This should be fine
now, but we need to fix the FuzzyQuery tests to cover multiple segments with the same
terms, a case that should fail with this patch.

Maybe we need a separate MTQ test that creates two IndexWriters which add documents with
an overlapping term set to both indexes. Queries are then run using MultiReader, so we can
control merging and make sure the term really appears in two "segments". I will work on a
test for that.

> Do MultiTermQuery boolean rewrites per segment
> ----------------------------------------------
>                 Key: LUCENE-2690
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 4.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>         Attachments: LUCENE-2690.patch, LUCENE-2690.patch
> MultiTermQuery currently rewrites FuzzyQuery (using TopTermsBooleanQueryRewrite), the
> auto constant rewrite method and the ScoringBQ rewrite methods using a MultiFields wrapper
> on the top-level reader. This is inefficient.
> This patch changes the rewrite modes to do the rewrites per segment and uses some additional
> data structures (hashed sets/maps) to exclude duplicate terms. All tests currently pass, but
> FuzzyQuery's tests should not, because its minimum score handling depends on the terms
> being collected in order.
> Robert will fix FuzzyQuery in this issue, too. This patch is just a start.
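
To illustrate the duplicate-term problem described above, here is a minimal sketch in plain Java. It is not taken from the patch and uses no Lucene API: the class name PerSegmentTermCollector and the modelling of a segment as a term -> docFreq map are assumptions made purely for illustration. It shows why per-segment collection needs a hashed map: a term that occurs in two segments must end up as a single entry.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the LUCENE-2690 patch and not Lucene API.
public class PerSegmentTermCollector {

    /**
     * Visits each "segment" (modelled here as a term -> docFreq map) in turn
     * and merges the terms into one map, so a term seen in several segments
     * produces a single entry whose docFreq is the sum over all segments.
     */
    static Map<String, Integer> collect(List<Map<String, Integer>> segments) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map<String, Integer> segment : segments) {
            for (Map.Entry<String, Integer> e : segment.entrySet()) {
                // merge() inserts the term if absent, otherwise sums the counts
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Integer> seg1 = new LinkedHashMap<>();
        seg1.put("lucene", 2);
        seg1.put("lucent", 1);

        Map<String, Integer> seg2 = new LinkedHashMap<>();
        seg2.put("lucene", 3);   // same term as in seg1
        seg2.put("lucerne", 1);

        // "lucene" appears in both segments but is reported once
        System.out.println(collect(Arrays.asList(seg1, seg2)));
    }
}
```

Without the map, walking the two segments one after the other would yield "lucene" twice, which is the situation a test with overlapping term sets in two indexes is meant to provoke.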
