lucene-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1300) Negative wildcard searches on MultiSearcher not eliminating correctly.
Date Sat, 07 Jun 2008 02:44:45 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603280#action_12603280
] 

Mark Miller commented on LUCENE-1300:
-------------------------------------

It looks like the test passed before because things were (it would appear) even worse: the bug
was that the multi-term query was expanded only on the first index, and the resulting
query was then used on all of the indexes. The issue that introduced the bug you found was an
attempt to fix this by expanding on each Reader and then trying to build a single query
that works across every Reader. That strategy seems to work in non-MUST_NOT cases, but the
generated query can simply be wrong when a MUST_NOT occurrence is involved. As you point out,
the second index doesn't have to be empty, and the second -() clause doesn't have to be empty
either; the generated query can still be wrong.
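
The failure mode described above can be illustrated with a toy model. This is only a sketch
under stated assumptions, not Lucene code: an "index" is modelled as a plain map from document
id to its terms, and every name here (PerIndexExpansionSketch, expand, matches,
searchWithPerIndexExpansion, demoHits) is hypothetical. It expands "bug*" separately per index,
ORs the per-index queries together, and runs the combined query everywhere.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (not Lucene's API): an "index" maps a document id to its terms.
class PerIndexExpansionSketch {

    static Set<String> terms(String... ts) {
        return new TreeSet<>(Arrays.asList(ts));
    }

    // Expand a prefix such as "bug" against the terms present in ONE index.
    static Set<String> expand(Map<String, Set<String>> index, String prefix) {
        Set<String> expanded = new TreeSet<>();
        for (Set<String> docTerms : index.values()) {
            for (String term : docTerms) {
                if (term.startsWith(prefix)) {
                    expanded.add(term);
                }
            }
        }
        return expanded;
    }

    // "+required -excluded...": true if the doc has the required term and no excluded term.
    static boolean matches(Set<String> docTerms, String required, Set<String> excluded) {
        if (!docTerms.contains(required)) {
            return false;
        }
        for (String term : excluded) {
            if (docTerms.contains(term)) {
                return false;
            }
        }
        return true;
    }

    // Assumed model of the faulty strategy: one rewritten query per index, combined as an OR.
    static Set<String> searchWithPerIndexExpansion(
            List<Map<String, Set<String>>> indexes, String required, String prefix) {
        List<Set<String>> expansions = new ArrayList<>();
        for (Map<String, Set<String>> index : indexes) {
            expansions.add(expand(index, prefix));
        }
        Set<String> hits = new TreeSet<>();
        for (Map<String, Set<String>> index : indexes) {
            for (Map.Entry<String, Set<String>> doc : index.entrySet()) {
                // A doc is a hit if ANY per-index disjunct accepts it.
                for (Set<String> excluded : expansions) {
                    if (matches(doc.getValue(), required, excluded)) {
                        hits.add(doc.getKey());
                    }
                }
            }
        }
        return hits;
    }

    // Query "lucene -bug*" over one populated index and one empty index.
    static Set<String> demoHits() {
        Map<String, Set<String>> full = new LinkedHashMap<>();
        full.put("doc1", terms("lucene", "bug"));
        full.put("doc2", terms("lucene", "search"));
        full.put("doc3", terms("lucene", "bugs"));
        Map<String, Set<String>> empty = new LinkedHashMap<>();
        return searchWithPerIndexExpansion(Arrays.asList(full, empty), "lucene", "bug");
    }

    public static void main(String[] args) {
        System.out.println(demoHits()); // prints [doc1, doc2, doc3]
    }
}
```

In this model the empty index contributes the disjunct "lucene -()", which excludes nothing, so
doc1 and doc3 are let back in even though the other disjunct rejects them. Only doc2 should
match.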

I don't see an obvious fix. Somehow we need a query that expands against all of the subreaders
as if they were one reader, or the combine method has to figure this out... neither seems easy to me.
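
The first idea, expanding against all subreaders as if they were one reader, can be sketched
in a toy model. This is an illustration under assumptions, not a proposed patch and not
Lucene's API: indexes are plain maps from document id to terms, and the names
(UnionExpansionSketch, expandAcrossAll, search, demoHits) are hypothetical. It says nothing
about how hard this is to do inside MultiSearcher itself.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (not Lucene's API): an "index" maps a document id to its terms.
class UnionExpansionSketch {

    static Set<String> terms(String... ts) {
        return new TreeSet<>(Arrays.asList(ts));
    }

    // Expand the prefix against the union of terms from ALL indexes, as if one reader.
    static Set<String> expandAcrossAll(List<Map<String, Set<String>>> indexes, String prefix) {
        Set<String> expanded = new TreeSet<>();
        for (Map<String, Set<String>> index : indexes) {
            for (Set<String> docTerms : index.values()) {
                for (String term : docTerms) {
                    if (term.startsWith(prefix)) {
                        expanded.add(term);
                    }
                }
            }
        }
        return expanded;
    }

    // Run a single "+required -expanded..." query over every index.
    static Set<String> search(
            List<Map<String, Set<String>>> indexes, String required, String prefix) {
        Set<String> excluded = expandAcrossAll(indexes, prefix);
        Set<String> hits = new TreeSet<>();
        for (Map<String, Set<String>> index : indexes) {
            for (Map.Entry<String, Set<String>> doc : index.entrySet()) {
                Set<String> docTerms = doc.getValue();
                boolean hasExcluded = false;
                for (String term : excluded) {
                    if (docTerms.contains(term)) {
                        hasExcluded = true;
                    }
                }
                if (docTerms.contains(required) && !hasExcluded) {
                    hits.add(doc.getKey());
                }
            }
        }
        return hits;
    }

    // Query "lucene -bug*" over two populated indexes with different "bug*" terms.
    static Set<String> demoHits() {
        Map<String, Set<String>> first = new LinkedHashMap<>();
        first.put("a1", terms("lucene", "bug"));
        first.put("a2", terms("lucene", "search"));
        Map<String, Set<String>> second = new LinkedHashMap<>();
        second.put("b1", terms("lucene", "bugs"));
        second.put("b2", terms("lucene", "index"));
        return search(Arrays.asList(first, second), "lucene", "bug");
    }

    public static void main(String[] args) {
        System.out.println(demoHits()); // prints [a2, b2]
    }
}
```

Because the exclusion set is built once from every index, a document containing "bug" or
"bugs" is rejected no matter which index it lives in.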


> Negative wildcard searches on MultiSearcher not eliminating correctly.
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-1300
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1300
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.1, 2.3, 2.3.1
>         Environment: Windows XP, cygwin.
>            Reporter: steve halsey
>            Priority: Minor
>         Attachments: TestMultiSearcherNegativeWildcardQueryExpansion.java, TestMultiSearcherNegativeWildcardQueryExpansionWorksWith151.java
>
>
>  If you search for a negative wildcard query on a MultiSearcher where one of the
 searchers is empty, e.g. "lucene -bug*", the hits returned incorrectly include articles with
words that should be eliminated, e.g. "bug" and "bugs". This is because the query expansion
is done on the index with docs in it and on the empty index *separately*, and the results are
then combined as an OR to be run on the MultiSearcher. This incorrectly lets in docs that
contain the excluded wildcard terms, e.g. "bug" and "bugs". This bug would also show up with two
indexes full of docs, and I can send a test to show that if required, but I think this test
demonstrates the bug in the simplest way.
> The attached class TestMultiSearcherNegativeWildcardQueryExpansion.java can be placed with
the other tests in org.apache.lucene.search; when run, it fails, showing that the bug exists.
>  I have tested this with the currently unreleased 2.3.2 and with the released 2.1 and
2.3.1, and it fails on all of them.
> With lucene-1.5-rc1 it passes (with modifications to work with the old API); see the TestMultiSearcherNegativeWildcardQueryExpansionWorksWith151.java
attachment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

