lucene-dev mailing list archives

From "Doron Cohen (JIRA)" <>
Subject [jira] Updated: (LUCENE-1310) Phrase query with term repeated 3 times requires more slop than expected
Date Thu, 26 Jun 2008 18:05:45 GMT


Doron Cohen updated LUCENE-1310:

    Attachment: LUCENE-1310.patch

Patch with a fix.

The problem was in the logic for advancing PhrasePositions that point to exactly the same
term in the document.
Note that this advancing is required to avoid false matches (as fixed in LUCENE-736).
However, the PhrasePositions whose offset (in the query) is the highest must be advanced first.

As a side effect of this fix, sorting the "repeats" array (at scorer initialization) is
no longer required.

Grant's test is also in the patch, slightly modified.

Grant, can you give it a try and report here?

> Phrase query with term repeated 3 times requires more slop than expected
> ------------------------------------------------------------------------
>                 Key: LUCENE-1310
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.3.1, 2.3.2
>            Reporter: Grant Glouser
>            Assignee: Doron Cohen
>         Attachments: LUCENE-1310.patch,
> Consider a document with the text "A A A".
> The phrase query "A A A" (exact match) succeeds.
> The query "A A A"~1 (same document and query, just increasing the slop value by one) fails.
> "A A A"~2 succeeds again.
> If the exact match succeeds, I wouldn't expect the same query but with more slop to fail.
> The fault seems to require some term to be repeated at least three times in the query, but
> the three occurrences do not need to be adjacent.  I will attach a file that contains a set
> of JUnit tests that demonstrate what I mean.
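
The expectation in the report (a match at one slop value should still match at any larger slop) can be illustrated with a small brute-force reference model. This is only a sketch and not Lucene code: the class name SloppyPhraseModel and its matches method are hypothetical. It assumes a match means that each query term is assigned a distinct document position, and that the spread of (document position - query offset) across the terms is at most the slop.

```java
/** Brute-force reference model of sloppy phrase matching (illustrative, not Lucene code). */
class SloppyPhraseModel {

    /**
     * Returns true if some assignment of distinct document positions to the
     * query terms has a spread of (position - query offset) of at most slop.
     */
    static boolean matches(String[] doc, String[] query, int slop) {
        if (query.length == 0) {
            return false; // assumption: an empty phrase matches nothing
        }
        return search(doc, query, slop, 0, new int[query.length], new boolean[doc.length]);
    }

    private static boolean search(String[] doc, String[] query, int slop,
                                  int i, int[] chosen, boolean[] used) {
        if (i == query.length) {
            // All query terms are placed: measure the spread of adjusted positions.
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int q = 0; q < query.length; q++) {
                int adjusted = chosen[q] - q; // document position minus query offset
                min = Math.min(min, adjusted);
                max = Math.max(max, adjusted);
            }
            return max - min <= slop;
        }
        for (int p = 0; p < doc.length; p++) {
            // A repeated query term may not reuse a document position already taken.
            if (!used[p] && doc[p].equals(query[i])) {
                used[p] = true;
                chosen[i] = p;
                boolean ok = search(doc, query, slop, i + 1, chosen, used);
                used[p] = false;
                if (ok) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] doc = {"A", "A", "A"};
        String[] query = {"A", "A", "A"};
        for (int slop = 0; slop <= 2; slop++) {
            System.out.println("slop " + slop + ": " + matches(doc, query, slop));
        }
    }
}
```

Under this model the document "A A A" matches the query "A A A" at slop 0, 1, and 2, because the exact alignment has a spread of zero. That is the monotonic behaviour the report expects from the scorer.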

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
