lucene-dev mailing list archives

From "Doron Cohen (Commented) (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3821) SloppyPhraseScorer sometimes misses documents that ExactPhraseScorer finds.
Date Sat, 10 Mar 2012 00:14:57 GMT


Doron Cohen commented on LUCENE-3821:

- r1299077  3x
- r1299112  trunk

bq. I would be glad to try out a nightly build with the patch as is against our tests - I
would be glad to get the 80% solution if it fixes my bug.

It's in now...

bq.  But I wonder if we can re-use even some of the math to redefine the problem more formally
to figure out what minimal state/lookahead we need for example...

Robert, this gave me an idea. Currently, when two repeaters "collide", we compare them
and advance the "lesser" of the two (SloppyPhraseScorer.lesser(PhrasePositions, PhrasePositions)).
It should be fairly easy to add lookahead to this logic: if one of the two is multi-term,
lesser can also do a lookahead, and the amount of lookahead can depend on the slop. I'll give
it a try before closing this issue.
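
To make the idea concrete, here is a minimal, self-contained sketch of one possible reading of it.
Everything in it is an assumption for illustration: Cursor is a hypothetical stand-in (not Lucene's
PhrasePositions), the baseline rule "smaller position first, then smaller offset" is assumed rather
than taken from the committed code, and "lookahead" is interpreted as comparing a bounded number of
upcoming positions before falling back to the offset tie-break. It is not the patch.

```java
public class LesserLookaheadSketch {

    /** Hypothetical stand-in for a repeating term's cursor; NOT Lucene's PhrasePositions. */
    static final class Cursor {
        final int offset;        // offset of the term inside the query phrase
        final int[] positions;   // phrase-relative positions in the current doc, ascending
        int next = 0;            // index of the current position

        Cursor(int offset, int... positions) {
            this.offset = offset;
            this.positions = positions;
        }

        /** Position i steps ahead of the current one, or Integer.MAX_VALUE if exhausted. */
        int peek(int i) {
            int k = next + i;
            return k < positions.length ? positions[k] : Integer.MAX_VALUE;
        }
    }

    /** Assumed baseline: smaller current position wins, ties go to the smaller offset. */
    static Cursor lesser(Cursor a, Cursor b) {
        if (a.peek(0) < b.peek(0) || (a.peek(0) == b.peek(0) && a.offset < b.offset)) {
            return a;
        }
        return b;
    }

    /**
     * Hypothetical variant: compare the current position and up to `lookahead`
     * further positions; only if all of them tie, fall back to the offset.
     */
    static Cursor lesserWithLookahead(Cursor a, Cursor b, int lookahead) {
        for (int i = 0; i <= lookahead; i++) {
            int pa = a.peek(i);
            int pb = b.peek(i);
            if (pa != pb) {
                return pa < pb ? a : b;
            }
        }
        return a.offset < b.offset ? a : b;
    }

    public static void main(String[] args) {
        Cursor a = new Cursor(0, 3, 7);
        Cursor b = new Cursor(1, 3, 5);
        // Both cursors currently sit at position 3, so the baseline falls back to the offset.
        System.out.println("baseline picks a: " + (lesser(a, b) == a));
        // With one step of lookahead, the upcoming positions (7 vs 5) decide instead.
        System.out.println("lookahead picks b: " + (lesserWithLookahead(a, b, 1) == b));
    }
}
```

With lookahead set to 0 the variant behaves exactly like the baseline, so a caller could derive the
bound from the slop (for example, passing the slop itself, which is again only an assumption) and
keep the current behavior for exact matching.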

> SloppyPhraseScorer sometimes misses documents that ExactPhraseScorer finds.
> ---------------------------------------------------------------------------
>                 Key: LUCENE-3821
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 3.5, 4.0
>            Reporter: Naomi Dushay
>            Assignee: Doron Cohen
>         Attachments: LUCENE-3821-SloppyDecays.patch, LUCENE-3821.patch, LUCENE-3821.patch,
LUCENE-3821.patch, LUCENE-3821.patch, LUCENE-3821_test.patch, schema.xml, solrconfig-test.xml
> The general bug is a case where a phrase with no slop is found,
> but if you add slop, it's not.
> I committed a test today (TestSloppyPhraseQuery2) that actually triggers this case;
> Jenkins just hasn't had enough time to chew on it.
> Running ant test -Dtestcase=TestSloppyPhraseQuery2 -Dtests.iter=100 is enough to make it
> fail on trunk or 3.x.
