lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3821) SloppyPhraseScorer sometimes misses documents that ExactPhraseScorer finds.
Date Sat, 10 Mar 2012 15:04:58 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226873#comment-13226873 ]

Robert Muir commented on LUCENE-3821:
-------------------------------------

{quote}
Robert, this gave me an idea... currently, in case of "collision" between repeaters, we compare
them and advance the "lesser" of them (SloppyPhraseScorer.lesser(PhrasePositions, PhrasePositions))
- it should be fairly easy to add lookahead to this logic: if one of the two is multi-term,
lesser can also do a lookahead. The amount of lookahead can depend on the slop. I'll give
it a try before closing this issue.
{quote}

Interesting... it's hard for me to think about, since the edit distance here is a little
different, but at least in the levAutomata case the maximum 'context' the thing ever needs
is {{2n+1}}, where n is the distance/slop. I don't know if it applies here... but it seems
like it should be at least an upper bound.

On a related note, I think it's possible to implement a more... exhaustive test for
SloppyPhraseScorer (rather than relying so much on a random one). The idea would be a twist
on TestLevenshteinAutomata.assertCharVectors: using an alphabet of terms={0,1}, test all
possible 'automaton structures'. For SloppyPhraseScorer, that would give us a minimal test
method that tests all the cases...
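
A rough sketch of what such an exhaustive check could look like, in plain Java with no
Lucene dependency. Everything here is an assumption made for illustration: the class
ExhaustiveSloppyCheck, its method names, and above all sloppyMatch, which is a simplified
brute-force stand-in for sloppy matching and not SloppyPhraseScorer's actual logic. It
enumerates every document and every phrase over the terms {0,1} up to a small length and
counts cases where an exact match is found but the sloppy match is missed, which is the
bug described in this issue.

```java
import java.util.ArrayList;
import java.util.List;

class ExhaustiveSloppyCheck {

    // All term sequences of exactly the given length over the alphabet {0,1}.
    static List<int[]> allSequences(int length) {
        List<int[]> out = new ArrayList<>();
        for (int bits = 0; bits < (1 << length); bits++) {
            int[] seq = new int[length];
            for (int i = 0; i < length; i++) {
                seq[i] = (bits >> i) & 1;
            }
            out.add(seq);
        }
        return out;
    }

    // Exact phrase match: the query occurs as a contiguous run in the document.
    static boolean exactMatch(int[] doc, int[] query) {
        for (int start = 0; start + query.length <= doc.length; start++) {
            boolean ok = true;
            for (int i = 0; i < query.length && ok; i++) {
                ok = doc[start + i] == query[i];
            }
            if (ok) return true;
        }
        return false;
    }

    // ASSUMPTION: simplified stand-in, not Lucene's implementation. Assign a distinct
    // document position to each query term; match if the spread of
    // (position - queryIndex) over all terms is at most slop.
    static boolean sloppyMatch(int[] doc, int[] query, int slop) {
        return search(doc, query, slop, 0, new boolean[doc.length],
                      Integer.MAX_VALUE, Integer.MIN_VALUE);
    }

    private static boolean search(int[] doc, int[] query, int slop, int qi,
                                  boolean[] used, int min, int max) {
        if (qi == query.length) return true;
        for (int pos = 0; pos < doc.length; pos++) {
            if (used[pos] || doc[pos] != query[qi]) continue;
            int shifted = pos - qi;
            int newMin = Math.min(min, shifted);
            int newMax = Math.max(max, shifted);
            if (newMax - newMin > slop) continue; // too spread out for this slop
            used[pos] = true;
            boolean found = search(doc, query, slop, qi + 1, used, newMin, newMax);
            used[pos] = false;
            if (found) return true;
        }
        return false;
    }

    // Counts (doc, query, slop) triples where the exact match is found but the
    // sloppy match is not. This should be zero for a correct sloppy matcher.
    static int countViolations(int maxDocLen, int maxQueryLen, int maxSlop) {
        int violations = 0;
        for (int docLen = 1; docLen <= maxDocLen; docLen++) {
            for (int[] doc : allSequences(docLen)) {
                for (int qLen = 1; qLen <= maxQueryLen; qLen++) {
                    for (int[] query : allSequences(qLen)) {
                        if (!exactMatch(doc, query)) continue;
                        for (int slop = 0; slop <= maxSlop; slop++) {
                            if (!sloppyMatch(doc, query, slop)) violations++;
                        }
                    }
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("violations: " + countViolations(5, 3, 2));
    }
}
```

In a real test the stand-in sloppyMatch would presumably be replaced by running a
PhraseQuery with slop against an index built from each enumerated document; the enumeration
and the "exact implies sloppy" invariant would stay the same.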

I'll think on this one...


                
> SloppyPhraseScorer sometimes misses documents that ExactPhraseScorer finds.
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-3821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3821
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 3.5, 4.0
>            Reporter: Naomi Dushay
>            Assignee: Doron Cohen
>         Attachments: LUCENE-3821-SloppyDecays.patch, LUCENE-3821.patch, LUCENE-3821.patch, LUCENE-3821.patch, LUCENE-3821.patch, LUCENE-3821_test.patch, schema.xml, solrconfig-test.xml
>
>
> The general bug is a case where a phrase with no slop is found,
> but if you add slop it's not.
> I committed a test today (TestSloppyPhraseQuery2) that actually triggers this case;
> Jenkins just hasn't had enough time to chew on it.
> ant test -Dtestcase=TestSloppyPhraseQuery2 -Dtests.iter=100 is enough to make it fail on trunk or 3.x

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

