lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4282) Automaton Fuzzy Query doesn't deliver all results
Date Thu, 02 Aug 2012 12:14:02 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427262#comment-13427262 ]

Uwe Schindler commented on LUCENE-4282:
---------------------------------------

Thanks for the help. We are starting to investigate what's wrong!

I did another test in parallel:
{code:java}
query.setRewriteMethod(FuzzyQuery.SCORING_BOOLEAN_QUERY_REWRITE);
{code}

That one fails as well, so the boost attribute itself is not the problem: this rewrite
method does not use it at all (no PriorityQueue).

The automaton is also correct: if you run the terms through it, they are all accepted:

{code:java}
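// Levenshtein automaton for "EBER" with transpositions enabled, up to 2 edits
LevenshteinAutomata builder = new LevenshteinAutomata("EBER", true);
Automaton a = builder.toAutomaton(2);
// Prepend 'W': the result accepts 'W' followed by anything within 2 edits of "EBER"
a = BasicOperations.concatenate(BasicAutomata.makeChar('W'), a);
// Each call prints whether the automaton accepts the given term
System.out.println(BasicOperations.run(a, "WBR"));
System.out.println(BasicOperations.run(a, "WEB"));
System.out.println(BasicOperations.run(a, "WEBE"));
System.out.println(BasicOperations.run(a, "WEBER"));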
{code}
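
As an independent cross-check that does not involve Lucene at all, the same acceptance condition can be sketched with a plain dynamic-programming edit distance. This is only an illustration: the class and method names (EditDistanceCheck, distance, accepts) are made up for this sketch and are not Lucene API. It also assumes classic Levenshtein distance without transpositions, which can only be greater than or equal to the distance with transpositions, so any term it accepts at 2 edits should be accepted by the automaton above as well.

```java
// Hypothetical helper for illustration only; none of these names exist in Lucene.
class EditDistanceCheck {

    /** Classic Levenshtein distance (insert, delete, substitute) using a two-row DP table. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b is the prefix length.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Swap rows so that prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if term is the prefix character followed by something within maxEdits of base. */
    static boolean accepts(String term, char prefix, String base, int maxEdits) {
        return !term.isEmpty()
                && term.charAt(0) == prefix
                && distance(term.substring(1), base) <= maxEdits;
    }

    public static void main(String[] args) {
        String[] terms = {"WBR", "WEB", "WEBE", "WEBER", "WE"};
        for (String term : terms) {
            System.out.println(term + " -> " + accepts(term, 'W', "EBER", 2));
        }
    }
}
```

Under these assumptions WBR, WEB, WEBE and WEBER are all within 2 edits, while WE is not: its remainder "E" is three characters shorter than "EBER", so at least three edits are needed.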
                
> Automaton Fuzzy Query doesn't deliver all results
> -------------------------------------------------
>
>                 Key: LUCENE-4282
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4282
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.0-ALPHA
>            Reporter: Johannes Christen
>            Assignee: Robert Muir
>              Labels: newbie
>         Attachments: ModifiedFuzzyTermsEnum.java, ModifiedFuzzyTermsEnum.java
>
>
> Having a small index with n documents where each document has one of the following terms:
> WEBER, WEBE, WEB, WBR, WE, (and some more)
> The new FuzzyQuery (Automaton) with maxEdits=2 only delivers the expected terms WEBER
> and WEBE in the rewritten query. The expected terms WEB and WBR which have an edit distance
> of 2 as well are missing.
