lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-4282) Automaton Fuzzy Query doesn't deliver all results
Date Thu, 02 Aug 2012 08:56:02 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427184#comment-13427184
] 

Uwe Schindler edited comment on LUCENE-4282 at 8/2/12 8:54 AM:
---------------------------------------------------------------

This is caused by the rewrite method, not by FuzzyQuery itself. The rewrite mode uses an internal
priority queue in which it collects the terms from the index that match the Levenshtein distance.
If more terms match than the queue can hold, some are dropped; which ones depends on their distance
and other factors. If you want to use a larger PQ, create a separate instance of
TopTermsScoringBooleanQueryRewrite, giving a queue size.
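
To illustrate the effect described above, here is a minimal, self-contained sketch of a bounded
priority queue dropping matching terms. It is a simplified model and not Lucene's actual code.
Assumptions: the query term is WEBER (the report does not state it, but it is consistent with the
listed distances), the queue sizes 2 and 4 are arbitrary values chosen for the demonstration, and
ties are broken alphabetically, whereas the real rewrite also weighs other factors. The class and
method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Simplified model of a "top terms" rewrite; NOT Lucene code.
final class BoundedFuzzyRewriteSketch {

    // Plain Levenshtein distance (insert, delete, substitute), two-row DP.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Keeps at most queueSize terms within maxEdits of the query.
    // When the queue overflows, the most distant term is evicted.
    static List<String> collectTopTerms(String query, List<String> indexTerms,
                                        int maxEdits, int queueSize) {
        // Assumed ordering: closest first, alphabetical on ties.
        Comparator<String> best = Comparator
                .comparingInt((String t) -> levenshtein(query, t))
                .thenComparing(Comparator.<String>naturalOrder());
        // Reversed comparator puts the worst kept term at the head.
        PriorityQueue<String> pq = new PriorityQueue<>(best.reversed());
        for (String term : indexTerms) {
            if (levenshtein(query, term) > maxEdits) {
                continue; // not a fuzzy match at all
            }
            pq.offer(term);
            if (pq.size() > queueSize) {
                pq.poll(); // queue is full: drop the worst match
            }
        }
        List<String> kept = new ArrayList<>(pq);
        kept.sort(best);
        return kept;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("WEBER", "WEBE", "WEB", "WBR", "WE");
        // Small queue: the distance-2 terms are dropped.
        System.out.println(collectTopTerms("WEBER", terms, 2, 2)); // [WEBER, WEBE]
        // Larger queue: every term within maxEdits survives.
        System.out.println(collectTopTerms("WEBER", terms, 2, 4)); // [WEBER, WEBE, WBR, WEB]
    }
}
```

In this model WE is never collected because its distance to WEBER is 3, while WEB and WBR (both at
distance 2) are kept only when the queue is large enough, which mirrors the advice to configure the
rewrite with a larger queue size.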
                
      was (Author: thetaphi):
    This is caused by the rewrite method, not by FuzzyQuery itself. The rewrite mode uses an
internal priority queue in which it collects the terms from the index that match the Levenshtein
distance. If more terms match than the queue can hold, some are dropped; which ones depends on
their distance and other factors. If you want to use a larger PQ, create a separate instance of
the TopTermsRewriteMethod, giving a queue size.
                  
> Automaton Fuzzy Query doesn't deliver all results
> -------------------------------------------------
>
>                 Key: LUCENE-4282
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4282
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 4.0-ALPHA
>            Reporter: Johannes Christen
>              Labels: newbie
>
> Having a small index with n documents where each document has one of the following terms:
> WEBER, WEBE, WEB, WBR, WE, (and some more)
> The new FuzzyQuery (Automaton) with maxEdits=2 only delivers the expected terms WEBER
> and WEBE in the rewritten query. The expected terms WEB and WBR which have an edit distance
> of 2 as well are missing.

