lucene-dev mailing list archives

From "Paul taylor (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-2557) FuzzyQuery - fuzzy terms and misspellings are ranked higher than exact matches
Date Fri, 09 Mar 2012 14:46:59 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226113#comment-13226113
] 

Paul taylor commented on LUCENE-2557:
-------------------------------------

I think you are missing the point here. There may be some way I can achieve my aim by changing
indexing (here's an only slightly out-of-date example using the actual code: http://svn.musicbrainz.org/search_server/trunk/servlet/src/main/java/org/musicbrainz/search/servlet/DismaxQueryParser.java),
but the fact is that it doesn't make sense to fully score fuzzy queries if your query contains
exact and fuzzy matches (the current default), and it doesn't make sense to use a constant
score either (your suggestion). A default that does score, but uses the idf of the search term,
would give much better default results.
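
To make the difference concrete, here is a minimal, self-contained sketch of the two weighting schemes. It is not Lucene code and not the attached patch. It only reuses the classic idf formula from Lucene 3.x DefaultSimilarity, 1 + ln(numDocs / (docFreq + 1)), and ignores tf, norms, queryNorm and the fact that idf enters the real score squared. The class name, the method names, the document frequencies and the 0.8 fuzzy similarity are all hypothetical values chosen for illustration.

```java
public class FuzzyIdfSketch {

    // Classic Lucene 3.x DefaultSimilarity idf: 1 + ln(numDocs / (docFreq + 1)).
    public static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Simplified model of the current default: every expanded term is
    // weighted by its OWN idf, scaled by its similarity to the search term.
    public static double scoreWithOwnIdf(int termDocFreq, int numDocs, double similarity) {
        return similarity * idf(termDocFreq, numDocs);
    }

    // Simplified model of the proposed default: every expanded term is
    // weighted by the idf of the SEARCH term, scaled by its similarity.
    public static double scoreWithQueryTermIdf(int queryTermDocFreq, int numDocs, double similarity) {
        return similarity * idf(queryTermDocFreq, numDocs);
    }

    public static void main(String[] args) {
        // Hypothetical index statistics (assumptions, not measured data).
        int numDocs = 100_000;
        int dfSmith = 5_000;   // the exact term is common
        int dfSmiith = 2;      // the misspelling is rare
        double simExact = 1.0;
        double simFuzzy = 0.8; // hypothetical similarity of "smiith" to "smith"

        System.out.println("own idf:        smith=" + scoreWithOwnIdf(dfSmith, numDocs, simExact)
                + " smiith=" + scoreWithOwnIdf(dfSmiith, numDocs, simFuzzy));
        System.out.println("query-term idf: smith=" + scoreWithQueryTermIdf(dfSmith, numDocs, simExact)
                + " smiith=" + scoreWithQueryTermIdf(dfSmith, numDocs, simFuzzy));
    }
}
```

Under these assumed numbers, the rare misspelling outscores the exact term when each term uses its own idf, because its idf is much larger than the similarity penalty. When both terms share the idf of the search term, the similarity factor alone decides the order and the exact match comes first, while scores still vary with the rarity of the search term, unlike a constant score.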
                
> FuzzyQuery - fuzzy terms and misspellings are ranked higher than exact matches
> ------------------------------------------------------------------------------
>
>                 Key: LUCENE-2557
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2557
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/query/scoring
>    Affects Versions: 3.0.2
>            Reporter: Jingkei Ly
>         Attachments: LUCENE-2557.patch, idf-scoring-test-case.patch
>
>
> The FuzzyQuery often causes misspellings to be ranked higher than the exact match, which
> seems to be an undesirable property generally.
> For example, in an index of surnames, if I search using a FuzzyQuery for "smith", the
> misspellings such as "smiith", or "smiht" would appear near the top of the search results
> ahead of documents that match "smith".
