commons-issues mailing list archives

From "Pascal Schumacher (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (LANG-1199) Fix implementation of StringUtils.getJaroWinklerDistance()
Date Sun, 05 Jun 2016 15:40:59 GMT

     [ https://issues.apache.org/jira/browse/LANG-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pascal Schumacher resolved LANG-1199.
-------------------------------------
       Resolution: Fixed
         Assignee: Pascal Schumacher
    Fix Version/s: 3.5

Fixed by replacing the current implementation with the one from Apache Lucene. Thanks for reporting.

> Fix implementation of StringUtils.getJaroWinklerDistance()
> ----------------------------------------------------------
>
>                 Key: LANG-1199
>                 URL: https://issues.apache.org/jira/browse/LANG-1199
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 3.4
>            Reporter: M. Steiger
>            Assignee: Pascal Schumacher
>             Fix For: 3.5
>
>
> The current implementation of StringUtils.getJaroWinklerDistance() does not compute the
> correct result in some cases. See #LANG-944 for the initial code contribution.
> StringUtils.getJaroWinklerDistance("Haus Ingeborg", "Ingeborg Esser") == 0.0
> This is due to the incorrect computation of common characters, which causes the algorithm
> to exit prematurely.
> In contrast, the implementation in Lucene gives ~0.63, which is about right.
>     JaroWinklerDistance d = new JaroWinklerDistance();
>     d.getDistance("Haus Ingeborg", "Ingeborg Esser");
> See https://lucene.apache.org/core/3_0_3/api/contrib-spellchecker/org/apache/lucene/search/spell/JaroWinklerDistance.html
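For readers who want to check the numbers in the description above, here is a minimal
sketch of a textbook Jaro-Winkler similarity in plain Java. It is not the Commons Lang
code and not a copy of the Lucene class. The class name JaroWinklerSketch, the method
name similarity, the fixed prefix scale of 0.1, the 0.7 boost threshold, and the choice
to return 1.0 for two empty strings are all assumptions made for illustration.

```java
// Illustrative sketch only; names and constants are assumptions, not the
// Commons Lang or Lucene implementation.
final class JaroWinklerSketch {

    /** Returns a similarity in [0, 1]; higher means more similar. */
    static double similarity(String a, String b) {
        String shorter = a.length() <= b.length() ? a : b;
        String longer = a.length() <= b.length() ? b : a;
        if (longer.isEmpty()) {
            return 1.0; // assumption: two empty strings count as identical
        }

        // Characters only count as common if they lie within this window.
        int window = Math.max(longer.length() / 2 - 1, 0);
        boolean[] shortMatched = new boolean[shorter.length()];
        boolean[] longMatched = new boolean[longer.length()];
        int matches = 0;
        for (int i = 0; i < shorter.length(); i++) {
            int from = Math.max(i - window, 0);
            int to = Math.min(i + window + 1, longer.length());
            for (int j = from; j < to; j++) {
                if (!longMatched[j] && shorter.charAt(i) == longer.charAt(j)) {
                    shortMatched[i] = true;
                    longMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }

        // Walk both strings over their matched characters, in order, and
        // count positions where the matched characters differ.
        int mismatched = 0;
        int j = 0;
        for (int i = 0; i < shorter.length(); i++) {
            if (!shortMatched[i]) {
                continue;
            }
            while (!longMatched[j]) {
                j++;
            }
            if (shorter.charAt(i) != longer.charAt(j)) {
                mismatched++;
            }
            j++;
        }
        int transpositions = mismatched / 2;

        double m = matches;
        double jaro = (m / shorter.length()
                + m / longer.length()
                + (m - transpositions) / m) / 3.0;

        // Winkler boost: reward a common prefix of up to four characters,
        // applied only when the Jaro score is already reasonably high.
        int prefix = 0;
        int maxPrefix = Math.min(4, shorter.length());
        while (prefix < maxPrefix && shorter.charAt(prefix) == longer.charAt(prefix)) {
            prefix++;
        }
        return jaro < 0.7 ? jaro : jaro + 0.1 * prefix * (1.0 - jaro);
    }

    public static void main(String[] args) {
        System.out.println(similarity("Haus Ingeborg", "Ingeborg Esser"));
    }
}
```

For the pair from the report, the sketch finds nine common characters (the space and
"Ingeborg") and no shared prefix, so the result should come out at roughly 0.63, in line
with the Lucene value quoted above rather than 0.0.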



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
