commons-issues mailing list archives

From "Rob Tompkins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEXT-104) Jaro Winkler Distance refers to similarity
Date Fri, 08 Dec 2017 19:19:00 GMT

    [ https://issues.apache.org/jira/browse/TEXT-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284069#comment-16284069 ]

Rob Tompkins commented on TEXT-104:
-----------------------------------

The conventional name for this string similarity score in the literature is the "Jaro-Winkler
Distance", even though it does not satisfy the triangle inequality. Unfortunately, that puts
the misnomer outside of our control.

That said, what are your thoughts here? Should we name the class {{JaroWinklerDistanceSimilarityScore}}?
I'm definitely open to suggestions on how to convey more clearly that the "Jaro-Winkler
Distance" is not a metric in the mathematical sense.
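
To make the similarity/distance distinction concrete, here is a minimal, self-contained sketch.
It is not the Commons Text implementation: the class name {{JaroWinklerSketch}}, its method
names, the empty-string conventions, and the 0.7 boost threshold are assumptions chosen for
illustration. {{similarity}} returns a score where 1.0 means identical, and {{distance}} is
simply 1 - similarity, which is still not a mathematical metric.

```java
// Illustrative sketch only; names and conventions here are hypothetical,
// not the Commons Text API.
final class JaroWinklerSketch {
    private static final double SCALING = 0.1;         // common prefix scaling factor
    private static final double BOOST_THRESHOLD = 0.7; // assumed: boost only above this
    private static final int MAX_PREFIX = 4;           // prefix length is capped at 4

    // Plain Jaro similarity in [0, 1].
    static double jaro(String a, String b) {
        int la = a.length(), lb = b.length();
        if (la == 0 && lb == 0) return 1.0; // assumed convention: two empty strings match
        if (la == 0 || lb == 0) return 0.0;
        int window = Math.max(0, Math.max(la, lb) / 2 - 1);
        boolean[] matchedA = new boolean[la];
        boolean[] matchedB = new boolean[lb];
        int matches = 0;
        for (int i = 0; i < la; i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(lb - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!matchedB[j] && a.charAt(i) == b.charAt(j)) {
                    matchedA[i] = true;
                    matchedB[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Count matched characters that appear in a different order.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < la; i++) {
            if (!matchedA[i]) continue;
            while (!matchedB[j]) j++;
            if (a.charAt(i) != b.charAt(j)) outOfOrder++;
            j++;
        }
        double m = matches;
        double t = outOfOrder / 2; // transpositions (integer half of out-of-order count)
        return (m / la + m / lb + (m - t) / m) / 3.0;
    }

    // Jaro-Winkler similarity: Jaro plus a bonus for a shared prefix.
    static double similarity(String a, String b) {
        double j = jaro(a, b);
        if (j < BOOST_THRESHOLD) return j;
        int limit = Math.min(MAX_PREFIX, Math.min(a.length(), b.length()));
        int prefix = 0;
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) prefix++;
        return j + prefix * SCALING * (1.0 - j);
    }

    // Distance-like value: 0.0 for identical strings, larger for less similar ones.
    static double distance(String a, String b) {
        return 1.0 - similarity(a, b);
    }
}
```

With this sketch, {{JaroWinklerSketch.similarity("MARTHA", "MARHTA")}} is roughly 0.961, while
{{JaroWinklerSketch.distance("MARTHA", "MARHTA")}} is roughly 0.039. A method named after
"distance" that returns the first number is what the reporter found confusing.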

> Jaro Winkler Distance refers to similarity
> ------------------------------------------
>
>                 Key: TEXT-104
>                 URL: https://issues.apache.org/jira/browse/TEXT-104
>             Project: Commons Text
>          Issue Type: Improvement
>    Affects Versions: 1.1
>            Reporter: Nikos Karagiannakis
>            Priority: Trivial
>
> The 'apply' method returns the similarity score instead of the distance score implied
> by the class name.
> This is stated in the javadoc, but it is not aligned with the approach of the other similarity
> scores in the same package (e.g. LevenshteinDetailedDistance).
> Maybe rename the class or the method to avoid confusion?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
