commons-issues mailing list archives

From "Rob Tompkins (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (TEXT-14) Create a generic class that calculates a distance based on a similarity score
Date Fri, 30 Dec 2016 15:43:58 GMT

    [ https://issues.apache.org/jira/browse/TEXT-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15787768#comment-15787768 ]

Rob Tompkins edited comment on TEXT-14 at 12/30/16 3:43 PM:
------------------------------------------------------------

I've got an idea here. 

{code}
Integer distance(String s1, String s2) {
     return s1.length() + s2.length() - 2*similarityScore(s1, s2);
}
{code}

Note that this assumes the similarity score counts the number of similar characters between
the two strings. For a similarity score that returns a Double, we might need something
different, such as a distance that is normalized to lie between 0 and 1, so that we
have the notion of a percentage difference or something like that.
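
A minimal sketch of the idea above, assuming the similarity score is the length of the longest common subsequence (one possible integer score, chosen here only for illustration). The class name `SimilarityDistanceSketch`, the `lcsLength` helper, and the `normalizedDistance` method are hypothetical and not existing Commons Text API. Dividing by the combined length is just one possible way to get a value between 0 and 1.

```java
import java.util.function.ToIntBiFunction;

// Hypothetical sketch; names are illustrative, not Commons Text API.
final class SimilarityDistanceSketch {

    // Assumed similarity score: length of the longest common subsequence,
    // i.e. the number of characters the two strings share in order.
    static int lcsLength(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.length()][b.length()];
    }

    // The formula from the comment, with the similarity score passed in.
    static int distance(String s1, String s2, ToIntBiFunction<String, String> score) {
        return s1.length() + s2.length() - 2 * score.applyAsInt(s1, s2);
    }

    // Assumption: scale by the combined length so the result lies in [0, 1],
    // provided the score never exceeds the length of the shorter string.
    static double normalizedDistance(String s1, String s2, ToIntBiFunction<String, String> score) {
        int total = s1.length() + s2.length();
        if (total == 0) {
            return 0.0; // two empty strings are treated as identical
        }
        return (double) distance(s1, s2, score) / total;
    }
}
```

With this score, `distance("kitten", "sitting", SimilarityDistanceSketch::lcsLength)` gives 6 + 7 - 2*4 = 5, and identical strings give 0.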


was (Author: chtompki):
I've got an idea here. 

{code}
Double distance(String s1, String s2) {
     return s1.length() + s2.length() - 2*similarityScore(s1, s2);
}
{code}

> Create a generic class that calculates a distance based on a similarity score
> -----------------------------------------------------------------------------
>
>                 Key: TEXT-14
>                 URL: https://issues.apache.org/jira/browse/TEXT-14
>             Project: Commons Text
>          Issue Type: Improvement
>            Reporter: Bruno P. Kinoshita
>            Priority: Minor
>              Labels: features, idea
>             Fix For: 1.x
>
>
> From http://markmail.org/message/lkqcrm3f3qbu5heu
> Seems like an interesting idea. Worth spending some time to investigate it later. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
