commons-issues mailing list archives

From "Rob Tompkins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEXT-32) We want a wider variety of edit distances/similarity scores.
Date Mon, 19 Dec 2016 15:04:58 GMT

    [ https://issues.apache.org/jira/browse/TEXT-32?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15761359#comment-15761359 ]

Rob Tompkins commented on TEXT-32:
----------------------------------

Based on this mailing list message:

http://markmail.org/message/gjhgz2udjbt3o7ih?q=%5Btext%5D%5BTEXT-32%5D

we should consider:

# LongestCommonSubstring (or subword).
# Biojava
# Talend

with the following reference:

https://github.com/tdebatty/java-string-similarity
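
To illustrate the first item, here is a minimal sketch of a longest common substring computation using the standard dynamic-programming approach. The class name LongestCommonSubstringSketch and its method are hypothetical. They are not part of Commons Text and are not taken from the referenced library.

```java
/**
 * Hypothetical sketch, not an existing Commons Text class.
 */
public final class LongestCommonSubstringSketch {

    private LongestCommonSubstringSketch() {
    }

    /**
     * Returns the first longest contiguous run of characters that appears
     * in both inputs, or an empty string if they share no characters.
     */
    public static String longestCommonSubstring(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        int bestLength = 0;
        int bestEndInLeft = 0;
        // previous[j] holds the length of the longest common suffix of
        // left[0, i-1) and right[0, j).
        int[] previous = new int[right.length() + 1];
        for (int i = 1; i <= left.length(); i++) {
            int[] current = new int[right.length() + 1];
            for (int j = 1; j <= right.length(); j++) {
                if (left.charAt(i - 1) == right.charAt(j - 1)) {
                    current[j] = previous[j - 1] + 1;
                    if (current[j] > bestLength) {
                        bestLength = current[j];
                        bestEndInLeft = i;
                    }
                }
            }
            previous = current;
        }
        return left.subSequence(bestEndInLeft - bestLength, bestEndInLeft).toString();
    }
}
```

This runs in O(n*m) time with two rows of O(m) extra space. A similarity score or distance could be derived from the length of the result, but how to normalize it is a design choice left open here.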

> We want a wider variety of edit distances/similarity scores.
> ------------------------------------------------------------
>
>                 Key: TEXT-32
>                 URL: https://issues.apache.org/jira/browse/TEXT-32
>             Project: Commons Text
>          Issue Type: New Feature
>            Reporter: Rob Tompkins
>            Assignee: Rob Tompkins
>
> Currently we have:
> {code}
> CosineDistance.java
> CosineSimilarity.java
> FuzzyScore.java
> HammingDistance.java
> JaccardDistance.java
> JaccardSimilarity.java
> JaroWinklerDistance.java
> LevenshteinDetailedDistance.java
> LevenshteinDistance.java
> LevenshteinResults.java
> {code}
> We wish to have a larger list of edit distances than this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
