commons-issues mailing list archives

From "Jonathan Baker (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-1011) Create a new class StringDistance as host for the getXXDistance methods in StringUtils
Date Sat, 28 Feb 2015 15:01:04 GMT

    [ https://issues.apache.org/jira/browse/LANG-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341562#comment-14341562 ]

Jonathan Baker commented on LANG-1011:
--------------------------------------

1. Is org.apache.commons.lang3.text.StringDistances a good place to move these functions?

2. Should the corresponding changes also be made in the 2.x version?  The [release plan](https://issues.apache.org/jira/browse/LANG?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel#pd-desc)
says no, but please confirm.

3. Would it make sense (maybe for Lang 4, since Java 8 is required) to create a StringDistance<DISTANCE>
interface that extends [BiFunction<CharSequence, CharSequence, DISTANCE>](http://docs.oracle.com/javase/8/docs/api/java/util/function/BiFunction.html)?

    // For example:

    import java.util.function.BiFunction;

    public interface StringDistance<DISTANCE> extends BiFunction<CharSequence, CharSequence, DISTANCE> {

        public DISTANCE apply( CharSequence t, CharSequence u );

    }

    public class LevenshteinDistance implements StringDistance<Integer> {

        private final Integer threshold;

        public LevenshteinDistance() { ... }

        public LevenshteinDistance( final int threshold ) { ... }

        public Integer apply( CharSequence t, CharSequence u ) {
            // Would two Levenshtein classes be better than the null check?
            if (threshold == null) {
                return getDistance( t, u );
            } else {
                return getDistance( t, u, threshold );
            }
        }

        public static Integer getDistance( CharSequence t, CharSequence u ) { ... }

        public static Integer getDistance( CharSequence t, CharSequence u, int threshold ) { ... }

    }
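
To make the idea in point 3 concrete, here is a minimal, compilable sketch of the proposed shape. Everything in it is an assumption for illustration: the `StringDistance` interface does not exist in any released Commons Lang version, the type parameter is shortened to `D`, and the threshold handling simply computes the full distance and returns -1 when it exceeds the threshold (mirroring the existing `StringUtils.getLevenshteinDistance(s, t, threshold)` convention) rather than using the optimized bounded algorithm.

```java
import java.util.function.BiFunction;

/** Hypothetical interface from the proposal; not an existing Commons Lang API. */
interface StringDistance<D> extends BiFunction<CharSequence, CharSequence, D> {
    @Override
    D apply(CharSequence left, CharSequence right);
}

/** Sketch of a Levenshtein implementation with an optional threshold. */
final class LevenshteinDistance implements StringDistance<Integer> {

    // null means "no threshold", as in the null check proposed above.
    private final Integer threshold;

    LevenshteinDistance() {
        this.threshold = null;
    }

    LevenshteinDistance(final int threshold) {
        if (threshold < 0) {
            throw new IllegalArgumentException("Threshold must not be negative");
        }
        this.threshold = threshold;
    }

    @Override
    public Integer apply(final CharSequence left, final CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Strings must not be null");
        }
        final int distance = unlimitedDistance(left, right);
        // Simplification: compute the full distance, then compare to the threshold.
        if (threshold != null && distance > threshold) {
            return -1;
        }
        return distance;
    }

    /** Classic two-row dynamic-programming edit distance. */
    private static int unlimitedDistance(final CharSequence s, final CharSequence t) {
        int[] previous = new int[t.length() + 1];
        int[] current = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            previous[j] = j; // distance from the empty prefix of s
        }
        for (int i = 1; i <= s.length(); i++) {
            current[0] = i; // distance to the empty prefix of t
            for (int j = 1; j <= t.length(); j++) {
                final int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            final int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[t.length()];
    }
}
```

Because the interface extends `BiFunction`, an instance can be passed anywhere a `BiFunction<CharSequence, CharSequence, Integer>` is expected and composed with `andThen`, for example `new LevenshteinDistance().andThen(d -> d == 0)` as an equality check. Splitting the class in two (bounded and unbounded) would remove the nullable field at the cost of a second public type.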

> Create a new class StringDistance as host for the getXXDistance methods in StringUtils
> --------------------------------------------------------------------------------------
>
>                 Key: LANG-1011
>                 URL: https://issues.apache.org/jira/browse/LANG-1011
>             Project: Commons Lang
>          Issue Type: New Feature
>          Components: lang.*
>            Reporter: Benedikt Ritter
>            Assignee: Benedikt Ritter
>             Fix For: 3.4
>
>
> We're getting more and more algorithms that calculate distances between strings, so it makes sense to create a new class for this kind of logic.
> Deprecate getLevenshteinDistance and getJaroWinklerDistance and delegate to the new class. If the new class is implemented in 3.4, move getFuzzyDistance (it has not yet been released).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
