commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Henri Yandell (JIRA)" <>
Subject [jira] [Updated] (LANG-684) Levenshtein Distance Within a Given Threshold
Date Sat, 04 Jun 2011 03:19:48 GMT


Henri Yandell updated LANG-684:

    Fix Version/s:     (was: 3.x)

I'm not against moving LD to Codec. It used to live with Soundex etc way back in the early
days of the codebases, but LD didn't really fit Codec and stayed with Lang. We need to decide
before 3.0; and ideally we would release a version in Codec fairly soon.

> Levenshtein Distance Within a Given Threshold
> ---------------------------------------------
>                 Key: LANG-684
>                 URL:
>             Project: Commons Lang
>          Issue Type: New Feature
>          Components: lang.*
>            Reporter: Eli Lindsey
>            Priority: Minor
>             Fix For: 3.0
>         Attachments: LevenshteinDistanceWithThreshold.patch
> It'd be nice to have a function that calculates the Levenshtein distance only if it's
within some integer threshold.  
> Oftentimes you care less about the actual LD and more about it being within a certain
range.  This common, limited computation can be performed much faster than the normal unbounded
LD method; instead of O(nm), you can do it in O(km) (n and m are string lengths, k is the
> Also, providing a function like this makes it easier for library users to rewrite the
unbounded Levenshtein function to run in O(dm) time (d is the edit distance) if necessary.
> I'm attaching a patch that implements this function and adds appropriate test cases.

This message is automatically generated by JIRA.
For more information on JIRA, see:

View raw message