lucene-dev mailing list archives

From Cédrik LIME (JIRA) <>
Subject [jira] Commented: (LUCENE-1183) TRStringDistance uses way too much memory (with patch)
Date Thu, 21 Feb 2008 15:07:22 GMT


Cédrik LIME commented on LUCENE-1183:

Well spotted, Karl! My version is very similar to LUCENE-691, except I kept some smallish optimisations
out for the sake of readability. I'll incorporate some of his changes/ideas and publish a
new patch.
Can someone link those 2 issues together in the meantime? (There are too many options in the
drop-down; don't know which one to choose.)

> TRStringDistance uses way too much memory (with patch)
> ------------------------------------------------------
>                 Key: LUCENE-1183
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/*
>    Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3
>            Reporter: Cédrik LIME
>            Priority: Minor
>         Attachments: FuzzyTermEnum.patch,, TRStringDistance.patch
>   Original Estimate: 0.17h
>  Remaining Estimate: 0.17h
> The implementation of TRStringDistance is based on version 2.1 of org.apache.commons.lang.StringUtils#getLevenshteinDistance(String,
String), which uses an un-optimized implementation of the Levenshtein Distance algorithm (it
uses way too much memory). Please see Bug 38911 (
for more information.
> The commons-lang implementation has been heavily optimized as of version 2.2 (3x speed-up).
I have ported the new implementation to TRStringDistance.
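
For readers unfamiliar with the memory issue described above, here is a minimal sketch of the general idea: keep only two rows of the Levenshtein matrix instead of the full table. This is an illustration only, not the attached patch and not the commons-lang source; the class name LevenshteinSketch and the method name distance are hypothetical.

```java
// Illustrative sketch only: NOT the attached patch nor the commons-lang code.
// The class and method names are hypothetical.
public final class LevenshteinSketch {
    private LevenshteinSketch() {}

    /** Edit distance using two rolling rows instead of a full (n+1) x (m+1) matrix. */
    public static int distance(String s, String t) {
        // Keep the shorter string along the row so the arrays stay as small as possible.
        if (s.length() > t.length()) {
            String tmp = s;
            s = t;
            t = tmp;
        }
        int n = s.length();
        int m = t.length();
        if (n == 0) {
            return m; // distance to an empty string is the other string's length
        }

        int[] prev = new int[n + 1]; // costs for the previous character of t
        int[] curr = new int[n + 1]; // costs for the current character of t
        for (int i = 0; i <= n; i++) {
            prev[i] = i;
        }

        for (int j = 1; j <= m; j++) {
            char tj = t.charAt(j - 1);
            curr[0] = j;
            for (int i = 1; i <= n; i++) {
                int cost = (s.charAt(i - 1) == tj) ? 0 : 1;
                // minimum of insertion, deletion and substitution
                curr[i] = Math.min(Math.min(curr[i - 1] + 1, prev[i] + 1), prev[i - 1] + cost);
            }
            // Swap the rows rather than allocating new arrays on each pass.
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[n];
    }
}
```

With this approach the extra memory grows with the length of the shorter string only, rather than with the product of both lengths. For example, LevenshteinSketch.distance("kitten", "sitting") returns 3.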

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
