commons-dev mailing list archives

From sebb <seb...@gmail.com>
Subject Re: [jira] [Closed] (CODEC-107) Enhance documentation for Language Encoders
Date Wed, 30 Mar 2011 03:20:17 GMT
On 30 March 2011 03:48, Gary D. Gregory (JIRA) <jira@apache.org> wrote:
>
>     [ https://issues.apache.org/jira/browse/CODEC-107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Gary D. Gregory closed CODEC-107.
> ---------------------------------
>
>    Resolution: Won't Fix

In that case, the "Fix for" version should be removed, no?

>
>> Enhance documentation for Language Encoders
>> -------------------------------------------
>>
>>                 Key: CODEC-107
>>                 URL: https://issues.apache.org/jira/browse/CODEC-107
>>             Project: Commons Codec
>>          Issue Type: Improvement
>>    Affects Versions: 1.4
>>            Reporter: Marc Pompl
>>            Priority: Minor
>>             Fix For: 1.5
>>
>>   Original Estimate: 1h
>>  Remaining Estimate: 1h
>>
>> The current user guide (http://commons.apache.org/codec/userguide.html) lists only four Language Encoders, but there are currently five. CODEC-106 implements a sixth.
>> It would be a good idea to complete the documentation.
>> Additionally, I suggest extending the user guide to show a simple performance measurement:
>> _SNIP_
>> org.apache.commons.codec.language.Metaphone encodings per msec: 327
>> org.apache.commons.codec.language.DoubleMetaphone encodings per msec: 224
>> org.apache.commons.codec.language.Soundex encodings per msec: 904
>> org.apache.commons.codec.language.RefinedSoundex encodings per msec: 637
>> org.apache.commons.codec.language.Caverphone encodings per msec: 5
>> org.apache.commons.codec.language.ColognePhonetic encodings per msec: 289
>> So, Soundex is the fastest encoder. Caverphone is much slower than any other algorithm. The others fall within roughly a factor of three of one another (224 to 637 encodings per msec).
>> Checked with the following code:
>> {code:java}
>>   private static final int REPEATS = 1000000;
>>   public void checkSpeed() throws Exception {
>>         checkSpeedEncoding(new Metaphone(), "easgasg", REPEATS);
>>         checkSpeedEncoding(new DoubleMetaphone(), "easgasg", REPEATS);
>>         checkSpeedEncoding(new Soundex(), "easgasg", REPEATS);
>>         checkSpeedEncoding(new RefinedSoundex(), "easgasg", REPEATS);
>>         checkSpeedEncoding(new Caverphone(), "Carlene", 100000);
>>         checkSpeedEncoding(new ColognePhonetic(), "Schmitt", REPEATS);
>>   }
>>
>>   private void checkSpeedEncoding(Encoder encoder, String toBeEncoded, int repeats) throws Exception {
>>         long start = System.currentTimeMillis();
>>         for ( int i=0; i<repeats; i++) {
>>                   encoder.encode(toBeEncoded);
>>         }
>>         long duration = System.currentTimeMillis()-start;
>>         System.out.println(encoder.getClass().getName() + " encodings per msec: " + (repeats/duration));
>>   }
>> {code}
>> _SNAP_
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
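
For reference, the measurement quoted above could be sketched with a warm-up pass and System.nanoTime(), which also avoids a division by zero when a run finishes in under one millisecond. This is only an illustration under stated assumptions: the class name EncoderTiming, the method encodingsPerMillisecond and the stand-in toUpperCase "encoder" are hypothetical and not part of Commons Codec. With Codec on the classpath, something like new Soundex()::encode should be usable in place of the stand-in.

```java
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Commons Codec.
final class EncoderTiming {

    // Written on every call so the JIT is less likely to discard the encoder calls as dead code.
    private static volatile int sink;

    /** Returns the number of encodings per millisecond for the given function. */
    public static double encodingsPerMillisecond(UnaryOperator<String> encoder,
                                                 String input, int repeats) {
        // Warm-up pass: give the JIT a chance to compile the hot path before timing.
        for (int i = 0; i < repeats; i++) {
            sink = encoder.apply(input).length();
        }

        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) {
            sink = encoder.apply(input).length();
        }
        // Clamp to at least 1 ns so the division below can never be by zero.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);

        return repeats / (elapsedNanos / 1_000_000.0);
    }

    public static void main(String[] args) {
        // Stand-in for a real encoder (assumption for this sketch).
        UnaryOperator<String> standIn = s -> s.toUpperCase();
        double rate = encodingsPerMillisecond(standIn, "easgasg", 1_000_000);
        System.out.printf("stand-in encodings per msec: %.1f%n", rate);
    }
}
```

Numbers from a loop like this remain rough; a harness such as JMH would give more trustworthy figures.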

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

