Subject: svn commit: r789567 - /commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/LookupTranslator.java
Date: Tue, 30 Jun 2009 05:47:40 -0000
To: commits@commons.apache.org
From: bayard@apache.org
Reply-To: dev@commons.apache.org
Message-Id: <20090630054740.1FDFC23888CC@eris.apache.org>

Author: bayard
Date: Tue Jun 30 05:47:39 2009
New Revision: 789567

URL: http://svn.apache.org/viewvc?rev=789567&view=rev
Log:
Performance improvement. Switching from looping through a doubled array to using a Map.
This probably costs more for simple cases like Java/EcmaScript/Xml, but makes up for it in the Html case. This gets performance of the testUnescapeHexCharsHtml method back down to near the same region as the original code.

Modified:
    commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/LookupTranslator.java

Modified: commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/LookupTranslator.java
URL: http://svn.apache.org/viewvc/commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/LookupTranslator.java?rev=789567&r1=789566&r2=789567&view=diff
==============================================================================
--- commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/LookupTranslator.java (original)
+++ commons/proper/lang/trunk/src/java/org/apache/commons/lang/text/translate/LookupTranslator.java Tue Jun 30 05:47:39 2009
@@ -18,14 +18,18 @@
 
 import java.io.IOException;
 import java.io.Writer;
+import java.util.HashMap;
 
 /**
  * Translates a value using a lookup table.
  * @since 3.0
  */
+// TODO: Replace with a RegexLookup? Performance test.
 public class LookupTranslator extends CharSequenceTranslator {
 
-    protected CharSequence[][] lookup;
+    private HashMap<CharSequence, CharSequence> lookupMap;
+    private int shortest = Integer.MAX_VALUE;
+    private int longest = 0;
 
     /**
      * Define the lookup table to be used in translation
@@ -33,18 +37,34 @@
      * @param CharSequence[][] Lookup table of size [*][2]
      */
     public LookupTranslator(CharSequence[][] lookup) {
-        this.lookup = lookup;
+        lookupMap = new HashMap<CharSequence, CharSequence>();
+        for(CharSequence[] seq : lookup) {
+            this.lookupMap.put(seq[0], seq[1]);
+            int sz = seq[0].length();
+            if(sz < shortest) {
+                shortest = sz;
+            }
+            if(sz > longest) {
+                longest = sz;
+            }
+        }
     }
 
     /**
      * {@inheritDoc}
      */
     public int translate(CharSequence input, int index, Writer out) throws IOException {
-        CharSequence subsequence = input.subSequence(index, input.length());
-        for(CharSequence[] seq : lookup) {
-            if( subsequence.toString().startsWith(seq[0].toString()) ) {
-                out.write(seq[1].toString());
-                return seq[0].length();
+        int max = longest;
+        if(index + longest > input.length()) {
+            max = input.length() - index;
+        }
+        // descend so as to get a greedy algorithm
+        for(int i=max; i >= shortest; i--) {
+            CharSequence subSeq = input.subSequence(index, index + i);
+            CharSequence result = lookupMap.get(subSeq);
+            if(result != null) {
+                out.write(result.toString());
+                return i;
             }
         }
         return 0;
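
For readers who want to try the greedy, longest-match-first lookup outside the Commons Lang tree, the following is a minimal stand-alone sketch of the approach in the diff above. It is not the committed class: the names GreedyLookupSketch and translateAll are hypothetical, the driver loop is an assumption about how a caller might advance through the input, and keys are stored as String (an illustrative choice, since CharSequence does not define equals/hashCode semantics for map lookups).

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone sketch; not the Commons Lang LookupTranslator. */
public class GreedyLookupSketch {
    private final Map<String, String> lookupMap = new HashMap<>();
    private int shortest = Integer.MAX_VALUE;
    private int longest = 0;

    /** Builds the map from a [*][2] table and records the key length bounds. */
    public GreedyLookupSketch(String[][] lookup) {
        for (String[] pair : lookup) {
            lookupMap.put(pair[0], pair[1]);
            int sz = pair[0].length();
            shortest = Math.min(shortest, sz);
            longest = Math.max(longest, sz);
        }
    }

    /**
     * Writes the replacement for the longest key that starts at index.
     * Returns the number of input characters consumed, or 0 if no key matches.
     */
    public int translate(CharSequence input, int index, Writer out) throws IOException {
        // Never look past the end of the input.
        int max = Math.min(longest, input.length() - index);
        // Descend from the longest candidate so the match is greedy.
        for (int i = max; i >= shortest; i--) {
            // Assumption of this sketch: convert to String before the lookup.
            String candidate = input.subSequence(index, index + i).toString();
            String result = lookupMap.get(candidate);
            if (result != null) {
                out.write(result);
                return i;
            }
        }
        return 0;
    }

    /** Hypothetical driver: unmatched characters are copied through unchanged. */
    public String translateAll(CharSequence input) {
        StringWriter out = new StringWriter();
        try {
            int pos = 0;
            while (pos < input.length()) {
                int consumed = translate(input, pos, out);
                if (consumed == 0) {
                    out.write(input.charAt(pos));
                    pos++;
                } else {
                    pos += consumed;
                }
            }
        } catch (IOException e) {
            // StringWriter does not perform real I/O; rethrow unchecked just in case.
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

With a table containing both "&" and "&amp;" as keys, the descending loop tries the five-character candidate first, so "&amp;" wins over the bare "&" wherever both could match. Each position costs at most (longest - shortest + 1) map lookups, rather than one startsWith comparison per table entry as in the removed code.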