Date: Fri, 13 Mar 2015 14:21:38 +0000 (UTC)
From: "Fabian Lange (JIRA)"
To: issues@commons.apache.org
Subject: [jira] [Commented] (LANG-935) Possible performance improvement on string escape functions

[ https://issues.apache.org/jira/browse/LANG-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360406#comment-14360406 ]

Fabian Lange commented on LANG-935:
-----------------------------------

My PR adds a mapping keyed by the first char of each translation match. It improves the situation a lot. Further optimization could be done by building a complete match tree. If that is desired, my PR now provides a sufficient base for that.
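To illustrate the idea of indexing by the first char, here is a minimal sketch. It is not the code from the PR: the class name PrefixIndexedLookup, its fields, and its constructor signature are hypothetical and chosen only for this example. The sketch assumes that translations are given as key/replacement pairs and that the longest matching key should win.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the class or API from the PR.
final class PrefixIndexedLookup {
    private final Map<String, String> lookup = new HashMap<>();
    // First chars of all keys: a cheap test that rules out most input positions.
    private final Set<Character> firstChars = new HashSet<>();
    private int shortest = Integer.MAX_VALUE;
    private int longest = 0;

    PrefixIndexedLookup(String[][] pairs) {
        for (String[] pair : pairs) {
            String key = pair[0];
            if (key.isEmpty()) {
                continue; // an empty key can never match a char position
            }
            lookup.put(key, pair[1]);
            firstChars.add(key.charAt(0));
            shortest = Math.min(shortest, key.length());
            longest = Math.max(longest, key.length());
        }
    }

    String translate(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int consumed = 0;
            // Only build substrings when the current char can start a key.
            if (firstChars.contains(input.charAt(i))) {
                int max = Math.min(longest, input.length() - i);
                // Try the longest candidate first so the longest key wins.
                for (int len = max; len >= shortest; len--) {
                    String replacement = lookup.get(input.substring(i, i + len));
                    if (replacement != null) {
                        out.append(replacement);
                        consumed = len;
                        break;
                    }
                }
            }
            if (consumed == 0) {
                out.append(input.charAt(i)); // no key starts here; copy the char
                consumed = 1;
            }
            i += consumed;
        }
        return out.toString();
    }
}
```

With the pairs {"<", "&lt;"} and {"&", "&amp;"}, translate("a < b & c") yields "a &lt; b &amp; c". For all other positions the substring lookups are skipped entirely. A complete match tree, as mentioned above, would go further by also avoiding the substring allocations at positions that do pass the first-char test.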
> Possible performance improvement on string escape functions
> -----------------------------------------------------------
>
>                 Key: LANG-935
>                 URL: https://issues.apache.org/jira/browse/LANG-935
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.text.translate.*
>    Affects Versions: 3.1
>            Reporter: Peter Wall
>            Priority: Minor
>              Labels: performance
>             Fix For: Patch Needed
>
>         Attachments: tempproject1.zip
>
>
> The escape functions for HTML etc. use the same code and the same initialisation tables for the escape and unescape functions, and while this is an elegant approach it leads to a number of deficiencies:
> 1. The code is very much less efficient than it could be
> 2. A new output string is created even when no conversion is required
> 3. No mapping is provided for characters that do not have a specific representation (for example HTML 0x101 should become &#x101;)
> The proposal is to use a new mapping technique to address these issues

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)