commons-issues mailing list archives

From "Henri Yandell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LANG-293) StringEscapeUtils.unescape* can be faster
Date Tue, 07 Jul 2009 07:16:14 GMT

    [ https://issues.apache.org/jira/browse/LANG-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727952#action_12727952 ]

Henri Yandell commented on LANG-293:
------------------------------------

This needs to be rethought after the rewrite of the Entities class into text.translate. However,
the idea still holds. One possibility is an optimization for LookupTranslators: they could
optionally define a set of characters whose absence makes them short-circuit, i.e. if there
is no & and ; in the input, the translator short-circuits.

Alternatively, that might be a different translator, an OptimizationUnlessTranslator: if
it can't find the passed-in characters, it passes the whole string through.
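
A minimal sketch of that wrapper idea, not tied to the real text.translate API. The class
name ShortCircuitTranslator, the use of UnaryOperator<String> as the translator type, and
the reading that the delegate is skipped as soon as any one of the listed characters is
missing are all assumptions made for illustration:

```java
import java.util.function.UnaryOperator;

/**
 * Hypothetical sketch: wraps a String-to-String translation and skips it
 * entirely when a required character does not occur in the input.
 */
final class ShortCircuitTranslator implements UnaryOperator<String> {
    private final UnaryOperator<String> delegate;
    private final char[] requiredChars;

    ShortCircuitTranslator(UnaryOperator<String> delegate, char... requiredChars) {
        this.delegate = delegate;
        this.requiredChars = requiredChars.clone();
    }

    @Override
    public String apply(String input) {
        if (input == null) {
            return null;
        }
        for (char c : requiredChars) {
            if (input.indexOf(c) < 0) {
                // Assumption: every listed character is needed for a match,
                // so one missing character means nothing can be translated.
                return input; // same instance, no copying
            }
        }
        return delegate.apply(input);
    }
}
```

For entity unescaping the wrapper would be built with '&' and ';', for example
new ShortCircuitTranslator(unescaper, '&', ';'), where unescaper stands for whatever
translation is being guarded. The scan costs one indexOf per listed character, which is
cheap compared with copying the whole input.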

> StringEscapeUtils.unescape* can be faster
> -----------------------------------------
>
>                 Key: LANG-293
>                 URL: https://issues.apache.org/jira/browse/LANG-293
>             Project: Commons Lang
>          Issue Type: Improvement
>    Affects Versions: Nightly Builds
>            Reporter: Stepan Koltsov
>             Fix For: 3.0
>
>         Attachments: commons-lang-unescape-performace2-stepancheg-2006-10-31.diff, EntitiesPerformance2TestSecret.java
>
>
> A typical string that needs to be unescaped contains almost no XML entities, so copying
> the input string to the output buffer char by char is slow.
> I've refactored Entities.unescape() so that it works faster. I'm going to submit a patch and
> tests.
> The patch contains both the hacked and the original versions of unescape, so the tests can be run.
> Tests show that performance stays the same on short strings and is much better on large strings
> with rare entities.
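
The attached patch is not reproduced here. As a rough illustration of why avoiding the
char-by-char copy helps, the sketch below returns the input untouched when it contains
no '&' and otherwise copies the text between entities in bulk. The class name
BulkCopyUnescape and the four-entry lookup are hypothetical, and numeric entities are
deliberately left out:

```java
final class BulkCopyUnescape {
    // Tiny illustrative table (assumption); a real entity table is much larger.
    private static String lookup(String name) {
        switch (name) {
            case "amp":  return "&";
            case "lt":   return "<";
            case "gt":   return ">";
            case "quot": return "\"";
            default:     return null;
        }
    }

    static String unescape(String s) {
        int amp = s.indexOf('&');
        if (amp < 0) {
            return s; // fast path: nothing to unescape, nothing copied
        }
        StringBuilder out = new StringBuilder(s.length());
        int pos = 0; // start of the text not yet copied to out
        while (amp >= 0) {
            int semi = s.indexOf(';', amp + 1);
            if (semi < 0) {
                break; // no terminator left, so no further entity is possible
            }
            String replacement = lookup(s.substring(amp + 1, semi));
            if (replacement != null) {
                // Copy the whole run before the entity in one call.
                out.append(s, pos, amp).append(replacement);
                pos = semi + 1;
                amp = s.indexOf('&', pos);
            } else {
                // Not a known entity: leave it in place and try the next '&'.
                amp = s.indexOf('&', amp + 1);
            }
        }
        return out.append(s, pos, s.length()).toString();
    }
}
```

With rare entities, nearly all of the work lands in indexOf and the bulk append, which
matches the reported pattern: little difference on short strings, a larger gain on long
ones.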

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

