commons-issues mailing list archives

From "Henri Yandell (JIRA)" <>
Subject [jira] Commented: (LANG-293) StringEscapeUtils.unescape* can be faster
Date Tue, 07 Jul 2009 07:16:14 GMT


Henri Yandell commented on LANG-293:

This needs to be rethought now that the Entities class has been rewritten into text.translate.
However, the idea still holds. One option is an optimization for LookupTranslators: they could
optionally define a set of characters whose absence from the input makes them short-circuit,
e.g. if the input contains no & and no ;, the translator short-circuits.

Alternatively, that might be a different translator, an OptimizationUnlessTranslator: if
it can't find the passed-in characters, it passes the whole string through.
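
A rough sketch of that wrapper idea, to make the discussion concrete. Everything here is
hypothetical: ShortCircuitTranslator is not a Commons Lang class, the delegate is modelled as a
plain UnaryOperator<String> rather than a real translator, and it assumes that a single missing
required character is enough to skip the work (an entity needs both & and ;).

```java
import java.util.function.UnaryOperator;

/**
 * Hypothetical wrapper, not part of Commons Lang. It skips the wrapped
 * translation entirely unless every required character occurs in the input.
 */
final class ShortCircuitTranslator {
    private final UnaryOperator<String> delegate;
    private final char[] required;

    ShortCircuitTranslator(UnaryOperator<String> delegate, char... required) {
        this.delegate = delegate;
        this.required = required.clone();
    }

    String translate(String input) {
        for (char c : required) {
            if (input.indexOf(c) < 0) {
                return input; // a required character is missing: pass through untouched
            }
        }
        return delegate.apply(input);
    }

    public static void main(String[] args) {
        // Stand-in for a real unescaper; it only knows "&amp;".
        ShortCircuitTranslator t =
                new ShortCircuitTranslator(s -> s.replace("&amp;", "&"), '&', ';');
        System.out.println(t.translate("no entities here"));
        System.out.println(t.translate("fish &amp; chips"));
    }
}
```

The check costs one indexOf scan per required character, which should be cheap next to a
char-by-char translation of a long string that contains no entities.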

> StringEscapeUtils.unescape* can be faster
> -----------------------------------------
>                 Key: LANG-293
>                 URL:
>             Project: Commons Lang
>          Issue Type: Improvement
>    Affects Versions: Nightly Builds
>            Reporter: Stepan Koltsov
>             Fix For: 3.0
>         Attachments: commons-lang-unescape-performace2-stepancheg-2006-10-31.diff,
> A typical string that needs to be unescaped contains almost no XML entities, so copying
> the input string to the output buffer char by char is slow.
> I've refactored Entities.unescape() so it works faster. Going to submitting patch and
> Patch contains both hacked and original versions of unescape, to run tests.
> Tests show that performance remains the same on short strings and is much better on large
> strings with rare entities.
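
For illustration only, the approach described in the issue could look roughly like the sketch
below. This is not the attached patch: FastUnescapeSketch and its four-entry entity table are
assumptions made for the example. The input is returned as-is when it contains no &, and
otherwise the text between entities is copied in bulk instead of char by char.

```java
import java.util.Map;

final class FastUnescapeSketch {
    // Tiny illustrative entity table (an assumption); the real Entities class knows far more.
    private static final Map<String, String> ENTITIES =
            Map.of("amp", "&", "lt", "<", "gt", ">", "quot", "\"");

    static String unescape(String input) {
        int amp = input.indexOf('&');
        if (amp < 0) {
            return input; // fast path: nothing to unescape, so nothing is copied
        }
        StringBuilder out = new StringBuilder(input.length());
        int pos = 0;
        while (amp >= 0) {
            out.append(input, pos, amp); // bulk-copy the plain text before '&'
            int semi = input.indexOf(';', amp + 1);
            String value = semi < 0 ? null : ENTITIES.get(input.substring(amp + 1, semi));
            if (value != null) {
                out.append(value);       // known entity: emit its replacement
                pos = semi + 1;
            } else {
                out.append('&');         // unknown or unterminated: keep the '&' literally
                pos = amp + 1;
            }
            amp = input.indexOf('&', pos);
        }
        out.append(input, pos, input.length()); // bulk-copy the tail
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("a &lt; b &amp; c"));
    }
}
```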

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
