commons-issues mailing list archives

From "Peter Wall (JIRA)" <>
Subject [jira] [Commented] (LANG-935) Possible performance improvement on string escape functions
Date Sun, 15 Dec 2013 12:43:07 GMT


Peter Wall commented on LANG-935:

Hi Sebb, I think you misunderstand my purpose in uploading the zip file.  I intended only
that people could review my code and, as Gary Gregory suggested, independently verify (or
not) my benchmark results.  It was never my intention that the code be used in its current
form - for one thing, it uses my own package names.

If people here think it's worthwhile, I am happy to modify the code to meet your standards,
but certain questions arise: do you want the code to replicate the current behaviour exactly?
If so, I think that would limit its value (in the conversion of characters above 0x7F in
HTML/XML, for example).  Or should we create a new API with different class/method names, and allow
users to choose?  Is there anyone currently active in the project who has worked on or used
the current functions?  Do they have an opinion?

Please let me know how you think I should proceed.

(BTW, I don't understand why you're so hostile to LGPL!  But I own the code, and I'm happy
to assign it over to any non-restrictive licence.)

> Possible performance improvement on string escape functions
> -----------------------------------------------------------
>                 Key: LANG-935
>                 URL:
>             Project: Commons Lang
>          Issue Type: Improvement
>          Components: lang.text.translate.*
>    Affects Versions: 3.1
>            Reporter: Peter Wall
>            Priority: Minor
>              Labels: performance
>             Fix For: Patch Needed
>         Attachments:
> The escape functions for HTML etc. use the same code and the same initialisation tables
> for the escape and unescape functions, and while this is an elegant approach it leads to a
> number of deficiencies:
> 1. The code is very much less efficient than it could be
> 2. A new output string is created even when no conversion is required
> 3. No mapping is provided for characters that do not have a specific representation (for
> example HTML 0x101 should become &#257;)
> The proposal is to use a new mapping technique to address these issues
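
Points 2 and 3 of the quoted description can be illustrated with a minimal Java sketch. This is not the code attached to the issue, and it does not reproduce the Commons Lang API. The class name HtmlEscapeSketch, the method names, and the small set of named entities are all assumptions made for illustration. The input is returned unchanged (same instance) when nothing needs converting, and any code point above 0x7F falls back to a decimal numeric reference.

```java
// Illustrative sketch only: hypothetical names, not the LANG-935 attachment
// and not the Commons Lang StringEscapeUtils implementation.
final class HtmlEscapeSketch {

    private HtmlEscapeSketch() {
    }

    /** Returns the replacement for a code point, or null if it can be copied unchanged. */
    static String replacementFor(int codePoint) {
        switch (codePoint) {
            case '&': return "&amp;";
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '"': return "&quot;";
            default:
                // Assumption: everything above 0x7F without a named entity
                // becomes a decimal numeric character reference.
                return codePoint > 0x7F ? "&#" + codePoint + ";" : null;
        }
    }

    static String escape(String input) {
        int length = input.length();
        int i = 0;
        // Scan for the first code point that needs converting.
        while (i < length) {
            int codePoint = input.codePointAt(i);
            if (replacementFor(codePoint) != null) {
                break;
            }
            i += Character.charCount(codePoint);
        }
        if (i == length) {
            // Nothing to convert: hand back the same instance, no new string.
            return input;
        }
        // Copy the untouched prefix once, then convert the remainder.
        StringBuilder out = new StringBuilder(length + 16);
        out.append(input, 0, i);
        while (i < length) {
            int codePoint = input.codePointAt(i);
            String replacement = replacementFor(codePoint);
            if (replacement == null) {
                out.appendCodePoint(codePoint);
            } else {
                out.append(replacement);
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }
}
```

With this sketch, HtmlEscapeSketch.escape("\u0101") yields "&#257;", matching the example in point 3, while a plain ASCII string with no special characters comes back as the very same object. Whether such output should replace the current behaviour or sit behind a new API is the open question raised in the comment above.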

This message was sent by Atlassian JIRA