commons-dev mailing list archives

From Alex Chaffee / Purple Technology <>
Subject Re: [lang] and [collections] Primitives
Date Sat, 24 May 2003 03:14:18 GMT
On Tue, May 20, 2003 at 12:54:19PM -0400, Gary Gregory wrote:
> Speaking of primitives and Collections: I was going to port our code to use
> .lang.StringEscapeUtils.escapeXml(String) but I thought I'd create a
> benchmark to compare our impl to .lang. Our impl, not being
> "entities-pluggable" and array based is 4x faster, so I cannot replace. :-(

The [lang] implementation has never been optimized for performance.  I'd love to
start optimizing it with you.  Why don't I lay in some unit tests and
then we can take a crack at it.  

Can you show me your implementation for inspiration?  

(By the way, you made a Freudian typo: escapeEntities is currently a
method of StringEscapeUtils, not of Entities.  But from your mistake
it's now clear that it really belongs in Entities!

From StringEscapeUtils:
 private static String escapeEntities(String str, Entities entities)

to Entities:
 public String escape(String str)

...much more object-oriented.)
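
To make the proposed move concrete, here is a rough sketch of an
instance-level escape method. It is a simplified stand-in, not the
actual [lang] source: the class name EntitiesSketch, the addEntity
helper, and the Map-based storage are assumptions made for
illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the [lang] Entities class.
class EntitiesSketch {
    // Assumed storage: character code -> entity name.
    private final Map<Integer, String> namesByValue = new HashMap<>();

    // Hypothetical registration helper, e.g. addEntity("amp", 38).
    void addEntity(String name, int value) {
        namesByValue.put(value, name);
    }

    // Returns the entity name for a character code, or null if none.
    String entityName(int value) {
        return namesByValue.get(value);
    }

    // Instance method that would replace the private static
    // escapeEntities(String, Entities) in StringEscapeUtils.
    String escape(String str) {
        StringBuilder buf = new StringBuilder(str.length() * 2);
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            String name = entityName(ch);
            if (name != null) {
                buf.append('&').append(name).append(';');
            } else {
                buf.append(ch);
            }
        }
        return buf.toString();
    }
}
```

With this shape, a caller holds an entity set and asks it to escape a
string directly, instead of passing the set into a static helper.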

> So, the first question is: Is it worth creating a StringIterator-type of
> class that gets read-only access to a String's BA. Unfortunately this kind
> of code would have to use reflection to get a reference to the String's BA. 

Sounds very spooky and perhaps brittle wrt alternate VM
implementations...  10% is indeed significant, but I'd hope we could
find lower-hanging fruit.

> (2) entities.entityName(ch) creates Integers.
> For every character in the String, entities.entityName(ch) is called, which
> in turn creates an Integer object for its char argument used in a Map lookup.
> That's a _lot_ of time spent in Integer.<init>...
> This is where a primitive Map keyed on ints would come in handy. Is there a
> thought of Collections providing such a class?
> If it did, it would seem a bit odd to have .lang depend on .collection.

I agree that such a class would be very useful here.  [lang] can
either depend on, steal from, or ignore [collections] as needed.  I
don't see an IntMap there yet, but it wouldn't be too painful to write
one and either keep it private or donate it back to [collections].
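
As a rough idea of how small such a class could be, here is a minimal
sketch of an int-keyed map that never allocates a wrapper object on
lookup. The name IntMapSketch, the String value type, and the linear
probing strategy are all assumptions for illustration; nothing like
this is taken from [collections].

```java
// Hypothetical int -> String map using open addressing with linear
// probing. A null value marks an empty slot, so null values are rejected.
class IntMapSketch {
    private int[] keys;
    private String[] values;
    private int size;

    IntMapSketch(int expectedSize) {
        int capacity = 16;
        while (capacity < expectedSize * 2) {
            capacity <<= 1; // keep capacity a power of two
        }
        keys = new int[capacity];
        values = new String[capacity];
    }

    // Finds the slot holding the key, or the empty slot where it belongs.
    private int slotFor(int key) {
        int mask = keys.length - 1;
        int i = (key ^ (key >>> 16)) & mask;
        while (values[i] != null && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    void put(int key, String value) {
        if (value == null) {
            throw new IllegalArgumentException("null values not supported");
        }
        if ((size + 1) * 2 > keys.length) {
            resize(); // stay at or below 50% full so probing terminates
        }
        int i = slotFor(key);
        if (values[i] == null) {
            size++;
        }
        keys[i] = key;
        values[i] = value;
    }

    // Returns the mapped value, or null if the key is absent.
    String get(int key) {
        return values[slotFor(key)];
    }

    private void resize() {
        int[] oldKeys = keys;
        String[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new String[oldValues.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldValues[i] != null) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }
}
```

A lookup such as get(ch) takes the char as a plain int, so the
per-character Integer creation described above goes away.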

Also, it's entirely possible that storing the entities in an array
instead of a hashtable would result in perfectly acceptable
performance, since there are only a few dozen entries...
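
One possible reading of the array idea, sketched below, is a pair of
parallel arrays searched linearly. The class name ArrayEntitiesSketch
and its constructor are assumptions for illustration, and whether a
linear scan is actually fast enough would need to be measured.

```java
// Hypothetical array-backed entity table: no hashing and no Integer
// objects, just a linear scan over a small int[].
class ArrayEntitiesSketch {
    private final String[] names;
    private final int[] values;

    ArrayEntitiesSketch(String[] names, int[] values) {
        if (names.length != values.length) {
            throw new IllegalArgumentException("names and values must have equal length");
        }
        this.names = names.clone();
        this.values = values.clone();
    }

    // Returns the entity name for a character code, or null if none.
    String entityName(int value) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == value) {
                return names[i];
            }
        }
        return null;
    }
}
```

An alternative under the same assumption would be a String[] indexed
directly by character code, trading a little memory for a constant-time
lookup.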

Alex Chaffee                     
Purple Technology - Code and Consulting
jGuru - Java News and FAQs       
Gamelan - the Original Java site 
Stinky - Art and Angst           
