commons-dev mailing list archives

From Alex Chaffee / Purple Technology <g...@stinky.com>
Subject Re: [lang] and [collections] Primitives
Date Sat, 24 May 2003 03:14:18 GMT
On Tue, May 20, 2003 at 12:54:19PM -0400, Gary Gregory wrote:
> Speaking of primitives and Collections: I was going to port our code to use
> .lang.StringEscapeUtils.escapeXml(String) but I thought I'd create a
> benchmark to compare our impl to .lang. Our impl, not being
> "entities-pluggable" and array based is 4x faster, so I cannot replace. :-(

Entities.java has never been optimized for performance.  I'd love to
start optimizing it with you.  Why don't I lay in some unit tests, and
then we can take a crack at it?

Can you show me your implementation for inspiration?  

(By the way, you made a Freudian typo: escapeEntities is currently a
method of StringEscapeUtils, not of Entities.  But from your mistake
it's now clear that it really belongs in Entities!

From StringEscapeUtils:
 private static String escapeEntities(String str, Entities entities)

to Entities:
 public String escape(String str)

...much more object-oriented.)
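
To make that move concrete, here is a minimal sketch of what an
instance-level escape could look like. EntitiesSketch and addEntity are
hypothetical names used only for illustration, entityName returning
null for an unregistered character is an assumption, and the sketch
uses generics and StringBuilder from later Java versions for brevity.
It is not the actual [lang] code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the real [lang] Entities class.
final class EntitiesSketch {
    // Maps a character code to its entity name, e.g. 60 -> "lt".
    private final Map<Integer, String> names = new HashMap<>();

    void addEntity(String name, int value) {
        names.put(value, name);
    }

    // Returns the entity name for a character code, or null if none is
    // registered. The int key is boxed into an Integer for the lookup.
    String entityName(int value) {
        return names.get(value);
    }

    // Instance method: the entity table escapes the string itself, so
    // no Entities argument needs to be passed around.
    String escape(String str) {
        StringBuilder buf = new StringBuilder(str.length() * 2);
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            String name = entityName(ch);
            if (name == null) {
                buf.append(ch);
            } else {
                buf.append('&').append(name).append(';');
            }
        }
        return buf.toString();
    }
}
```

With this shape a caller would hold one table per entity set and call
escape on it directly.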


> So, the first question is: Is it worth creating a StringIterator-type of
> class that gets read-only access to a String's BA. Unfortunately this kind
> of code would have to use reflection to get a reference to the String's BA. 

Sounds very spooky and perhaps brittle with respect to alternate VM
implementations...  10% is indeed significant but I'd hope we could
find lower-hanging fruit.


> (2) entities.entityName(ch) creates Integers.
> 
> For every character in the String, entities.entityName(ch) is called, which
> in turn creates an Integer object for its char argument used in a Map lookup.
> That's a _lot_ of time spent in Integer.<init>...
>
> This is where a primitive Map keyed on ints would come in handy. Is there a
> thought of Collections providing such a class?
> 
> If it did, it would seem a bit odd to have .lang depend on .collection.

I agree that such a class would be very useful here.  [lang] can
either depend on, steal from, or ignore [collections] as needed.  I
don't see an IntMap there yet, but it wouldn't be too painful to write
one and either keep it private or donate it back to [collections].
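
As a rough idea of how small such a class could be, here is a sketch of
an int-keyed map using open addressing with linear probing. The name
IntMapSketch and every design choice in it (power-of-two capacity, no
removal, no null values, generics for brevity) are assumptions for
illustration, not an existing [collections] or [lang] class.

```java
// Hypothetical int-keyed map: open addressing, linear probing.
// Keys stay primitive, so lookups create no Integer objects.
final class IntMapSketch<V> {
    private int[] keys = new int[16];
    // A null entry marks an empty slot, so null values are not supported.
    private Object[] values = new Object[16];
    private int size;

    // Capacity is always a power of two, so masking yields a valid index.
    private static int slot(int key, int capacity) {
        int h = key * 0x9E3779B9; // multiplicative mixing of the key bits
        return (h ^ (h >>> 16)) & (capacity - 1);
    }

    void put(int key, V value) {
        if (value == null) {
            throw new IllegalArgumentException("null values not supported");
        }
        // Keep the table at most half full so probing always terminates.
        if ((size + 1) * 2 > keys.length) {
            grow();
        }
        int i = slot(key, keys.length);
        while (values[i] != null && keys[i] != key) {
            i = (i + 1) & (keys.length - 1);
        }
        if (values[i] == null) {
            size++;
        }
        keys[i] = key;
        values[i] = value;
    }

    // Returns the value for the key, or null if the key is absent.
    @SuppressWarnings("unchecked")
    V get(int key) {
        int i = slot(key, keys.length);
        while (values[i] != null) {
            if (keys[i] == key) {
                return (V) values[i];
            }
            i = (i + 1) & (keys.length - 1);
        }
        return null;
    }

    // Doubles the capacity and reinserts every existing entry.
    @SuppressWarnings("unchecked")
    private void grow() {
        int[] oldKeys = keys;
        Object[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new Object[oldValues.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldValues[i] != null) {
                put(oldKeys[i], (V) oldValues[i]);
            }
        }
    }
}
```

entityName(ch) could then call get(ch) directly, passing the char as an
int, with no wrapper object per character.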

Also, it's entirely possible that storing the entities in an array
instead of a hashtable would result in perfectly acceptable
performance, since there are only a few dozen entries...
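
The array idea might look something like the sketch below: two parallel
arrays and a linear scan. ArrayEntitiesSketch and its method names are
hypothetical, and whether a scan over a few dozen entries actually
beats a hashtable lookup is something the benchmark would have to show.

```java
import java.util.Arrays;

// Hypothetical array-backed entity table with a linear-scan lookup.
final class ArrayEntitiesSketch {
    private int[] values = new int[8];      // character codes
    private String[] names = new String[8]; // names[i] belongs to values[i]
    private int count;

    void addEntity(String name, int value) {
        // Double both arrays when they are full.
        if (count == values.length) {
            values = Arrays.copyOf(values, count * 2);
            names = Arrays.copyOf(names, count * 2);
        }
        values[count] = value;
        names[count] = name;
        count++;
    }

    // Scans the registered codes; returns null if the code is unknown.
    // No Integer objects are created because the keys are plain ints.
    String entityName(int value) {
        for (int i = 0; i < count; i++) {
            if (values[i] == value) {
                return names[i];
            }
        }
        return null;
    }
}
```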


-- 
Alex Chaffee                               mailto:alex@jguru.com
Purple Technology - Code and Consulting    http://www.purpletech.com/
jGuru - Java News and FAQs                 http://www.jguru.com/alex/
Gamelan - the Original Java site           http://www.gamelan.com/
Stinky - Art and Angst                     http://www.stinky.com/


