Message-ID: <245A7290F0E0D311BF6E009027E7908B0720452F@atlanta.seagullsw.com>
From: Gary Gregory
To: 'Jakarta Commons Developers List'
Subject: [lang] and [collections] Primitives
Date: Tue, 20 May 2003 12:54:19 -0400

Speaking of primitives and Collections:

I was going to port our code to use .lang.StringEscapeUtils.escapeXml(String), but I thought I'd first create a benchmark to compare our impl to .lang.
Our impl, being array-based and not "entities-pluggable", is 4x faster, so I cannot replace it. :-(

After running HP's HPjmeter on some JVM -Xrunhprof output, I confirmed what I suspected. I see the following issues in .lang.Entities.java (excerpt):

    private static String escapeEntities(String str, Entities entities) {
        StringBuffer buf = new StringBuffer(str.length() * 2);
        int i;
        for (i = 0; i < str.length(); ++i) {
            char ch = str.charAt(i);
            String entity = entities.entityName(ch);

(1) Java Strings are not great; ah, if only they were Collections or iterable...

For every single character in a string, the following happens:

    char ch = str.charAt(i);

charAt() checks the bounds every time. Just for fun, I changed the code to:

    char[] chars = str.toCharArray();
    for (i = 0; i < chars.length; ++i) {
        char ch = chars[i];

That yields a 10% speed improvement but gobbles up memory, since the char array returned by toCharArray() is a _copy_ of the array held by the String. Not great, but a possible solution (and trade-off).

So, the first question is: is it worth creating a StringIterator type of class that gets read-only access to a String's char array? Unfortunately, this kind of code would have to use reflection to get a reference to the String's internal array.

(2) entities.entityName(ch) creates Integers.

For every character in the String, entities.entityName(ch) is called, which in turn creates an Integer object for its char argument, to be used in a Map lookup. That's a _lot_ of time spent in Integer construction.

This is where a primitive Map keyed on ints would come in handy. Is there any thought of Collections providing such a class? If it did, it would seem a bit odd to have .lang depend on .collections. OTOH, Collections are now a basic part of the JRE.
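To make points (1) and (2) concrete, here is a rough sketch of what I have in mind. It is not code from [lang] or [collections]: IntToStringMap and XmlEscapeSketch are made-up names, the map is a bare-bones open-addressing table written only for this example, and the entity table covers just the five predefined XML entities, so it does not reproduce everything escapeXml does.

```java
// Sketch only: IntToStringMap and XmlEscapeSketch are hypothetical names,
// not classes that exist in [lang] or [collections].
final class IntToStringMap {
    private int[] keys;
    private String[] values; // a null value marks an empty slot
    private int size;

    IntToStringMap(int expectedSize) {
        int capacity = 16;
        while (capacity < expectedSize * 2) {
            capacity <<= 1; // power of two, load factor kept at or below 0.5
        }
        keys = new int[capacity];
        values = new String[capacity];
    }

    void put(int key, String value) {
        if (value == null) {
            throw new IllegalArgumentException("null values are not supported");
        }
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int i = indexFor(key);
        if (values[i] == null) {
            size++;
        }
        keys[i] = key;
        values[i] = value;
    }

    /** Returns the mapped value, or null if the key is absent. */
    String get(int key) {
        return values[indexFor(key)];
    }

    // Linear probing: returns the slot holding key, or the first empty slot.
    private int indexFor(int key) {
        int mask = keys.length - 1;
        int i = (key ^ (key >>> 16)) & mask;
        while (values[i] != null && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    private void resize() {
        int[] oldKeys = keys;
        String[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new String[oldValues.length * 2];
        size = 0;
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldValues[j] != null) {
                put(oldKeys[j], oldValues[j]);
            }
        }
    }
}

final class XmlEscapeSketch {
    private static final IntToStringMap ENTITIES = new IntToStringMap(5);
    static {
        ENTITIES.put('<', "lt");
        ENTITIES.put('>', "gt");
        ENTITIES.put('&', "amp");
        ENTITIES.put('"', "quot");
        ENTITIES.put('\'', "apos");
    }

    static String escapeXml(String str) {
        char[] chars = str.toCharArray(); // one copy up front instead of charAt() per character
        StringBuilder buf = new StringBuilder(chars.length * 2);
        for (int i = 0; i < chars.length; ++i) {
            char ch = chars[i];
            String entity = ENTITIES.get(ch); // char widens to int; no Integer is allocated
            if (entity == null) {
                buf.append(ch);
            } else {
                buf.append('&').append(entity).append(';');
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeXml("a < b && \"c\""));
    }
}
```

The lookup still hashes and probes, so I would not expect it to match a plain array indexed by char, but it keeps the entity table pluggable while avoiding one Integer per character. I have not benchmarked this sketch.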

All comments welcome, thanks for reading,
Gary

-----Original Message-----
From: Stephen Colebourne [mailto:scolebourne@btopenworld.com]
Sent: Tuesday, May 20, 2003 01:15
To: Jakarta Commons Developers List
Cc: Rodney Waldhoff
Subject: [collections] Primitives

Rodney,
I'm noting that you are adding more implementations to the primitives area of [collections]. I had a few questions.

1) Are you code generating the classes, using Velocity or some other tool? I would have thought that it would be an ideal way to generate the classes, avoiding the messy search and replace and requiring less testing. I also think that this technique will be increasingly important when we come to do primitive Sets and Maps.

1b) Talking of Maps... given the large number of classes, should primitive collections be split into packages for List, Set, Map? Now would be the time to do it. (Primitive collections _could_ be a separate project in commons at the current rate...)

2) Why is it important to have separate Serializable and Non-Serializable implementations? Why not just make them all Serializable?

3) Can we agree on the naming strategy for the decorator package? I used AbstractCollectionDecorator, but you used BaseProxyIntList.

Stephen