xerces-j-dev mailing list archives

From Michael Glavassevich <mrgla...@ca.ibm.com>
Subject Re: Interning strategy
Date Wed, 09 Jan 2008 21:16:18 GMT
Hi Dave,

The strings are interned for the application's benefit. Interning lets your
SAX content handler compare the names of elements, attributes, etc. using
reference comparison [1] instead of equals(), which performs better. There's
also an alternate implementation of the SymbolTable [2] that is more
sensitive to memory usage: it allows interned strings to be garbage
collected if they're only reachable through the SymbolTable.
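
As a rough illustration of the reference comparison described above, here is
a minimal sketch of a content handler that matches element names with ==.
The class name InternedNameDemo, the method countItems, and the sample
element name "item" are made up for the example. It uses whatever parser
JAXP supplies, and it assumes that parser accepts the string-interning
feature being set to true.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Xerces.
class InternedNameDemo {
    // A compile-time constant, so the JVM has already interned it.
    static final String ITEM = "item";

    // Counts <item> elements using reference comparison on the qualified name.
    static int countItems(String xml) {
        final int[] count = {0};
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Ask for interned names; this throws if the parser cannot comply.
            reader.setFeature("http://xml.org/sax/features/string-interning", true);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    // Reference comparison is only safe because the parser
                    // reports interned names and ITEM is itself interned.
                    if (qName == ITEM) {
                        count[0]++;
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countItems("<list><item/><item/><other/></list>"));
    }
}
```

The comparison only works when both sides are interned: a name built at run
time, for example by concatenation, would need an explicit intern() call
before it could be compared this way.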

Thanks.

[1] http://xerces.apache.org/xerces2-j/features.html#string-interning
[2]
http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SoftReferenceSymbolTable.html
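
To illustrate the idea behind [2], here is a minimal sketch of a string pool
whose entries are held only through soft references, so the collector may
reclaim a string once nothing else refers to it. This is not the Xerces
SoftReferenceSymbolTable code; the class SoftSymbolPool and its bucket
layout are invented for the example, and unlike the Entry constructor
quoted below it does not call intern().

```java
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the Xerces implementation.
class SoftSymbolPool {
    // Buckets are keyed by hash code so that no map key strongly
    // references a pooled string.
    private final Map<Integer, List<SoftReference<String>>> buckets =
        new HashMap<>();

    // Returns the canonical instance for the given text, adding it if absent.
    String addSymbol(String text) {
        List<SoftReference<String>> chain =
            buckets.computeIfAbsent(text.hashCode(), h -> new ArrayList<>());
        Iterator<SoftReference<String>> it = chain.iterator();
        while (it.hasNext()) {
            String candidate = it.next().get();
            if (candidate == null) {
                it.remove();          // entry was cleared by the collector
            } else if (candidate.equals(text)) {
                return candidate;     // reuse the pooled instance
            }
        }
        chain.add(new SoftReference<>(text));
        return text;
    }
}
```

Strings returned by this sketch can be compared by reference with each
other, but not with string literals in the application, since the pool
never interns them.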

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

"Dave Brosius" <dbrosius@mebigfatguy.com> wrote on 01/09/2008 01:01:06 AM:

> Greetings, I was perusing old mailing list emails, and stumbled onto the
> following email sent some time ago :)
>
> Luckily, from a quick perusal of the code, it appears that the email
> still applies.
>
> I have a question about the implementation of SymbolTable
>
> As expected, it appears to me that it hashes to find a bucket, then
> walks the chain of entries from the bucket to find a string that is
> 'equals'.
>
> Only if it doesn't exist is a new one added. All of this makes sense.
>
> The question i have then, is why when you add an entry
>
> public Entry(String symbol, Entry next) {
>     this.symbol = symbol.intern();
>     characters = new char[symbol.length()];
>     symbol.getChars(0, characters.length, characters, 0);
>     this.next = next;
> }
>
> does the code intern the string? Isn't the point of this class to stop
> pollution of the constant pool and perm gen? (besides allowing for
> alternate hashing?)
> Given that the one String that lives in the SymbolTable is returned, i
> would think intern is redundant.
>
> thanks,
> dave
>
> ----- Original Message -----
> From: "Michael Glavassevich" <mrglavas@ca.ibm.com>
> To: <j-dev@xerces.apache.org>
> Sent: Sunday, July 24, 2005 11:57 AM
> Subject: Re: Interning strategy
>
>
> Elliotte Harold <elharo@metalab.unc.edu> wrote on 07/22/2005 09:35:02 PM:
>
> > Suppose I turn on interning in the parser by setting the SAX feature
> > http://xml.org/sax/features/string-interning to true. Will Xerces
> > simply invoke the String.intern() method on the strings it creates or
> > does it do something fancier like maintaining its own pool of string
> > constants and reuse those?
>
> It maintains a pool. See org.apache.xerces.util.SymbolTable, specifically
> the addSymbol() methods.
>
> > --
> > Elliotte Rusty Harold  elharo@metalab.unc.edu
> > XML in a Nutshell 3rd Edition Just Published!
> > http://www.cafeconleche.org/books/xian3/
> > http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
> > For additional commands, e-mail: j-dev-help@xerces.apache.org
> >
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
>
>

