xerces-j-dev mailing list archives

From Michael Glavassevich <mrgla...@ca.ibm.com>
Subject Re: Interning strategy
Date Thu, 10 Jan 2008 03:52:29 GMT
Hi Dave,

The strings didn't need to be interned for Xerces' internals to work
correctly (though the code has since evolved to depend on it). It's
just cheaper to do the intern once and cache it in the SymbolTable than to
do it later, possibly multiple times, at the API layer. Some history here
[1] if you're interested.
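
To illustrate the "intern once and cache" idea, here is a minimal sketch. It is
not the Xerces SymbolTable: the class name InternOnceTable is hypothetical, and
it uses a HashMap and allocates a lookup String for brevity, where the real
class hashes the character buffer and walks bucket chains.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a symbol table; not the Xerces implementation. */
public class InternOnceTable {

    // Maps each distinct name to its canonical, interned instance.
    private final Map<String, String> symbols = new HashMap<>();

    /** Returns the canonical interned String for the given characters. */
    public String addSymbol(char[] buffer, int offset, int length) {
        String key = new String(buffer, offset, length);
        String cached = symbols.get(key);
        if (cached == null) {
            // String.intern() is called once per distinct name;
            // later lookups are served from the map.
            cached = key.intern();
            symbols.put(cached, cached);
        }
        return cached;
    }
}
```

Every caller asking for the same name gets the same interned instance back, so
nothing at the API layer has to call intern() again.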

Thanks.

[1] http://issues.apache.org/jira/browse/XERCESJ-6
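
For the application side of this (the reference comparison mentioned in the
quoted message below), a hedged sketch of a SAX handler follows. The class name
InternedNameCounter and the countItems() helper are made up for illustration.
It assumes whatever JAXP parser is on the classpath, and it falls back to
equals() if that parser does not report the string-interning feature as true.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that counts <item> elements. */
public class InternedNameCounter extends DefaultHandler {

    // String literals are interned by the JVM, so this constant is the
    // canonical instance an interning parser would also hand out.
    private static final String ITEM = "item";

    private final boolean interned;
    private int count;

    InternedNameCounter(boolean interned) {
        this.interned = interned;
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // Reference comparison is only safe when the parser interns names.
        boolean match = interned ? qName == ITEM : ITEM.equals(qName);
        if (match) {
            count++;
        }
    }

    /** Parses the given XML text and returns the number of <item> elements. */
    public static int countItems(String xml) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            boolean interned;
            try {
                interned = reader.getFeature(
                    "http://xml.org/sax/features/string-interning");
            } catch (SAXException e) {
                interned = false; // feature not recognized: use equals()
            }
            InternedNameCounter handler = new InternedNameCounter(interned);
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.count;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Checking the feature first keeps the handler correct with parsers that do not
intern names; with those, only the equals() path is used.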

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

"Dave Brosius" <dbrosius@apache.org> wrote on 01/09/2008 10:27:20 PM:

> Clearly based on your response, and the fact that the Soft referenced table
> also interns, i completely misunderstood (and still do) what the SymbolTable
> class is used for.
>
> I guess i'll have to take another attempt at understanding what it is being
> used for.
>
>
> ----- Original Message -----
> From: "Michael Glavassevich" <mrglavas@ca.ibm.com>
> To: <j-dev@xerces.apache.org>
> Sent: Wednesday, January 09, 2008 4:16 PM
> Subject: Re: Interning strategy
>
>
> > Hi Dave,
> >
> > It's being interned for the application. Allows your SAX content handler to
> > compare the names of elements, attributes, etc... using reference
> > comparison [1] instead of equals for better performance. There's an
> > alternate implementation of the SymbolTable [2] which is more sensitive to
> > memory usage. It allows interned strings to be garbage collected if they're
> > only reachable through the SymbolTable.
> >
> > Thanks.
> >
> > [1] http://xerces.apache.org/xerces2-j/features.html#string-interning
> > [2] http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SoftReferenceSymbolTable.html
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: mrglavas@ca.ibm.com
> > E-mail: mrglavas@apache.org
> >
> > "Dave Brosius" <dbrosius@mebigfatguy.com> wrote on 01/09/2008 01:01:06 AM:
> >
> >> Greetings, i was perusing old mailing list emails, and stumbled onto the
> >> following email sent some time ago :)
> >>
> >> Luckily, from a quick perusal of the code, it appears that the email still
> >> applies.
> >>
> >> I have a question about the implementation of SymbolTable
> >>
> >> As expected, it appears to me that it does hashing to find a bucket, then
> >> walks the chain of pointers from the bucket to find a string that is
> >> 'equals'
> >>
> >> Only if it doesn't exist is a new one added. All of this makes sense.
> >>
> >> The question i have then, is why when you add an entry
> >>
> >> public Entry(String symbol, Entry next) {
> >>     this.symbol = symbol.intern();
> >>     characters = new char[symbol.length()];
> >>     symbol.getChars(0, characters.length, characters, 0);
> >>     this.next = next;
> >> }
> >>
> >> does the code intern the string? Isn't the point of this class to stop
> >> pollution of the constant pool and perm gen? (besides allowing for alternate
> >> hashing?)
> >> Given that the one String that lives in the SymbolTable is returned, i would
> >> think intern is redundant.
> >>
> >> thanks,
> >> dave
> >>
> >> ----- Original Message -----
> >> From: "Michael Glavassevich" <mrglavas@ca.ibm.com>
> >> To: <j-dev@xerces.apache.org>
> >> Sent: Sunday, July 24, 2005 11:57 AM
> >> Subject: Re: Interning strategy
> >>
> >>
> >> Elliotte Harold <elharo@metalab.unc.edu> wrote on 07/22/2005 09:35:02 PM:
> >>
> >> > Suppose I turn on interning in the parser by setting the SAX property
> >> > http://xml.org/sax/features/string-interning to true. Will Xerces simply
> >> > invoke the String.intern() method on the strings it creates or does it
> >> > do something fancier like maintaining its own pool of string constants
> >> > and reuse those?
> >>
> >> It maintains a pool. See org.apache.xerces.util.SymbolTable, specifically
> >> the addSymbol() methods.
> >>
> >> > --
> >> > Elliotte Rusty Harold  elharo@metalab.unc.edu
> >> > XML in a Nutshell 3rd Edition Just Published!
> >> > http://www.cafeconleche.org/books/xian3/
> >> > http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: j-dev-unsubscribe@xerces.apache.org
> >> > For additional commands, e-mail: j-dev-help@xerces.apache.org
> >> >
> >>
> >> Michael Glavassevich
> >> XML Parser Development
> >> IBM Toronto Lab
> >> E-mail: mrglavas@ca.ibm.com
> >> E-mail: mrglavas@apache.org