xml-general mailing list archives

From Mike Pogue <mpo...@apache.org>
Subject Re: cpu usage
Date Tue, 30 May 2000 18:21:38 GMT
Check out the XML spec, at: 

	http://www.w3.org/TR/1998/REC-xml-19980210

XML parsers have to do all that, and more (e.g. SAX event generation, DOM tree
generation). 

It's easy to write a scanner that just looks for '<' and '>', but it's much harder
to write a parser that follows all the XML spec rules, generates the required errors
at the right places, generates events, builds the right data structures, handles
whitespace correctly, etc.
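
To make the difference concrete, here is a minimal sketch in Java. The class and
method names (ScannerVsParser, countTags, isWellFormed) are made up for this
example, and it uses the standard JAXP/SAX API that ships with the JDK rather than
any Xerces-specific class. The scanner happily counts tags in malformed input; the
parser has to notice that the input breaks the well-formedness rules and report it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Xerces.
class ScannerVsParser {

    // Naive scanner: counts '<' ... '>' pairs and checks no XML rule at all.
    static int countTags(String text) {
        int count = 0;
        int open = 0;
        while ((open = text.indexOf('<', open)) >= 0) {
            int close = text.indexOf('>', open);
            if (close < 0) {
                break; // unterminated tag: the scanner just stops
            }
            count++;
            open = close + 1;
        }
        return count;
    }

    // Real parser: returns true only if the text is well-formed XML.
    static boolean isWellFormed(String text) {
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(text)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // DefaultHandler rethrows fatal (well-formedness) errors.
            return false;
        } catch (Exception e) {
            // Configuration or I/O problems are not expected for in-memory input.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String malformed = "<a><b></a>"; // <b> is never closed
        System.out.println("tags counted by scanner: " + countTags(malformed));
        System.out.println("well-formed per parser:  " + isWellFormed(malformed));
    }
}
```

The scanner does one pass of indexOf calls; the parser has to track element
nesting, character classes, entities, and so on, which is where the extra work
goes.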

And in any program, SOMETHING will take up much of the time.  In a parser, it's string
manipulation.  Note that Xerces-J takes great pains to avoid the java.lang.String
functions, which are costly in time.  (Notice that they weren't on your list of
high-runner functions!)
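
As a rough illustration of what "avoiding java.lang.String" can look like, here is
a small sketch of a symbol table that works directly on a char[] buffer. This is
NOT the actual Xerces-J code; the class name SymbolTableSketch, its fixed capacity,
and the allocations counter are all assumptions made for the example. The idea is
only that a name seen repeatedly in the input is hashed and compared in place, and
a String object is allocated the first time it is seen and never again.

```java
// Hypothetical sketch; not the Xerces-J SymbolCache implementation.
class SymbolTableSketch {
    // Fixed-capacity open-addressing table (power of two). A real table would grow.
    private final String[] slots = new String[256];
    private int used = 0;

    // Counts how many String objects this table has created (for demonstration).
    int allocations = 0;

    // Returns the canonical String for buf[offset .. offset+length).
    String intern(char[] buf, int offset, int length) {
        // Hash the characters in place, without building a String first.
        int hash = 0;
        for (int i = 0; i < length; i++) {
            hash = 31 * hash + buf[offset + i];
        }
        int mask = slots.length - 1;
        int index = hash & mask;
        // Linear probing: stop at a match or at the first empty slot.
        while (slots[index] != null) {
            if (sameChars(slots[index], buf, offset, length)) {
                return slots[index]; // seen before: no allocation
            }
            index = (index + 1) & mask;
        }
        if (used == slots.length - 1) {
            throw new IllegalStateException("sketch table is full");
        }
        used++;
        allocations++;
        slots[index] = new String(buf, offset, length); // first sighting only
        return slots[index];
    }

    // Compares a stored symbol against a buffer range character by character.
    private static boolean sameChars(String symbol, char[] buf, int offset, int length) {
        if (symbol.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (symbol.charAt(i) != buf[offset + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch();
        char[] buf = "<item><item>".toCharArray();
        String first = table.intern(buf, 1, 4);
        String second = table.intern(buf, 7, 4);
        System.out.println(first + " same object: " + (first == second));
        System.out.println("allocations: " + table.allocations);
    }
}
```

With this approach, element names that repeat thousands of times in a document
cost a hash and a comparison each, not a new String each.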

Mike


Joseph Shraibman wrote:
> 
> I'm not even sure exactly what they do.  Why should it be so much work
> to read in a character stream?  Especially when I started off with a
> String to begin with?
> 
> Mike Pogue wrote:
> >
> > Yep.  XML parsing is basically string processing.
> >
> > Next step: if you can figure out a way to make those functions simpler/quicker,
> > please let us know!
> >
> > Mike
> >
> > Joseph Shraibman wrote:
> > >
> > > I recently ran a program of mine through Optimizeit.  This program does
> > > a lot of parsing of Strings (not deferred).  It spent:
> > >
> > > 67% in org.apache.xerces.readers.CharReader.fillCurrentChunk()
> > > 15.62% in com.xtenit.xml.XParserXerces.<init>()
> > >      including:
> > >          9.53% org.apache.xerces.utils.SymbolCache.<init>()
> > >          3.28% org.apache.xerces.utils.SymbolCache.addSymbolToCache()
> > >
> > > ---------------------------------------------------------------------
> > > In case of troubles, e-mail:     webmaster@xml.apache.org
> > > To unsubscribe, e-mail:          general-unsubscribe@xml.apache.org
> > > For additional commands, e-mail: general-help@xml.apache.org
