xml-general mailing list archives

From: "Nath" <nath_me...@hotmail.com>
Subject: Re: XML performance problems with xerces c++
Date: Tue, 25 May 2004 16:41:05 GMT

Thanks, I'll resubmit my post over there. I just wasn't sure whether that
mailing list was specific to Xerces development, rather than for menial
troubleshooting problems.

----- Original Message ----- 
From: "Neil Graham" <neilg@ca.ibm.com>
To: <general@xml.apache.org>
Sent: Tuesday, May 25, 2004 11:21 AM
Subject: Re: XML performance problems with xerces c++


> Hi Nath,
>
> You really don't want to use this list for such questions; better to use
> the Xerces-C-specific list here [*].
>
> But here are some thoughts:  I don't understand what you mean when you
> write "It seems the larger the XML file, the longer it takes to parse
> individual nodes."  When Xerces returns a DOM document to you, it has
> already parsed the entire document; it doesn't go off and parse more of it
> as you move down the list of children of the root element.  And, if all
> you want is information from the children of the root element, you may well
> wish to use SAX; the DOM is inherently both processor- and
> memory-intensive.
>
> Cheers,
> Neil
>
> [*]:  http://xml.apache.org/mail.html#xerces-c-dev
> Neil Graham
> XML Parser Development
> IBM Toronto Lab
> Phone:  905-413-3519, T/L 969-3519
> E-mail:  neilg@ca.ibm.com
>
> "Nath" <nath_meyer@hotmail.com>
> 05/24/2004 10:56 PM
> Please respond to general
>
> To: <general@xml.apache.org>
> cc:
> Subject: XML performance problems with xerces c++
>
> I converted over a dictionary of words and definitions into XML files (one
> file per letter of the alphabet), each weighing around 1-5 megs (I chose XML
> over a DB for important reasons). I'm trying to parse these files and it's
> taking an incredible amount of time to do it. When parsing small files
> (letters X, Y, and Z - a total of 815 words or 151 KB) the parser can do so
> in less than 2 seconds. When parsing the letter A file (40,000 some words or
> 1.58 megs), it takes 5 seconds just to parse 20 words. It seems the larger
> the XML file, the longer it takes to parse individual nodes. Can anyone
> suggest why this is happening and how I can fix it? I've used xerces c++
> 2.4.0 and recently upgraded to xerces 2.5.0.
>
> I'm just following the standard XML start-up and DOM parsing procedure:
>
> - Initialize platform utils
> - Don't validate files
> - Parse and assign DOM document
> - Go through each child node and collect data
>
> I have a 1600MHz processor, so handling a few meg files should be fairly
> quick.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscribe@xml.apache.org
> For additional commands, e-mail: general-help@xml.apache.org
>
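
For what it's worth, here is a rough sketch of the event-driven (SAX-style)
approach Neil suggests above. It is not Xerces-C++ code: `scan` is a toy
scanner written only for this illustration, and the `Handler` type and the
`word` element name are assumptions, since the original post does not show the
file layout. The point is only that the callbacks see each element as it
streams past, so no DOM tree is ever built in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback set, loosely modelled on SAX-style handlers.
struct Handler {
    std::function<void(const std::string&)> startElement;
    std::function<void(const std::string&)> characters;
    std::function<void(const std::string&)> endElement;
};

// Toy scanner: understands only <tag>, </tag> and plain text.
// No attributes, comments, entities or CDATA. Illustration only.
void scan(const std::string& xml, const Handler& h) {
    assert(h.startElement && h.characters && h.endElement);
    std::size_t pos = 0;
    while (pos < xml.size()) {
        if (xml[pos] == '<') {
            const std::size_t close = xml.find('>', pos);
            if (close == std::string::npos) break;  // malformed input: stop
            if (xml[pos + 1] == '/') {
                h.endElement(xml.substr(pos + 2, close - pos - 2));
            } else {
                h.startElement(xml.substr(pos + 1, close - pos - 1));
            }
            pos = close + 1;
        } else {
            std::size_t next = xml.find('<', pos);
            if (next == std::string::npos) next = xml.size();
            h.characters(xml.substr(pos, next - pos));
            pos = next;
        }
    }
}

// Collects the text of every <word> element (hypothetical element name).
// Only one flag of state is kept; everything else is ignored as it passes.
std::vector<std::string> collectWords(const std::string& xml) {
    std::vector<std::string> words;
    bool inWord = false;
    Handler h;
    h.startElement = [&](const std::string& name) { inWord = (name == "word"); };
    h.characters = [&](const std::string& text) {
        if (inWord) words.push_back(text);
    };
    h.endElement = [&](const std::string&) { inWord = false; };
    scan(xml, h);
    return words;
}
```

With a real SAX parser the same shape would apply: the handler keeps a small
amount of state and skips everything it does not need.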
