xerces-c-users mailing list archives

From "Mirko Braun" <mirko.br...@gmx.de>
Subject Re: method startElement() from class DOMLSParserFilter
Date Fri, 04 Sep 2009 13:20:37 GMT
Hi Alberto,

Yes, I'm sure that DATA is not the root node. I debugged a little bit:
the exception occurs after the DATA node has been found for the sixth time.

Mirko

-------- Original Message --------
> Date: Fri, 04 Sep 2009 14:21:15 +0200
> From: Alberto Massari <amassari@datadirect.com>
> To: c-users@xerces.apache.org
> Subject: Re: method startElement() from class DOMLSParserFilter

> Hi Mirko,
> are you sure that your root node isn't one of those DATA elements? In 
> this case the document node would see more than one root element.
> 
> Alberto
> 
> Mirko Braun wrote:
> > Hi Alberto,
> >
> > thank you for your answer. I integrated the changes you
> > suggested, but the result is still the same:
> >
> > DOM Error during parsing: 'C:\Daten\2009-08-07_NewXercesc\3_0_1\xerces-c-3.0.1\Build\Win32\VC6\Debug\MyXML.xml'
> > DOMException code is:  3
> > Message is: attempt is made to insert a node where it is not permitted
> >
> > Best regards,
> > Mirko
> >
> > -------- Original Message --------
> >> Date: Fri, 04 Sep 2009 12:37:10 +0200
> >> From: Alberto Massari <amassari@datadirect.com>
> >> To: c-users@xerces.apache.org
> >> Subject: Re: method startElement() from class DOMLSParserFilter
> >>
> >> Hi Mirko,
> >> I think the current implementation of the DOMLSParserFilter doesn't work
> >> nicely with your code, as the rejected nodes are not recycled and the
> >> memory will grow to the same level as before.
> >> Anyhow, you should instead override acceptNode like this:
> >>
> >> DOMParserFilter::FilterAction DOMParserFilter::acceptNode(DOMNode* node)
> >> {
> >>   // Reject every element named "DATA"; accept all other nodes.
> >>   if (node->getNodeType() == DOMNode::ELEMENT_NODE &&
> >>       XMLString::compareString(node->getNodeName(), element_data) == 0)
> >>     return DOMParserFilter::FILTER_REJECT;
> >>   else
> >>     return DOMParserFilter::FILTER_ACCEPT;
> >> }
> >>
> >> Then, change DOMLSParserImpl::endElement to add a call to 
> >> origNode->release() after the call to removeChild().
> >>
> >> Alberto
> >>
> >>
> >> Mirko Braun wrote:
> >>> Hello everybody,
> >>>
> >>> I would like to parse a quite large XML file (about 180 MB).
> >>> I used the DOM interface because I need the tree for further
> >>> processing of the data the XML file contains. Of course, a lot
> >>> of memory is used while parsing the file, and I got an
> >>> "Out of memory" exception.
> >>>
> >>> I noticed that a class DOMLSParserFilter comes along with Xerces-C++
> >>> 3.0.1 (Win32), which makes it possible to filter the nodes during parsing.
> >>> That is perfect for me because one XML element in my large file
> >>> contains most of the data. This XML element is called DATA and
> >>> appears several times in my XML file.
> >>> So I had the idea to reject this XML element from the DOM tree
> >>> during parsing, to reduce the memory used, by using the method
> >>> startElement() of the DOMLSParserFilter class. After that I would
> >>> use a SAX parser and just get all DATA elements with their values.
> >>> But it does not work.
> >>> I integrated my code into the DOMPrint example which comes along
> >>> with Xerces-C++ 3.0.1. The following error message occurred:
> >>>
> >>> DOM Error during parsing: 'C:\Daten\2009-08-07_NewXercesc\3_0_1\xerces-c-3.0.1\Build\Win32\VC6\Debug\MyXML.xml'
> >>> DOMException code is:  3
> >>> Message is: attempt is made to insert a node where it is not permitted
> >>>
> >>>
> >>> Did I misunderstand the functionality of the DOMLSParserFilter class
> >>> and its method startElement()?
> >>> Is it possible to realize my idea with the help of this class? Did
> >>> I do something wrong in my code (please have a look below)?
> >>>
> >>> I would be very grateful for any help.
> >>>
> >>> Thanks in advance,
> >>> Mirko
> >>>
> >>>
> >>> DOMPrintFilter.hpp:
> >>> --------------------
> >>>
> >>>
> >>> class DOMParserFilter : public DOMLSParserFilter {
> >>> public:
> >>>
> >>>     DOMParserFilter(DOMNodeFilter::ShowType whatToShow = DOMNodeFilter::SHOW_ALL);
> >>>     ~DOMParserFilter() {}
> >>>
> >>>     virtual FilterAction startElement(DOMElement* node);
> >>>     virtual FilterAction acceptNode(DOMNode* node) { return DOMParserFilter::FILTER_ACCEPT; }
> >>>     virtual DOMNodeFilter::ShowType getWhatToShow() const { return fWhatToShow; }
> >>>
> >>> private:
> >>>     DOMNodeFilter::ShowType fWhatToShow;
> >>> };
> >>>
> >>>
> >>> DOMPrintFilter.cpp:
> >>> --------------------
> >>>
> >>> DOMParserFilter::DOMParserFilter(DOMNodeFilter::ShowType whatToShow)
> >>> :fWhatToShow(whatToShow)
> >>> {}
> >>>
> >>> DOMParserFilter::FilterAction DOMParserFilter::startElement(DOMElement* node)
> >>> {
> >>>   // for element whose name is "DATA", skip it
> >>>   if (XMLString::compareString(node->getNodeName(), element_data)==0)
> >>>     return DOMParserFilter::FILTER_REJECT;
> >>>   else
> >>>     return DOMParserFilter::FILTER_ACCEPT;
> >>> }
> >>>
> >>>
> >>> DOMPrint.cpp:
> >>> ---------------
> >>>
> >>> static const XMLCh gLS[] = { xercesc::chLatin_L, xercesc::chLatin_S, xercesc::chNull };
> >>> xercesc::DOMImplementation *implParser =
> >>>     xercesc::DOMImplementationRegistry::getDOMImplementation(gLS);
> >>> xercesc::DOMLSParser* parser =
> >>>     ((xercesc::DOMImplementationLS*)implParser)->createLSParser(
> >>>         xercesc::DOMImplementationLS::MODE_SYNCHRONOUS, 0);
> >>>
> >>> DOMTreeErrorReporter *errReporter = new DOMTreeErrorReporter();
> >>> parser->getDomConfig()->setParameter(xercesc::XMLUni::fgDOMErrorHandler, errReporter);
> >>>
> >>> DOMParserFilter * pDOMParserFilter = new DOMParserFilter();
> >>> parser->setFilter(pDOMParserFilter);
> >>>     
> >>>
> >>>     //
> >>>     //  Parse the XML file, catching any XML exceptions that might propagate
> >>>     //  out of it.
> >>>     //
> >>>     bool errorsOccured = false;
> >>>     DOMDocument *doc = NULL;
> >>>
> >>>     try
> >>>     {
> >>>       doc = parser->parseURI(gXmlFile);
> >>>     }
> >>>     catch (const OutOfMemoryException&)
> >>>     {
> >>>         XERCES_STD_QUALIFIER cerr << "OutOfMemoryException" << XERCES_STD_QUALIFIER endl;
> >>>         errorsOccured = true;
> >>>     }
> >>>     catch (const XMLException& e)
> >>>     {
> >>>         XERCES_STD_QUALIFIER cerr << "An error occurred during parsing\n   Message: "
> >>>              << StrX(e.getMessage()) << XERCES_STD_QUALIFIER endl;
> >>>         errorsOccured = true;
> >>>     }
> >>>
> >>>     catch (const DOMException& e)
> >>>     {
> >>>       const unsigned int maxChars = 2047;
> >>>       XMLCh errText[maxChars + 1];
> >>>
> >>>       XERCES_STD_QUALIFIER cerr << "\nDOM Error during parsing: '" << gXmlFile << "'\n"
> >>>            << "DOMException code is:  " << e.code << XERCES_STD_QUALIFIER endl;
> >>>
> >>>       if (DOMImplementation::loadDOMExceptionMsg(e.code, errText, maxChars))
> >>>            XERCES_STD_QUALIFIER cerr << "Message is: " << StrX(errText) << XERCES_STD_QUALIFIER endl;
> >>>
> >>>       errorsOccured = true;
> >>>     }
> >>>
> >>>     catch (...)
> >>>     {
> >>>         XERCES_STD_QUALIFIER cerr << "An error occurred during parsing\n " << XERCES_STD_QUALIFIER endl;
> >>>         errorsOccured = true;
> >>>     }
> >>>
> >>>
> >>>
> >>>
> >>>   
> >>>       
> >
> >   
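
The filtering idea discussed in this thread, dropping every DATA element together with its subtree while keeping the rest of the tree, can be sketched without Xerces-C++ at all. What follows is a minimal, library-independent model and not the Xerces-C++ implementation: Node, FilterAction, acceptNode, copyFiltered, countNodes and makeSampleTree are all hypothetical names invented for illustration, and a C++14 compiler is assumed. The sketch also assumes the root is always accepted, which echoes Alberto's question about whether a DATA element could be the root.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM element: a name plus owned children.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string n) : name(std::move(n)) {}
};

enum class FilterAction { Accept, Reject };

// Mirrors the decision made in the thread: reject elements named "DATA".
FilterAction acceptNode(const Node& node) {
    return node.name == "DATA" ? FilterAction::Reject : FilterAction::Accept;
}

// Copies `src`, skipping every rejected child together with its subtree.
// Assumption: the root itself is accepted; a rejected root would leave
// nothing to return.
std::unique_ptr<Node> copyFiltered(const Node& src) {
    assert(acceptNode(src) == FilterAction::Accept);
    auto out = std::make_unique<Node>(src.name);
    for (const auto& child : src.children) {
        if (acceptNode(*child) == FilterAction::Accept)
            out->children.push_back(copyFiltered(*child));
    }
    return out;
}

// Counts a node and all of its descendants.
std::size_t countNodes(const Node& n) {
    std::size_t total = 1;
    for (const auto& c : n.children)
        total += countNodes(*c);
    return total;
}

// Builds ROOT -> { HEADER, DATA -> { VALUE }, DATA, FOOTER }.
std::unique_ptr<Node> makeSampleTree() {
    auto root = std::make_unique<Node>("ROOT");
    root->children.push_back(std::make_unique<Node>("HEADER"));
    auto data = std::make_unique<Node>("DATA");
    data->children.push_back(std::make_unique<Node>("VALUE"));
    root->children.push_back(std::move(data));
    root->children.push_back(std::make_unique<Node>("DATA"));
    root->children.push_back(std::make_unique<Node>("FOOTER"));
    return root;
}
```

Calling copyFiltered(*makeSampleTree()) on the six-node sample tree yields a copy that contains only ROOT, HEADER and FOOTER. A real parser filter makes the same decision while the tree is being built rather than on a finished copy, so this model says nothing about the memory behaviour Alberto describes.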
