xerces-c-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mirko Braun" <mirko.br...@gmx.de>
Subject method startElement() from class DOMLSParserFilter
Date Thu, 03 Sep 2009 19:06:47 GMT
Hello everybody,

i would like to parse a quite large XML file (about 180 MB).
I used the DOM interface because i need the tree for further
processing of the data the xml file contains. Of course there
is a lot of memory used during parsing the file and i got an
"Out of memory" exception. 

I noticed that a class DOMLSParserFilter comes along wiht Xercesc C++ 3.0.1 (Win32), which
makes it possible to filter the Nodes during parsing.
That is perfect for me because one XML-Element in my large file
contains most of the data. This XML-Element is called DATA and
appears serveral time in my XML file.
So i had the idea to reject this XML-Element from the DOM tree
during parsing to reduce the used memory by using the method
startElement() of the DOMLSParserFilter class. After that i would
use a SAX parser and just get all XML-Elements DATA with their values.
But it does not work.
I integregated my code into the DOMPrint example which comes along
with Xercesc C++ 3.0.1. The following error message occurred: 

DOM Error during parsing: 'C:\Daten\2009-08-07_NewXercesc\3_0_1\xerces-c-3.0.1\Build\Win32\VC6\Debug\MyXML.xml'
DOMException code is:  3
Message is: attempt is made to insert a node where it is not permitted

Did i misunderstand the functionality of the DOMLSParserFilter class
and its method startElement?
It is possible to realize my idea with the help of this class? Did
i something wrong with in my code (please have a look below)?

I would be very grateful for any help.

Thanks in advanced,


class DOMParserFilter : public DOMLSParserFilter {

  DOMParserFilter(DOMNodeFilter::ShowType whatToShow = DOMNodeFilter::SHOW_ALL);

    virtual FilterAction startElement(DOMElement* node);
    virtual FilterAction acceptNode(DOMNode* node){return DOMParserFilter::FILTER_ACCEPT;};
    virtual DOMNodeFilter::ShowType getWhatToShow() const {return fWhatToShow;};

    DOMNodeFilter::ShowType fWhatToShow;


DOMParserFilter::DOMParserFilter(DOMNodeFilter::ShowType whatToShow)

DOMParserFilter::FilterAction DOMParserFilter::startElement(DOMElement* node)
  // for element whose name is "DATA", skip it
  if (XMLString::compareString(node->getNodeName(), element_data)==0)
    return DOMParserFilter::FILTER_REJECT;
    return DOMParserFilter::FILTER_ACCEPT;


static const XMLCh gLS[] = { xercesc::chLatin_L, xercesc::chLatin_S, xercesc::chNull };

xercesc::DOMImplementation *implParser = xercesc::DOMImplementationRegistry::getDOMImplementation(gLS);

xercesc::DOMLSParser* parser = ((xercesc::DOMImplementationLS*)implParser)->createLSParser(xercesc::DOMImplementationLS::MODE_SYNCHRONOUS,

DOMTreeErrorReporter *errReporter = new DOMTreeErrorReporter();
parser->getDomConfig()->setParameter(xercesc::XMLUni::fgDOMErrorHandler, errReporter);
DOMParserFilter * pDOMParserFilter = new DOMParserFilter();

    //  Parse the XML file, catching any XML exceptions that might propogate
    //  out of it.
    bool errorsOccured = false;
    DOMDocument *doc = NULL;

      doc = parser->parseURI(gXmlFile);
    catch (const OutOfMemoryException&)
        XERCES_STD_QUALIFIER cerr << "OutOfMemoryException" << XERCES_STD_QUALIFIER
        errorsOccured = true;
    catch (const XMLException& e)
        XERCES_STD_QUALIFIER cerr << "An error occurred during parsing\n   Message:
             << StrX(e.getMessage()) << XERCES_STD_QUALIFIER endl;
        errorsOccured = true;

    catch (const DOMException& e)
      const unsigned int maxChars = 2047;
      XMLCh errText[maxChars + 1];

      XERCES_STD_QUALIFIER cerr << "\nDOM Error during parsing: '" << gXmlFile
<< "'\n"
           << "DOMException code is:  " << e.code << XERCES_STD_QUALIFIER

      if (DOMImplementation::loadDOMExceptionMsg(e.code, errText, maxChars))
           XERCES_STD_QUALIFIER cerr << "Message is: " << StrX(errText) <<

      errorsOccured = true;

    catch (...)
        XERCES_STD_QUALIFIER cerr << "An error occurred during parsing\n " <<
        errorsOccured = true;

View raw message