xml-general mailing list archives

From Dave Raskin <dras...@rimage.com>
Subject Xerces-C user question
Date Fri, 23 Aug 2002 13:44:24 GMT
I didn't see a specific mailing list for Xerces-C, so I am sending my
question here, please help!
I am using Xerces 1.7, C++ version, on Windows and am having a problem parsing
a Unicode XML string with the SAX parser.
I have no problems doing this when my code is compiled for _MBCS. 
When my code is compiled for _UNICODE, the parser never calls
resolveEntity() or startElement(), or anything else except startDocument()
and endDocument().
I stepped through the Xerces code and it seems that somewhere along the way
extra null bytes are added after each character (which is already 2 bytes
long in Unicode).  After this, the tokenizer routines detect a premature EOF and
the code returns without parsing any more of the document.
Is this a bug in Xerces, or am I doing something wrong?  Here's a sample of
my code (handler is an extension of DefaultHandler):
 SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
 parser->setFeature(L"http://xml.org/sax/features/validation", true);

 try
 {
      errorCount = 0;
      int tcharSize = sizeof(_TCHAR);
      int length = _tcslen(xmlString) * tcharSize;
      MemBufInputSource inStr((XMLByte*)xmlString, length, _T("fakeBufId"));
      parser->parse(inStr);
      errorCount = parser->getErrorCount();
 }
 catch (const XMLException& e)
 {
      throw e;
 }
 catch (const SAXParseException& e)
 {
      throw e;
 }
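
The symptom described above (an extra null after each character, then a premature EOF) is what one would expect to see if a UTF-16 buffer were read one byte at a time as though it were single-byte text. The following is only a minimal standalone sketch of that effect, not a claim about what Xerces does internally. The helper names utf16le_bytes, widen_bytes and example are hypothetical, and the byte order is assumed to be little-endian, as it is for wide strings on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Serialize UTF-16 code units to bytes in little-endian order, which is
// assumed here to match the in-memory layout of a wide string on Windows.
std::vector<unsigned char> utf16le_bytes(const std::u16string& text) {
    std::vector<unsigned char> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t unit : text) {
        bytes.push_back(static_cast<unsigned char>(unit & 0xFF));  // low byte
        bytes.push_back(static_cast<unsigned char>(unit >> 8));    // high byte
    }
    return bytes;
}

// Treat every byte as a complete character and widen it to 16 bits, the way
// a decoder for a single-byte encoding would.
std::u16string widen_bytes(const std::vector<unsigned char>& bytes) {
    std::u16string out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Usage sketch: an ASCII-only document comes back twice as long, with a
// null code unit after every real character.
void example() {
    const std::u16string xml = u"<a/>";
    const std::u16string seen = widen_bytes(utf16le_bytes(xml));
    assert(seen.size() == 2 * xml.size());
    assert(seen[0] == u'<');
    assert(seen[1] == u'\0');
}
```

A scanner that stops at the first null character would see only the opening "<" of such a buffer, which would be consistent with the early end of input reported here.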

thanks, any and all help is appreciated!
Dave Raskin
Rimage Corporation
