xerces-c-users mailing list archives

From "Freimann, Mario" <mario.freim...@siemens.com>
Subject AW: Reading german umlaute from xml file
Date Mon, 28 Jul 2008 08:06:08 GMT
Dear David,

Thanks for the hint. After switching to transcodeTo() and using ISO-8859-1, the transcoding
works like a charm.

One related question: I want to write an XML document with the following content:

<?xml version="1.0" encoding="UTF-16" standalone="no" ?><xtestroot><test>M&#252;nchen</test></xtestroot>

The code I use looks like the following:


        utf8Transcoder = XMLPlatformUtils::fgTransService->makeNewTranscoderFor("ISO-8859-1", failReason, 16*1024);
        DOMImplementation* _impl2 = DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));
        DOMDocument* _doc2 = _impl2->createDocument(0, XMLString::transcode("xtestroot"),
        DOMElement* _rootElem2 = _doc2->getDocumentElement();

        DOMElement* nextNode2 = _doc2->createElement(XMLString::transcode("test"));
        char* xData = "München";
        XMLCh* xOut = new XMLCh[ strlen(xData) * 6 + 1 ];
        int xRet = utf8Transcoder->transcodeFrom( (const XMLByte*)xData, strlen(xData), xOut, strlen(xData) * 6 + 1, eaten, charSizes );
        printf("debug xData [%s], eaten [%d], charSizes [%s], xRet [%d], xOut [%s]", xData, eaten, charSizes, xRet, xOut );
        DOMText* nodeData2 = _doc2->createTextNode( xOut );

        DOMWriter* theSerializer2 = ((DOMImplementationLS*)_impl2)->createDOMWriter();
        char* text2 = XMLString::transcode(theSerializer2->writeToString(*_doc2));

The debugging printf shows that transcodeFrom() did its work. But unfortunately, printing
the text2 variable doesn't give anything. I don't understand why the text isn't printed. Any
help is appreciated.


debug xData [München], eaten [7], charSizes [], xRet [7], xOut [M]
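
A possible reading of the xOut [M] in the debug line above, assuming XMLCh is a 16-bit type on this platform: printf with %s reads the UTF-16 buffer byte by byte and stops at the first zero byte. On a little-endian machine 'M' (0x004D) is stored as 0x4D 0x00, so only "M" is shown even when all seven characters were transcoded. The following is a minimal standalone sketch of that effect. It uses char16_t as a stand-in for XMLCh, the helper names are hypothetical, and nothing in it is Xerces API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: number of UTF-16 code units before the terminating zero unit.
std::size_t codeUnitCount(const char16_t* text) {
    assert(text != nullptr);
    return std::char_traits<char16_t>::length(text);
}

// Hypothetical helper: number of bytes a "%s"-style reader would consume.
// It stops at the first zero BYTE, not at the first zero code unit.
std::size_t bytesSeenAsCString(const char16_t* text) {
    assert(text != nullptr);
    return std::strlen(reinterpret_cast<const char*>(text));
}
```

For u"M\u00FCnchen", codeUnitCount() returns 7, while bytesSeenAsCString() returns 1 on a little-endian machine (0 on a big-endian one). So the debug output by itself does not show that the transcoded buffer is truncated.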

With kind regards,

Mario Freimann

-----Original Message-----
From: David Bertoni [mailto:dbertoni@apache.org]
Sent: Friday, 25 July 2008 21:57
To: c-users@xerces.apache.org
Subject: Re: Reading german umlaute from xml file

Freimann, Mario wrote:
> Dear mailing list users,
> I have an XML file I want to process with Xerces. I now have a problem with German umlauts.
> I extracted the code I use to show the problem. After reading the mailing list I switched
> from XMLString::transcode to a UTF-8 transcoder, but it doesn't work either. The problematic
> platform is Red Hat Enterprise Linux (3 32-bit and 5 64-bit). XMLString::transcode works
> well on a Solaris 10 platform, but the transcoder doesn't work there either. Xerces version is 2.7 linked
>         utf8Transcoder = XMLPlatformUtils::fgTransService->makeNewTranscoderFor("UTF-8", failReason, 16*1024);
>         utf8Transcoder->transcodeFrom( 
> (XMLByte*)childNode->getNodeValue(), XMLString::stringLen( 
> childNode->getNodeValue() ), xmlChars, 1024, eaten, charSizes );

If you're trying to transcode from UTF-16, which is what you have, to UTF-8, you need to call
"transcodeTo()" not "transcodeFrom()".
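
To illustrate the direction involved: the "to" direction takes UTF-16 code units and produces bytes in an external encoding such as UTF-8, while the "from" direction goes the other way. The following is a minimal standalone sketch of the UTF-16 to UTF-8 direction only. The function utf16ToUtf8 is a hypothetical helper written for illustration, it uses std::u16string as a stand-in for XMLCh data, and it is not the Xerces implementation of transcodeTo().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode UTF-16 code units as UTF-8 bytes
// (the "to external encoding" direction).
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            // Unpaired surrogate: substitute U+FFFD REPLACEMENT CHARACTER.
            cp = 0xFFFD;
        }
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {                       // 1 byte
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this sketch, the seven code units of u"M\u00FCnchen" become eight bytes, because U+00FC is encoded as the two bytes 0xC3 0xBC.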

