xerces-c-users mailing list archives

From "Freimann, Mario" <mario.freim...@siemens.com>
Subject AW: Reading german umlaute from xml file
Date Mon, 28 Jul 2008 08:06:08 GMT
Dear David,

thanks for the hint. After switching to transcodeTo() and using ISO-8859-1, the transcoding works
like a charm.

One related question: I want to write an XML document with the content

[[xml example]]

<?xml version="1.0" encoding="UTF-16" standalone="no" ?><xtestroot><test>M&#252;nchen</test></xtestroot>

The code I use looks like the following:

[[code]]

        utf8Transcoder = XMLPlatformUtils::fgTransService->makeNewTranscoderFor("ISO-8859-1", failReason, 16*1024);
        DOMImplementation* _impl2 = DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("LS"));
        DOMDocument* _doc2 = _impl2->createDocument(0, XMLString::transcode("xtestroot"), 0);
        DOMElement* _rootElem2 = _doc2->getDocumentElement();

        DOMElement* nextNode2 = _doc2->createElement(XMLString::transcode("test"));
        _rootElem2->appendChild(nextNode2);
        char* xData = "München";
        XMLCh* xOut = new XMLCh[ strlen(xData) * 6 + 1 ];
        int xRet = utf8Transcoder->transcodeFrom( (const XMLByte*)xData, strlen(xData), xOut, strlen(xData) * 6 + 1, eaten, charSizes );
        printf("debug xData [%s], eaten [%d], charSizes [%s], xRet [%d], xOut [%s]", xData, eaten, charSizes, xRet, xOut );
        DOMText* nodeData2 = _doc2->createTextNode( xOut );
        nextNode2->appendChild(nodeData2);

        DOMWriter* theSerializer2 = ((DOMImplementationLS*)_impl2)->createDOMWriter();
        char* text2 = XMLString::transcode(theSerializer2->writeToString(*_doc2));
        printf(text2);
        printf("\n");
        XMLString::release(&text2);

The debugging printf shows that transcodeFrom() did its work. Unfortunately, printing
the text2 variable produces no output, and I don't understand why. Any
help is appreciated.

[[output]]

debug xData [München], eaten [7], charSizes [], xRet [7], xOut [M]
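A note on the "xOut [M]" in the debug line above: XMLCh strings are UTF-16, so passing one to printf's %s makes printf read the buffer as a NUL-terminated char string. The following is a minimal sketch in standard C++ only, not Xerces code. The helper names utf16le_bytes and visible_with_percent_s are made up for illustration, and a little-endian memory layout (as on x86 Linux) is assumed.

```cpp
#include <string>

// Hypothetical helper: lay out UTF-16 code units as little-endian bytes,
// the way a 16-bit character buffer sits in memory on an x86 machine.
std::string utf16le_bytes(const std::u16string& s) {
    std::string bytes;
    for (char16_t u : s) {
        bytes.push_back(static_cast<char>(u & 0xFF));         // low byte first
        bytes.push_back(static_cast<char>((u >> 8) & 0xFF));  // then high byte
    }
    return bytes;
}

// Hypothetical helper: the part of the buffer that a "%s" conversion would
// print, i.e. everything up to the first zero byte.
std::string visible_with_percent_s(const std::string& raw) {
    return std::string(raw.c_str());
}
```

For "München" the buffer holds 7 code units (14 bytes), but 'M' is stored as 0x4D 0x00, so a %s conversion stops after the first byte. Under these assumptions the single "M" says nothing about whether the transcoding itself succeeded.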


With kind regards,

Mario Freimann


-----Original Message-----
From: David Bertoni [mailto:dbertoni@apache.org]
Sent: Friday, 25 July 2008 21:57
To: c-users@xerces.apache.org
Subject: Re: Reading german umlaute from xml file

Freimann, Mario wrote:
> Dear mailing list users,
> 
> I have an XML file I want to process with Xerces. I now have a problem with German umlauts.
> I extracted the code I use to show the problem. After reading the mailing list I switched
> from XMLString::transcode to a UTF-8 transcoder, but it doesn't work either. The problematic
> platform is Red Hat Enterprise Linux (3 32-bit and 5 64-bit). XMLString::transcode works
> well on a Solaris 10 platform, but the transcoder doesn't work there either. The Xerces
> version is 2.7, linked statically.
...
> 
>         utf8Transcoder = XMLPlatformUtils::fgTransService->makeNewTranscoderFor("UTF-8", failReason, 16*1024);
>         utf8Transcoder->transcodeFrom(
>             (XMLByte*)childNode->getNodeValue(),
>             XMLString::stringLen( childNode->getNodeValue() ),
>             xmlChars, 1024, eaten, charSizes );

If you're trying to transcode from UTF-16, which is what you have, to UTF-8, you need to call
"transcodeTo()" not "transcodeFrom()".

Dave
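
The two directions can be illustrated without Xerces. In Xerces terms, transcodeFrom() converts bytes in the transcoder's external encoding into XMLCh (UTF-16), and transcodeTo() converts XMLCh into that external encoding. The sketch below is plain standard C++ and only mimics those two directions. The function names latin1_to_utf16 and utf16_to_utf8 are hypothetical and are not part of the Xerces API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// "transcodeFrom" direction: external bytes -> UTF-16 code units.
// ISO-8859-1 maps each byte to the code point with the same value.
std::u16string latin1_to_utf16(const std::string& in) {
    std::u16string out;
    for (char c : in) {
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    return out;
}

// "transcodeTo" direction: UTF-16 code units -> external bytes (UTF-8 here).
// Unpaired surrogates are not rejected in this sketch.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        // Combine a high/low surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

With these helpers, the ISO-8859-1 bytes of "München" (7 bytes, ü as 0xFC) become 7 UTF-16 code units, and converting those to UTF-8 yields 8 bytes because ü becomes 0xC3 0xBC.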
