xerces-c-users mailing list archives

From David Bertoni <dbert...@apache.org>
Subject Re: XMLString Class and ordinary C-strings
Date Thu, 30 Jul 2009 17:32:48 GMT
Rackl, Robert G wrote:
> Using the DOMCount sample as a guide, I am developing a module with a
> bunch of subroutines to parse and validate an XML file, and then to
> extract data from it. The subroutines are to be called from Fortran, and
> all output to cerr is to be converted to go to status strings in the
> calls from Fortran. I can parse/validate the document by a call from
> Fortran. I am stuck at converting ordinary plain ASCII null-terminated
> C-strings to XMLStrings and vice versa. How is that done, please?
You generally shouldn't convert the XML document data to UTF-16 yourself. 
Instead, let the parser determine the encoding from the data itself.

For output, you should probably convert all UTF-16 strings to UTF-8, to 
preserve data fidelity.  This is usually not a problem, because UTF-8 is 
backward compatible with US-ASCII: plain ASCII text has the same bytes in 
both encodings.

You might have some issues displaying some strings to the console if 
those strings have characters that aren't supported by the current code 
page, but it's the best compromise to maintain data fidelity.

To transcode between UTF-16 and UTF-8, you need a UTF-8 transcoder. 
You should search the archives of the user and developer lists, where 
you will find other emails that discuss this topic, along with code snippets.
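
To give a rough idea of what such a transcoder does, here is a minimal, 
standalone sketch in standard C++. It does not use the Xerces-C API at all: 
std::u16string merely stands in for a buffer of 16-bit XMLCh code units, and 
the function names utf16_to_utf8 and utf8_to_utf16 are made up for this 
illustration. It replaces unpaired surrogates and obviously malformed bytes 
with U+FFFD, but it does not reject overlong forms or out-of-range values, 
so in real code you would likely prefer the transcoding service that ships 
with the library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode UTF-16 code units as UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }
        if (cp < 0x80) {
            // US-ASCII range: one byte, identical to the ASCII value.
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Invalid lead bytes, bad continuation bytes, and truncated sequences
// produce U+FFFD; overlong and out-of-range forms are NOT checked.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out += char16_t(0xFFFD); ++i; continue; }

        bool ok = (i + extra < in.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out += char16_t(0xFFFD); ++i; continue; }
        i += extra + 1;

        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            // Outside the BMP: split into a surrogate pair.
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

For plain ASCII input, such as the status strings you want to hand back to 
Fortran, both functions copy each character across unchanged, which is why 
UTF-8 output is normally safe for ASCII-only callers.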

