axis-c-dev mailing list archives

From Susantha Kumara <susan...@opensource.lk>
Subject Re: platform support: internationalization and EBCDIC vs ASCII
Date Wed, 22 Dec 2004 11:12:03 GMT
John Hawkins wrote:

>Which string literals are we talking about here, please?
>
>John Hawkins
>
Hard-coded string literals, like the ones we have in SoapEnvVersions.cpp and
in the serializing functions of the SoapXXX.cpp files.

Susantha.

>Susantha Kumara <susantha@opensource.lk> wrote on 22/12/2004 10:34:30:
>
>>Nadir Amra wrote:
>>
>>>Correct me if I am wrong... and sorry for the long note, but it is
>>>necessary.
>>>
>>>The AXIS code has a restriction that the locale of the process must be
>>>UTF-8: it assumes everything is in UTF-8.  Thus the code works
>>>specifically in processes where the locale is set to UTF-8 or to a
>>>single-byte ASCII-based character set such as the Latin-1 locales (since
>>>the ASCII range of such a character set is a subset of UTF-8).  For
>>>locales that are neither single-byte nor UTF-8, the code does not work so
>>>well.  Obviously, the code does not work on EBCDIC-based systems such as
>>>OS/400.
>>>
>>>I need this restriction removed in version 1.5.
>>>
>>+1
>>
>>>To remove the restriction, the code needs to be sensitive to the locale
>>>of the process that the client is running in.  It should assume that any
>>>data received from the client that is to be passed to a web service is in
>>>the character set of the locale, and thus needs to be converted to UTF-8.
>>>Similarly, any data received from the web service needs to be converted
>>>to the character set of the running process, since the various C-runtime
>>>string functions depend on the locale of the process in order to work
>>>properly.
>>>
>>>The XML parsers can handle the data coming in from the web service no
>>>matter what the encoding, so there is no problem on that side of things.
>>>I am assuming the data obtained by the XML parser is being transcoded to
>>>UTF-8.
>>>
>>>In addition, there are hard-coded literal strings that are assumed to be
>>>in ASCII.  This would also need to be changed.
>>>
>>>I plan on spending a lot of time in the next 4 weeks to get the
>>>infrastructure built into the code to allow it to run on OS/400.
>>>Hopefully, the work I put in can easily be extended to other platforms,
>>>so that if someone wanted to run in a Japanese locale, it would work with
>>>minor changes.
>>>
>>>My thought is that a user can indicate whether transcoding should be
>>>enabled via a configuration property in the property file.  When it is
>>>enabled, the code will create transcoders to convert data from the locale
>>>of the process to UTF-8 and from UTF-8 to the locale of the process.  I
>>>still have to investigate whether it is possible to use the XML parser
>>>transcoders for this.  I am looking for direction from you all on what a
>>>good implementation would be and where in the code you think this support
>>>would need to be added.
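A minimal sketch of the transcoder pair described above, assuming for illustration that the process locale is Latin-1. The function names (latin1ToUtf8, utf8ToLatin1, toWire) and the boolean standing in for the configuration property are hypothetical and are not part of the Axis code base; a real implementation would more likely delegate to the XML parser's transcoders or to a platform conversion service.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode each Latin-1 byte as UTF-8.
// Bytes below 0x80 are copied; the rest become two-byte sequences.
std::string latin1ToUtf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// Hypothetical inverse: decode UTF-8 back to Latin-1.
// Code points that do not fit in Latin-1 are replaced with '?'.
std::string utf8ToLatin1(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
            ++i;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size() &&
                   (static_cast<unsigned char>(in[i + 1]) & 0xC0) == 0x80) {
            unsigned char next = static_cast<unsigned char>(in[i + 1]);
            out.push_back(static_cast<char>(((c & 0x03) << 6) | (next & 0x3F)));
            i += 2;
        } else {
            // Not representable (or malformed): emit '?' and skip the
            // continuation bytes that belong to this sequence.
            out.push_back('?');
            ++i;
            while (i < in.size() &&
                   (static_cast<unsigned char>(in[i]) & 0xC0) == 0x80) {
                ++i;
            }
        }
    }
    return out;
}

// Outbound path: transcode only when the (assumed) property enables it.
std::string toWire(const std::string& localeText, bool transcodingEnabled) {
    return transcodingEnabled ? latin1ToUtf8(localeText) : localeText;
}
```

With transcoding disabled the data passes through untouched, which matches the current behaviour for UTF-8 locales; the inbound path would call utf8ToLatin1 under the same switch.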
>>>
>>>As for the literal strings that should be in the Latin-1 character set,
>>>this is easily worked around by putting the string in a buffer and
>>>converting it with the PLATFORM_STRTOASC() macro (currently in each
>>>PlatformSpecificXXXX.hpp file).  For ASCII-based systems, these macros are
>>>identity macros.  In addition, if data in a buffer is known to be in the
>>>Latin-1 character set and needs to be converted to the character set of
>>>the process, PLATFORM_ASCTOSTR() can be used.  Again, for ASCII-based
>>>systems, these macros are identity macros.  I plan on doing this as a
>>>first stage, which should be a benign change.
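The macro idea can be sketched roughly as follows. This is an illustration only: DEMO_STRTOASC plays the role of PLATFORM_STRTOASC(), but its definition, the DEMO_EBCDIC_HOST flag, and the helper functions are assumptions, not the contents of the real PlatformSpecificXXXX.hpp files. The conversion table is a deliberately partial CCSID 37 (EBCDIC) map that covers letters, digits and a few XML punctuation characters.

```cpp
#include <string>

// Partial CCSID 37 (EBCDIC) to ASCII mapping for one character.
// Hypothetical demo helper; a real table would cover all 256 code points.
// The arithmetic on 'A', 'a', '0' assumes this demo is compiled on an
// ASCII host, where those ranges are contiguous.
inline char demoEbcdicToAsciiChar(unsigned char e) {
    if (e >= 0xC1 && e <= 0xC9) return static_cast<char>('A' + (e - 0xC1));
    if (e >= 0xD1 && e <= 0xD9) return static_cast<char>('J' + (e - 0xD1));
    if (e >= 0xE2 && e <= 0xE9) return static_cast<char>('S' + (e - 0xE2));
    if (e >= 0x81 && e <= 0x89) return static_cast<char>('a' + (e - 0x81));
    if (e >= 0x91 && e <= 0x99) return static_cast<char>('j' + (e - 0x91));
    if (e >= 0xA2 && e <= 0xA9) return static_cast<char>('s' + (e - 0xA2));
    if (e >= 0xF0 && e <= 0xF9) return static_cast<char>('0' + (e - 0xF0));
    switch (e) {
        case 0x40: return ' ';
        case 0x4C: return '<';
        case 0x6E: return '>';
        case 0x61: return '/';
        case 0x7A: return ':';
        case 0x7E: return '=';
        case 0x7F: return '"';
        default:   return '?';  // not covered by this partial table
    }
}

// Convert a NUL-terminated buffer in place and return it, so the call
// can be used wherever the original buffer expression was used.
inline char* demoStrToAsc(char* buf) {
    for (char* p = buf; *p != '\0'; ++p) {
        *p = demoEbcdicToAsciiChar(static_cast<unsigned char>(*p));
    }
    return buf;
}

// On an EBCDIC host the macro converts; everywhere else it is an
// identity macro, so ASCII builds pay nothing for it.
#ifdef DEMO_EBCDIC_HOST
#define DEMO_STRTOASC(x) demoStrToAsc(x)
#else
#define DEMO_STRTOASC(x) (x)
#endif
```

A macro for the opposite direction, in the role of PLATFORM_ASCTOSTR(), would follow the same pattern with the inverse table.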
>>>
>>>What are your thoughts?
>>>
>>Two thoughts come to mind:
>>
>>1. We have only one Axis build (no different builds for each locale). We
>>decide the locale at runtime (depending on configuration file settings)
>>and do the transcoding at the parser level, using the parser's
>>capabilities. But if we do this, how do we handle the hard-coded string
>>literals? (We cannot convert those string literals at runtime, can we?)
>>One way to handle them in this case is to convert the string literals
>>from ASCII to the runtime locale at the startup of Axis (probably inside
>>the initialize function).
>>
>>2. The alternative is to build Axis for different locales and let the
>>installer decide which one to install (we can have UTF-8 as the
>>default). I think this way Axis performance is not affected, as there
>>is no runtime transcoding (other than in the parser layer).
>>
>>Regards,
>>
>>Susantha.