xerces-c-dev mailing list archives

From "Alberto Massari (JIRA)" <xerces-c-...@xml.apache.org>
Subject [jira] Updated: (XERCESC-1092) Win32Transcoder does not properly transcode ISO-8859-2 and other encodings
Date Tue, 02 Nov 2004 14:01:53 GMT
     [ http://nagoya.apache.org/jira/browse/XERCESC-1092?page=history ]

Alberto Massari updated XERCESC-1092:
-------------------------------------

    Priority: Major

> Win32Transcoder does not properly transcode ISO-8859-2 and other encodings
> --------------------------------------------------------------------------
>
>          Key: XERCESC-1092
>          URL: http://nagoya.apache.org/jira/browse/XERCESC-1092
>      Project: Xerces-C++
>         Type: Bug
>   Components: Utilities
>     Versions: 2.4.0
>  Environment: Operating System: Windows XP
> Platform: PC
>     Reporter: Janus Drozd
>     Assignee: Xerces-C Developers Mailing List
>  Attachments: Win32TransService.cpp
>
> Win32TransService scans the Windows registry for supported charsets and reads 
> the "Codepage" and "InternetEncoding" values. For many charsets these values 
> are equal, but not for all.
> When a Win32Transcoder object is created for a given charset, the "Codepage" 
> value is stored in the fWinCP member and the "InternetEncoding" value in the 
> fIECP member. Win32Transcoder methods use the fWinCP value and pass it to the 
> Windows API functions like ::MultiByteToWideChar. This is wrong. The fIECP 
> value should be used instead.
> For example, when transcoding from ISO-8859-2, fWinCP is 1250 and fIECP is 
> 28592. Win32Transcoder::transcodeFrom(...) calls 
> ::MultiByteToWideChar(1250, ...). This transcodes from the Windows-1250 
> code page, not from ISO-8859-2, so the result is wrong.
> The proposed patch:
> Replace fWinCP with fIECP in all calls of Windows API functions in all 
> Win32Transcoder methods.
> In Win32Transcoder::transcodeFrom:
> ...............
>   const unsigned int toEat = ::IsDBCSLeadByteEx(fIECP, *inPtr) ? 2 : 1;
>   // Make sure a whole char is in the source
>   if (inPtr + toEat > inEnd)
>       break;
>   // Try to translate this next char and check for an error
>   const unsigned int converted = ::MultiByteToWideChar
>   (fIECP, MB_PRECOMPOSED | MB_ERR_INVALID_CHARS, (const char*)inPtr, toEat,
>    outPtr, 1);
> ...............
> In Win32Transcoder::transcodeTo:
> ...............
>   const unsigned int bytesStored = ::WideCharToMultiByte
>   (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcPtr, 1, (char*)outPtr,
>    outEnd - outPtr, 0, &usedDef);
> ...............
> In Win32Transcoder::canTranscodeTo:
> ...............
>   const unsigned int bytesStored = ::WideCharToMultiByte
>   (fIECP, WC_COMPOSITECHECK | WC_SEPCHARS, srcBuf, srcCount, tmpBuf, 64, 0,
>    &usedDef);
> ...............
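
To illustrate why the choice of code page matters, here is a minimal, portable
sketch (not Xerces-C++ code and not part of the attached patch). It hard-codes
a few byte mappings from the published ISO-8859-2 and Windows-1250 tables. The
helper decodeByte and the constants kWindows1250 and kIso8859_2 are
hypothetical names introduced only for this example; the numeric values 1250
and 28592 are the ones quoted in the report. The same byte decodes to
different characters depending on which identifier is used, which is the
effect described above when fWinCP is passed instead of fIECP.

```cpp
#include <cstdint>

// Code page identifiers quoted in the report.
const unsigned int kWindows1250 = 1250;   // registry "Codepage" (fWinCP)
const unsigned int kIso8859_2   = 28592;  // registry "InternetEncoding" (fIECP)

// Hypothetical helper: decode a single byte under the given code page.
// Only a few sample bytes are covered; anything else yields U+FFFD.
inline char32_t decodeByte(unsigned int codePage, std::uint8_t b)
{
    if (b < 0x80)
        return b; // the ASCII range is identical in both code pages

    if (codePage == kIso8859_2) {
        switch (b) {
            case 0xA1: return 0x0104; // LATIN CAPITAL LETTER A WITH OGONEK
            case 0xA3: return 0x0141; // LATIN CAPITAL LETTER L WITH STROKE
            case 0xA5: return 0x013D; // LATIN CAPITAL LETTER L WITH CARON
        }
    } else if (codePage == kWindows1250) {
        switch (b) {
            case 0xA1: return 0x02C7; // CARON
            case 0xA3: return 0x0141; // LATIN CAPITAL LETTER L WITH STROKE
            case 0xA5: return 0x0104; // LATIN CAPITAL LETTER A WITH OGONEK
        }
    }
    return 0xFFFD; // REPLACEMENT CHARACTER: byte not in this sample table
}
```

With these tables, the ISO-8859-2 byte 0xA1 decodes to U+0104 under 28592 but
to U+02C7 under 1250, while 0xA3 happens to agree in both. This is consistent
with the report's observation that the two values give matching results for
some input and wrong results for other input.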

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://nagoya.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org

