geronimo-dev mailing list archives

From Rick McGuire <rick...@gmail.com>
Subject Any character encoding experts out there?
Date Thu, 23 Feb 2006 15:55:33 GMT
I'm currently trying to sort out a problem with my implementation of the
MimeUtility class in the JavaMail specs.  For the
encodeWord()/decodeWord() methods, my encoder produces the same value
as the Sun implementation, but the decoding is driving me nuts.  I can
decode the encoded word into what should be the correct byte array, but
when I use that array to instantiate the String value, I get a bogus
character value.

Playing around with this, I've discovered that the problem seems to
occur in the String constructor, and it can be demonstrated without
using the JavaMail code at all.  Here is a little snippet that shows the
problem:

        String longString = "Yada, yada\u0081";

        try {
            // get the bytes using Cp1252
            byte[] bytes = longString.getBytes("Cp1252");

            // create a new string using the same code page
            String newString = new String(bytes, 0, bytes.length, "Cp1252");

            // last char of original is int 129, last char of rebuilt string is int 63
            System.out.println(">>>>> original string = " + longString
                    + " rebuilt string = " + newString);
            System.out.println(">>>>> original last char = "
                    + (int) longString.charAt(longString.length() - 1)
                    + " rebuilt last char = "
                    + (int) newString.charAt(newString.length() - 1));
        } catch (Exception e) {
            // report the failure instead of silently swallowing it
            e.printStackTrace();
        }

63 is the last value in the byte array after the getBytes() call, and
the Sun impl of MimeUtility.encodeWord() returns the string
"=?Cp1252?Q?Yada,_yada=3F?=" (0x3F == 63), so my decoder is extracting
the same byte value that the encoder produced.  I'm at a loss to figure
out why the round trip coded above is corrupting the data.  What am I
missing here?
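
Since 63 already shows up in the byte array right after getBytes(), one
way to narrow this down is to ask the charset's encoder directly whether
it has a mapping for U+0081, and to look at the last encoded byte
without ever going through the String constructor.  The following is
only a minimal sketch of that check, not part of my original test: the
class name Cp1252RoundTrip and the helpers canEncode() and
lastEncodedByte() are made up for illustration, and it assumes the
runtime supports the "Cp1252" charset name.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class Cp1252RoundTrip {

    // Hypothetical helper: asks the named charset's encoder whether it
    // has a mapping for the given character.
    static boolean canEncode(String charsetName, char c) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(c);
    }

    // Hypothetical helper: encodes the text and returns the last byte as
    // an unsigned value (0..255). String.getBytes(Charset) substitutes the
    // charset's replacement byte for characters it cannot map.
    static int lastEncodedByte(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        return bytes[bytes.length - 1] & 0xFF;
    }

    public static void main(String[] args) {
        String longString = "Yada, yada\u0081";

        // If the encoder reports no mapping, the substitution happens on
        // the encoding side, before any decoding takes place.
        System.out.println("can encode U+0081: "
                + canEncode("Cp1252", '\u0081'));
        System.out.println("last encoded byte: "
                + lastEncodedByte(longString, "Cp1252"));
    }
}
```

If canEncode() reports false for U+0081, the 63 ('?') is a substitution
made during encoding, and the decode step is faithfully turning that
byte back into a question mark.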

Rick

