geronimo-dev mailing list archives

From Rick McGuire <rick...@gmail.com>
Subject Re: Any character encoding experts out there?
Date Thu, 23 Feb 2006 18:36:02 GMT
Andy Piper wrote:
> Well 129 == -63 if you coerce it to a byte. I assume that's the cause 
> of your problem, although I'm struggling to see where this is occurring.
Sigh, I didn't push my investigation far enough.  It turns out the Sun 
impl shows the same character loss if you do the complete encode/decode 
round trip using MimeUtility.  My code was working correctly, but I had 
a unit test problem :-)
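
For anyone following along, here is a minimal sketch of why the round trip 
loses the character. It assumes the JVM's Cp1252 charset has no mapping for 
U+0081, which is what the 0x3F in the encoded output suggests. The class and 
method names are made up for illustration and are not part of javamail.

```java
import java.nio.charset.Charset;

public class RoundTripDemo {
    // Encode the text in the named charset, decode it again,
    // and return the last character of the result as an int.
    static int lastCharAfterRoundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        // getBytes() substitutes the charset's replacement byte
        // (normally '?') for any character it cannot map.
        byte[] bytes = text.getBytes(cs);
        String rebuilt = new String(bytes, cs);
        return rebuilt.charAt(rebuilt.length() - 1);
    }

    // Ask the charset's encoder whether it can represent the character at all.
    static boolean canEncode(char c, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        String s = "Yada, yada\u0081";
        System.out.println("Cp1252 can encode U+0081: "
                + canEncode('\u0081', "Cp1252"));
        System.out.println("Cp1252 round trip, last char: "
                + lastCharAfterRoundTrip(s, "Cp1252"));
        System.out.println("ISO-8859-1 round trip, last char: "
                + lastCharAfterRoundTrip(s, "ISO-8859-1"));
    }
}
```

If canEncode() reports false, the substitution happens in getBytes(), before 
the String constructor is ever involved. A unit test for this kind of round 
trip is probably safer with a character the charset can actually represent.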

Rick
>
> andy
>
> At 03:55 PM 2/23/2006, you wrote:
>> I'm currently trying to sort out a problem with my implementation of 
>> the MimeUtility class in the javamail specs.  For the 
>> encodeWord()/decodeWord() methods, my encoding encodes to the same 
>> value as the Sun implementation, but the decoding is driving me 
>> nuts.  I'm able to successfully decode this into what should be the 
>> correct byte[] array, but when used to instantiate the String value, 
>> I'm getting a bogus character value.
>> Playing around with this, I've discovered that the problem seems to 
>> be occurring with the String constructor, and can be demonstrated 
>> without using the javamail code at all.  Here is a little snippet 
>> that shows the problem:
>>
>>       String longString = "Yada, yada\u0081";
>>
>>       try {
>>           // get the bytes using Cp1252
>>           byte[] bytes = longString.getBytes("Cp1252");
>>
>>           // create a new string using the same code page
>>           String newString = new String(bytes, 0, bytes.length, "Cp1252");
>>
>>           // last char of original is int 129, last char of rebuilt
>>           // string is int 63
>>           System.out.println(">>>>> original string = " + longString
>>                   + " rebuilt string = " + newString);
>>           System.out.println(">>>>> original string = "
>>                   + (int) longString.charAt(longString.length() - 1)
>>                   + " rebuilt string = "
>>                   + (int) newString.charAt(newString.length() - 1));
>>       } catch (Exception e) {
>>           e.printStackTrace();   // don't swallow encoding errors silently
>>       }
>>
>> 63 is the last value in the byte array after the getBytes() call, and 
>> the Sun impl of MimeUtility.encodeWord() returns the string
>> "=?Cp1252?Q?Yada,_yada=3F?=" (0x3F == 63), so the correct value is 
>> getting extracted.  I'm at a loss to figure out why the round trip 
>> coded above is corrupting the data.  What am I missing here?
>>
>> Rick
>

