directory-dev mailing list archives

From "Emmanuel Lecharny (JIRA)" <>
Subject [jira] Commented: (DIRSNICKERS-112) DERUTF8String does not read and write UTF-8
Date Wed, 12 Oct 2005 13:58:09 GMT

Emmanuel Lecharny commented on DIRSNICKERS-112:

Thanks Tomi.

Actually, this is almost the same error as DIRSNICKERS-108.

We are aware that there are some problems handling special characters. The code is under heavy
refactoring to address this issue, and I think that the TwixDecoder will be able to solve
it. The bad side of TwixDecoder is that it's a pretty young piece of code, and has
its load of bugs, too.

A test case has been written to test those umlaut characters (0xF6 in ISO-8859-1), and it actually
works well with TwixDecoder.
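
To make the byte values concrete, here is a minimal sketch of why 0xF6 is an umlaut in ISO-8859-1 but not in UTF-8. It does not use TwixDecoder and is not the test case mentioned above; the class and method names are hypothetical, and it relies only on the JDK's own charset support (java.nio.charset.StandardCharsets, which is newer than the code discussed here).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the ASN.1 codebase.
class UmlautDecodeSketch {
    // Decode bytes as ISO-8859-1, where every byte maps to one character.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Decode bytes as UTF-8; malformed input becomes U+FFFD.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xF6 };            // 'ö' in ISO-8859-1
        byte[] utf8 = { (byte) 0xC3, (byte) 0xB6 }; // 'ö' in UTF-8

        // Both decode to U+00F6 when the matching charset is used.
        System.out.println(decodeLatin1(latin1).equals("\u00F6"));
        System.out.println(decodeUtf8(utf8).equals("\u00F6"));

        // The single byte 0xF6 is not valid UTF-8 on its own.
        System.out.println(decodeUtf8(latin1).equals("\uFFFD"));
    }
}
```

A decoder that is supposed to read UTF-8 should therefore be tested with the two-byte form 0xC3 0xB6, not with the single ISO-8859-1 byte.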

So, "work in progress".

-- Emmanuel.

> DERUTF8String does not read and write UTF-8
> -------------------------------------------
>          Key: DIRSNICKERS-112
>          URL:
>      Project: Directory ASN1
>         Type: Bug
>   Components: BER Runtime
>     Versions: 0.3.0
>     Reporter: Tomi Keinonen
>     Assignee: Alex Karasulu

> DERUTF8String class does not serialize Strings to byte arrays correctly. Also byte arrays
> seem to be deserialized to Strings incorrectly.
> For example character 'ö' (U+00F6 Latin Small Letter O With Diaeresis) which should
> be serialized to 0xC3B6 seems to be serialized to 0xF6. DERUTF8String uses only one byte for
> each char. This means only ASCII characters work correctly. Each character which requires
> more than one byte causes invalid byte output.
> Class DERUTF8String seems to use class DERString incorrectly:
>     protected static byte[] stringToByteArray( String string )
>     {
>         char[] characters = string.toCharArray();
>         byte[] bytes = new byte[ characters.length ];
>         for ( int ii = 0; ii < characters.length; ii++ )
>         {
>             bytes[ ii ] = (byte)characters[ ii ];
>         }
>         return bytes;
>     }
> This method first converts its parameter to a char array. Chars are UTF-16 code units and
> cannot be cast to byte directly.
> Conversion should be done using method getBytes with parameter "UTF-8" of class String.
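
The difference between the two approaches can be sketched as follows. This is only an illustration under stated assumptions: the class Utf8EncodingSketch and its method names are hypothetical and are not the actual DERString or DERUTF8String code, and StandardCharsets.UTF_8 is used in place of the "UTF-8" string argument, which behaves the same way on newer JDKs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not the real DERString/DERUTF8String.
class Utf8EncodingSketch {
    // Mirrors the cast-based loop quoted above: keeps only the low 8 bits of each char.
    static byte[] castToBytes(String string) {
        char[] characters = string.toCharArray();
        byte[] bytes = new byte[characters.length];
        for (int i = 0; i < characters.length; i++) {
            bytes[i] = (byte) characters[i];
        }
        return bytes;
    }

    // Suggested direction: let the JDK perform the UTF-8 encoding.
    static byte[] encodeUtf8(String string) {
        return string.getBytes(StandardCharsets.UTF_8);
    }

    // Matching read side: decode the bytes as UTF-8.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Helper to print bytes as upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u00F6"; // 'ö'
        System.out.println(hex(castToBytes(s))); // prints F6
        System.out.println(hex(encodeUtf8(s)));  // prints C3B6
    }
}
```

For pure ASCII input both methods return the same bytes, which is why the problem only shows up with characters above U+007F.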

This message is automatically generated by JIRA.
