directory-dev mailing list archives

From Niclas Hedhman <>
Subject Re: LDAP protocol implementation and data containing accents
Date Wed, 31 Aug 2005 12:40:29 GMT
On Wednesday 31 August 2005 19:25, Enrique Rodriguez wrote:

Duh! I get the feeling you don't understand encodings.


     // needs: import java.io.UnsupportedEncodingException;
     try {
         String string1 = "Jérôme";
         // Each round trip encodes and decodes with the same charset.
         String string2 = new String( "Jérôme".getBytes( "UTF-8" ), "UTF-8" );
         String string3 = new String( "Jérôme".getBytes( "ISO-8859-1" ), "ISO-8859-1" );
         System.out.println( string1.equals( string2 ) );
         System.out.println( string2.equals( string3 ) );
         System.out.println( string3.equals( string1 ) );
     } catch ( UnsupportedEncodingException e ) {
         throw new RuntimeException( e );
     }

In case it isn't clear: a String does not have an encoding; it is a sequence
of Unicode characters (which has very little to do with the UTF-8/UTF-16
encodings). A stream of bytes that represents Unicode characters has an
encoding, and only when you convert that stream of bytes to or from a String
object do you need to apply one. Hence, if you have a byte array and you want
the String constructor to convert it to a String, you need to tell it which
encoding the byte array uses.
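
To make the point about byte arrays concrete, here is a minimal sketch of what happens when the encoding given to the String constructor does not match the one that produced the bytes. The class name EncodingDemo and the helper roundTrip are hypothetical names used only for this illustration, and StandardCharsets requires Java 7 or later, so it postdates this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: encode text to bytes with one charset,
    // then decode those bytes back to a String with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file's encoding.
        String name = "J\u00e9r\u00f4me"; // "Jérôme"

        // The same characters occupy a different number of bytes per encoding.
        System.out.println(name.getBytes(StandardCharsets.UTF_8).length);      // 8
        System.out.println(name.getBytes(StandardCharsets.ISO_8859_1).length); // 6

        // Matching charsets: the original String comes back.
        String same = roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(name)); // true

        // Mismatched charsets: each UTF-8 byte is read as one Latin-1 character.
        String wrong = roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(name)); // false
        System.out.println(wrong.length());     // 8
    }
}
```

The String itself never carries an encoding here; only the two conversion steps do, which is why the mismatch shows up only at the byte boundary.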

