struts-user mailing list archives

From Borut Hadžialić <borut.hadzia...@gmail.com>
Subject Re: Encoding Problem ISO to UTF-8
Date Sat, 14 May 2005 19:28:09 GMT
On 5/14/05, Leon Rosenberg <struts_user@anotheria.net> wrote:
> Hi,
> 
> I have a small encoding problem, which drives me crazy...
> 
> Our complete site is in ISO-8859-1 (which is java-default, as I understand
> it). I mean, the charset of the page is ISO, and meta-tags in HTML are
> telling the browser that the page is ISO too.
> Now the problem, that I have, is that I have to transmit some XML data to
> another system (payment provider) which expects it in UTF-8.
> The problem is that customer name can contain Umlauts (german characters:
> äöü), and they come truncated on the other side:
> 
> Like I'm sending "Ümlaut" and the other side gets �mlaut.
> 
> I tried every conversion method I could think of so far:
> - reinitializing the String as a new String with re-encoding:
>   name = new String(name.getBytes("ISO-8859-1"), "UTF-8")
>   (in all combinations)
> - using URLDecoder to decode parameters
> - using a charset-encoding Writer:
>   OutputStreamWriter writer = new OutputStreamWriter(outStream, "UTF-8")
> - and so on...
> 
> Can anyone give me a hint?
> 
> This problem is slowly driving me crazy....
> 
> regards
> and thanx in advance
> 
> Leon
> 
> 

I recently needed to send some text encoded as UTF-8 over a TCP/IP
socket. I did it like this and it worked:

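import java.io.OutputStream;
import java.nio.*;          // for CharBuffer and ByteBuffer
import java.nio.charset.*;  // for Charset and CharsetEncoder

OutputStream os = .. /* initialized somehow */
String s = "some text to send";  // s can be any CharSequence

CharsetEncoder ce = Charset.forName("UTF-8").newEncoder();

// encode() may return a buffer whose backing array is larger than the
// encoded data, so write only the bytes between position and limit.
ByteBuffer bb = ce.encode(CharBuffer.wrap(s));
os.write(bb.array(), bb.arrayOffset() + bb.position(), bb.remaining());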
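
For reference, a minimal self-contained sketch of the same idea, using a
ByteArrayOutputStream as a stand-in for the real stream. The class name
Utf8Sender and the method send are hypothetical, chosen only for this
illustration; the sample text "Ümlaut" is taken from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
final class Utf8Sender {
    // Encodes text as UTF-8 and writes exactly the encoded bytes to os.
    static void send(OutputStream os, CharSequence text) throws IOException {
        ByteBuffer bb = Charset.forName("UTF-8").newEncoder()
                .encode(CharBuffer.wrap(text));
        // Write only the valid region of the backing array.
        os.write(bb.array(), bb.arrayOffset() + bb.position(), bb.remaining());
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        send(sink, "\u00DCmlaut");  // "Ümlaut", written with an escape
        // Print the encoded bytes in hex so the two-byte form of Ü is visible.
        for (byte b : sink.toByteArray()) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

The Ü (U+00DC) takes two bytes in UTF-8 (C3 9C), so the six-character
string becomes seven bytes on the wire.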

In another (production) project I used similar code, with SocketChannels,
because their write methods take the encoded ByteBuffer directly.
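
A rough sketch of what the channel variant could look like. This is not
the production code mentioned above; the class name Utf8ChannelSender is
hypothetical. It is written against WritableByteChannel, which
SocketChannel implements, so it can be tried without a network
connection.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
final class Utf8ChannelSender {
    // Encodes text as UTF-8 and writes all of it to the channel.
    static void send(WritableByteChannel channel, CharSequence text)
            throws IOException {
        ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(text));
        // A single write() may send only part of the buffer,
        // so loop until nothing remains.
        while (bb.hasRemaining()) {
            channel.write(bb);
        }
    }
}
```

With a real connection one would pass an open SocketChannel; for a quick
check, Channels.newChannel(new ByteArrayOutputStream()) works as a
stand-in.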

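Java Strings and chars are Unicode: a char is a 16-bit UTF-16 code unit
(2 bytes, 0 - 65535), so a String has no ISO-8859-1 or UTF-8 encoding of
its own. An encoding only comes into play when the String is converted
to or from bytes.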
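
To illustrate that point, a small sketch (the class name EncodingDemo and
its methods are hypothetical, and String.getBytes(Charset) assumes Java 6
or later). It shows that the encoding is chosen when bytes are produced,
and that decoding ISO-8859-1 bytes as UTF-8, as in the
new String(name.getBytes("ISO-8859-1"), "UTF-8") attempt from the
question, damages the text instead of converting it.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, for illustration only.
final class EncodingDemo {
    static final Charset ISO = Charset.forName("ISO-8859-1");
    static final Charset UTF8 = Charset.forName("UTF-8");

    // The String itself is encoding-neutral; UTF-8 is chosen here,
    // at the point where bytes are produced.
    static byte[] toUtf8(String s) {
        return s.getBytes(UTF8);
    }

    // Produces ISO-8859-1 bytes and then wrongly decodes them as UTF-8.
    // A lone 0xDC byte is not valid UTF-8, so the decoder substitutes
    // the replacement character U+FFFD for it.
    static String misdecode(String s) {
        return new String(s.getBytes(ISO), UTF8);
    }

    public static void main(String[] args) {
        String name = "\u00DCmlaut";  // "Ümlaut"
        System.out.println(name.getBytes(ISO).length);  // one byte per char
        System.out.println(toUtf8(name).length);        // Ü needs two bytes
        System.out.println(misdecode(name).charAt(0) == '\uFFFD');
    }
}
```

So if the String already holds the right characters, no re-encoding of
the String is needed; it is enough to pick UTF-8 when writing the bytes.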

-- 
Why?
Because YES!
