commons-user mailing list archives

From robert burrell donkin
Subject Re: [Betwixt] UTF-8 / UTF-16
Date Sun, 03 Oct 2004 22:12:41 GMT
hi david

ah, the vexed issue of platform dependent encodings :)

betwixt doesn't address this issue directly but AFAIK it shouldn't 
really need to. betwixt deals only with java strings (which are 
unicode) and leaves all matters of encoding to the output streams. the 
output encoding is limited only by the range of writers available. you 
should be able to output UTF-16 from betwixt (providing that your java 
platform supports it) by configuring the writer appropriately before 
it's passed to the BeanWriter.
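
a minimal sketch of what "configuring the writer appropriately" could look like, using only the standard java.io classes. the class and method names (Utf16WriterSketch, writeAsUtf16) are hypothetical, and the BeanWriter step is only indicated in a comment since it needs betwixt on the classpath:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf16WriterSketch {

    // Pushes the text through a Writer whose charset is named explicitly,
    // so the platform default encoding never comes into play.
    static byte[] writeAsUtf16(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_16)) {
            // assumption: with betwixt, this writer would be handed to the
            // BeanWriter here instead of being written to directly
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = writeAsUtf16("<name>\u65e5\u672c</name>");
        // decoding with the same charset recovers the original string
        System.out.println(new String(bytes, StandardCharsets.UTF_16));
    }
}
```

swapping StandardCharsets.UTF_16 for another charset is the only change needed to target a different encoding.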

at the risk of being pedantic, AIUI your statement about UTF-8 is not 
strictly correct: both UTF-16 and UTF-8 are encodings for Unicode (and 
therefore any character expressible in UTF-16 is also expressible in 
UTF-8). i suspect the problem is that java's default encoding is not 
UTF-8 and is platform dependent. therefore, unless care is taken to 
explicitly specify the appropriate encoding, the output will contain 
platform dependent encodings for some characters (typically the 
non-latin ones).
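
to illustrate the point, here is a small sketch that round-trips one string through several charsets. the class and method names (EncodingRoundTrip, roundTrip) are hypothetical, and US-ASCII is used only as a stand-in for a limited platform default, which is an assumption made for the example:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {

    // Encodes the text to bytes in the given charset, then decodes it again.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String text = "gr\u00fc\u00df \u65e5\u672c\u8a9e"; // latin and non-latin characters

        // both Unicode encodings can represent every character in the string
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));
        System.out.println(roundTrip(text, StandardCharsets.UTF_16).equals(text));

        // a charset that cannot represent these characters substitutes them
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII).equals(text));

        // whatever this prints is what java uses when no charset is specified
        System.out.println("default: " + Charset.defaultCharset());
    }
}
```

the first two checks succeed and the third fails, because characters the charset cannot map are replaced during encoding.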

- robert

On 1 Oct 2004, at 13:58, David Linsin wrote:

> Hello,
> I'd like to know how Betwixt handles UTF-16 character encoding.
> Java's XMLEncoder only handles UTF-8 character encoding. Non-UTF-8
> characters are represented by a platform dependent value. I'd like to
> know how Betwixt handles this.
> Thank you for your help.
> ----------------------
> David Linsin