poi-dev mailing list archives

From Toshiaki Kamoshida <kamoshida.toshi...@future.co.jp>
Subject Re: sheet names and string format read garbled on EBCDIC machine
Date Thu, 10 Apr 2003 04:14:36 GMT

On Wed, 9 Apr 2003 14:59:15 -0400
"Elvira Gurevich" <Elvira_Gurevich@ibi.com> wrote:

> We found out the hard way that the en-us version of excel2000 does not
> encode ALL strings in Unicode, hence this discussion thread. 

Sorry, what I said earlier was not correct.
If a character sequence contains only code points no greater than U+00FF,
it is encoded/decoded the way an "ISO-8859-1" encoder/decoder would do it
in my environment, and perhaps in yours too.
The difference between us is that in my country a sequence is generally
encoded as 16-bit, because most Japanese characters have code points
greater than U+00FF, while in yours it is generally encoded as 8-bit,
because your characters have code points no greater than U+00FF.
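
A minimal sketch of the 8-bit/16-bit distinction described above. The class
and method names are hypothetical and this is not POI's actual code; it also
assumes the 16-bit form is little-endian UTF-16, which is how I understand
Excel to store it.

```java
import java.nio.charset.StandardCharsets;

public class ExcelStringSketch {
    // Hypothetical helper: true if every char fits in one byte (<= U+00FF),
    // i.e. the string could be stored in the 8-bit form.
    static boolean fitsIn8Bit(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x00FF) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: decode raw string bytes according to the
    // 8-bit/16-bit flag, always naming the charset explicitly.
    static String decode(byte[] raw, boolean is16Bit) {
        return is16Bit
                ? new String(raw, StandardCharsets.UTF_16LE)    // assumed byte order
                : new String(raw, StandardCharsets.ISO_8859_1); // one byte per char
    }

    public static void main(String[] args) {
        System.out.println(fitsIn8Bit("Sheet1"));       // Latin-only sheet name
        System.out.println(fitsIn8Bit("\u30b7\u30fc\u30c81")); // Japanese sheet name
        System.out.println(decode(new byte[] {0x53, 0x68}, false));
    }
}
```

With this scheme a sheet name such as "Sheet1" takes the 8-bit path on any
locale, while a Japanese name takes the 16-bit path.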

I guess there is no difference in the Excel spec between the EN locale
and others. Character sequences are handled all over the world the way
POI handles them now (if that were not so, there would be a lot of
reports in Bugzilla... POI is used in many countries... GREAT!
Thanks a lot, developers :D!).

I feel the problem is only a mismatch in the choice of encoder/decoder
when a character sequence is encoded with "ISO-8859-1".
It is not a blocker bug in the spec, just a small mistake:
the wrong encoder/decoder is selected.
The one that gets selected follows the default encoding, which depends
on platform and locale, because no encoding is specified explicitly
when the String object is created.
(Strictly speaking, calling a java.lang.String constructor with a byte
array and no encoding name means the module is not "100% pure Java",
because the behavior depends on the OS and locale...)
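
To illustrate the point, here is a small sketch that contrasts the two
String constructors. The class and method names are made up for this
example, and it uses StandardCharsets from newer Java versions; on older
JVMs the equivalent is new String(bytes, "ISO-8859-1"). The EBCDIC charset
name "IBM1047" is only an assumed example and is skipped if the JVM does
not provide it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetSketch {
    // Platform-dependent: uses whatever the JVM default charset is.
    static String decodeWithDefault(byte[] raw) {
        return new String(raw);
    }

    // Platform-independent: the charset is named explicitly.
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The bytes of "Sheet1" as they would appear in 8-bit form.
        byte[] raw = {0x53, 0x68, 0x65, 0x65, 0x74, 0x31};

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default decode:  " + decodeWithDefault(raw));
        System.out.println("explicit decode: " + decodeLatin1(raw));

        // Simulate an EBCDIC default, if this JVM ships such a charset.
        if (Charset.isSupported("IBM1047")) {
            String garbled = new String(raw, Charset.forName("IBM1047"));
            System.out.println("EBCDIC decode:   " + garbled);
        }
    }
}
```

On an ASCII-compatible platform the first two decodes agree, which is why
the mistake goes unnoticed there; only the explicit form gives the same
result on an EBCDIC machine.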

Is this the same as what Mr. Andy and the developers think, said, or
want to say?
And... am I being "nosy"?

> I see your reference to the break points in BoundSheetRecord, so maybe
> international versions of Excel are coded differently? 
> Anyway, I guess we should just get these changes in, for now, and see
> what comes up next. 
> How long does it take to get the changes in?

Toshiaki Kamoshida

