poi-user mailing list archives

From "Christian Gosch" <c.go...@inovex.de>
Subject Re: [poi] Problem with encoding
Date Thu, 10 Nov 2005 06:58:22 GMT
However, there are other Windows localizations around the world which don't
use Cp1252. To take the extreme cases, look at the Asian versions supporting
Traditional Chinese, Japanese or the like. Russian, Korean and Hebrew versions
are sold as well, and they all use completely different charsets. The
interesting thing is that I can view and edit files created there on other
localizations! So there *must* be *some* way to (a) encode the text and
(b) tell what the actual encoding is.
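
To illustrate why the reader has to know which encoding was used, here is a
minimal Java sketch. It is not POI or Excel code; the class and method names
are made up for this example, and it assumes the JDK in use ships the
windows-1252 and windows-1251 charsets. It encodes one string under a Western
code page, then reads the same bytes back under a Cyrillic one, and also shows
the UTF-16LE form, which needs no code page at all.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of POI.
class CodePageDemo {

    // Encode the text in the given charset and return the bytes as hex.
    public static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    // Write the text under one code page, then decode the bytes under another.
    public static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // Assumption: both charsets are available in this JDK.
        Charset cp1252 = Charset.forName("windows-1252");
        Charset cp1251 = Charset.forName("windows-1251");

        String name = "K\u00F6rner"; // "Körner"

        // One byte per character under the Western code page.
        System.out.println(hex(name, cp1252));
        // Two bytes per character, independent of any code page.
        System.out.println(hex(name, StandardCharsets.UTF_16LE));
        // Same bytes, wrong code page: byte 0xF6 becomes a Cyrillic letter.
        System.out.println(reinterpret(name, cp1252, cp1251));
    }
}
```

The bytes alone do not say which code page produced them, so a file format
has to either record the code page somewhere or store text in a Unicode form.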

Look at the implementation of JXL. It is not my work, but we used it a lot
before POI. The first versions had no notion of supporting non-US
characters, but after some weeks of discussion the developer of JXL
got the message and *did* implement support for different encodings. Since
that product is open source, it should be possible to see how it was done.
Look for JExcelApi.

Please don't let all of us non-US developers in 'good old Europe' and
elsewhere starve for lack of custom encoding support...

Christian Gosch
inovex GmbH

On Wednesday, November 09, 2005 11:07 PM [GMT+1=CET],
acoliver@apache.org <acoliver@apache.org> wrote:

> Excel wants cp1252 for most things...  It just does.  When I get home
> (I'm on the road) I'll look at the dev kit...it may be that by
> changing the codepage record we can handle things a bit nicer, but
> it's kinda picky about that and regardless of what AIX may support,
> when you open the Excel sheet it will be on Windows generally (or a
> semi-emulation of it on Mac/Linux) and you'll have to write it in an
> encoding supported by Excel for Windows...
> -Andy
> Rainer Klute wrote:
>> On Wednesday, 09.11.2005, at 07:25 -0500, acoliver@apache.org wrote:
>>> We should be universally handling the issues mentioned here:
>>> http://en.wikipedia.org/wiki/Windows-1252 by intercepting character
>>> differences and writing them out properly.  Thus HSSF should force
>>> 8859-1 encoding but should then kind of do a replace on the
>>> characters. If someone wants to contribute I can point them in the
>>> right direction.
>> Um, no. Enforcing ISO 8859-1 as the character encoding would be of
>> limited use only. The reason is that, like Windows Codepage 1252, it
>> represents only a limited set of characters. UTF-8 is the preferred
>> character encoding. However, POI should not forbid creating strings
>> in other character encodings, be it ISO 8859-1, cp1252 or whatever.
>> By the way, HPSF does a nice job of supporting a lot of different
>> character encodings. At least there are no problems I am aware of. I
>> suggest you have a look at it.
>> Best regards
>> Rainer Klute
>> Rainer Klute IT-Consulting GmbH
>> Dipl.-Inform. Rainer Klute   E-Mail: klute@rainer-klute.de
>> Körner Grund 24              Phone:  +49 172 2324824
>> D-44143 Dortmund             Fax:    +49 231 5349423
>> Public key fingerprint: E4E4386515EE0BED5C162FBB5343461584B5A42E
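
The point in the quoted exchange about ISO 8859-1 and cp1252 each covering
only a limited set of characters can be checked with plain JDK calls. The
sketch below is illustrative only: the class name and the pick() helper are
hypothetical, it does not describe what HSSF does, and it assumes the JDK
provides the windows-1252 charset. It tests whether a string fits a charset
and falls back to UTF-16LE when it does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of POI.
class Latin1VsCp1252 {

    // Assumption: windows-1252 is available in this JDK.
    static final Charset CP1252 = Charset.forName("windows-1252");

    // True if every character of the text can be encoded in the charset.
    public static boolean fits(String text, Charset cs) {
        return cs.newEncoder().canEncode(text);
    }

    // Hypothetical policy: use cp1252 when it is lossless, else UTF-16LE.
    public static Charset pick(String text) {
        return fits(text, CP1252) ? CP1252 : StandardCharsets.UTF_16LE;
    }

    public static void main(String[] args) {
        String euro = "\u20AC";    // euro sign
        String chinese = "\u4E2D"; // a CJK character

        // The euro sign exists in cp1252 but not in ISO 8859-1.
        System.out.println(fits(euro, CP1252));
        System.out.println(fits(euro, StandardCharsets.ISO_8859_1));

        // Neither 8-bit charset can hold the CJK character.
        System.out.println(fits(chinese, CP1252));
        System.out.println(pick(chinese));
    }
}
```

Checking with canEncode before writing avoids the silent substitution that
String.getBytes performs when a character has no mapping in the target
charset.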

Dipl.-Inform. Christian Gosch
Systems Development
inovex GmbH
Karlsruher Strasse 71
D-75179 Pforzheim
Tel.: +49 (0)72 31 - 31 91 - 85
Fax: +49 (0)72 31 - 31 91 - 91

To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/