poi-dev mailing list archives

From acoli...@apache.org
Subject Re: [poi] Problem with encoding
Date Wed, 09 Nov 2005 12:25:31 GMT
We should handle the issues described at 
http://en.wikipedia.org/wiki/Windows-1252 universally, by intercepting the 
characters that differ between Windows-1252 and ISO-8859-1 and writing them 
out properly.  So HSSF should force 8859-1 encoding and then substitute the 
characters that differ.  If someone wants to contribute, I can point them in 
the right direction.
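
One possible reading of the "replace" idea, sketched with the plain JDK rather than with HSSF: text that was decoded as ISO-8859-1 but really came from Windows-1252 ends up with C1 control characters (U+0080 to U+009F) where the Windows-1252 printable characters should be, and those can be mapped back. The class and method names below are hypothetical and not part of POI; this only illustrates the substitution step under that assumption.

```java
import java.nio.charset.Charset;

public class Cp1252Fixup {
    // windows-1252 is looked up by name; it is assumed to be available in the running JDK.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /**
     * Replaces C1 control characters (U+0080..U+009F) with the character
     * that Windows-1252 assigns to the same byte value. Bytes that
     * Windows-1252 leaves undefined are kept unchanged.
     */
    public static String fix(String latin1Decoded) {
        StringBuilder out = new StringBuilder(latin1Decoded.length());
        for (int i = 0; i < latin1Decoded.length(); i++) {
            char c = latin1Decoded.charAt(i);
            if (c >= 0x80 && c <= 0x9F) {
                // Decode the single byte as Windows-1252.
                char mapped = new String(new byte[] {(byte) c}, CP1252).charAt(0);
                // U+FFFD means the byte has no Windows-1252 mapping: keep the original.
                out.append(mapped == '\uFFFD' ? c : mapped);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Byte 0x9A is "s with caron" in Windows-1252 but a control character in ISO-8859-1.
        String broken = "Ale\u009A";
        System.out.println(fix(broken));
    }
}
```

Characters outside the 0x80 to 0x9F range are identical in both encodings, so they pass through untouched.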

-andy

Christian Gosch wrote:
> Hi,
> 
> that would be of particular interest for me, too.
> 
> We have some international names in our application, although it runs in an
> ISO-Latin-1 (ISO-8859-1) [db, appserver] / Cp1252 [client] environment with
> the de_DE locale by default.
> 
> We have several areas of "visibility" like DB (VarChar fields), Java source
> files, appserver console, JSP source / rendering / display, PDF and XLS
> download.
> 
> We currently use the latest POI final release (should be 2.5.1?), and I do not
> remember any way of setting the encoding for String values in a sheet. Since
> the XLS file format is a kind of "hybrid", mixing binary structure / control
> data with textual content data, it is crucial to write all textual "content"
> with the appropriate encoding -- and that encoding should be configurable.
> 
> Testing some examples I found that
> - the vast majority of characters found in our data are displayed as they
> should be, in JSP and XLS (by POI).
> - the Czech "s with v on top" is displayed well in JSPs, but not in
> POI-generated XLS: there it shows up as a "little rectangle".
> I know that in ISO-8859-1 there are problems with the Danish "o with slash"
> as well, but currently I have no test data. I would also expect problems with
> Turkish letters like "i without dot" or "c with bottom accent", as in the
> city name "Incirlik", when written correctly.
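
Whether a given character fits into ISO-8859-1 or Windows-1252 can be checked directly with the JDK, without any test data. The sketch below (class and method names are hypothetical) does this for the four letters mentioned above. By the published code tables, "s with caron" (U+0161) is in Windows-1252 but not in ISO-8859-1, "o with slash" (U+00F8) and "c with cedilla" (U+00E7) are in both, and "dotless i" (U+0131) is in neither.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodableCheck {
    // windows-1252 is looked up by name; it is assumed to be available in the running JDK.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** True if every character of the text can be encoded in ISO-8859-1. */
    public static boolean fitsLatin1(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    /** True if every character of the text can be encoded in Windows-1252. */
    public static boolean fitsCp1252(String text) {
        return CP1252.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // s with caron, o with slash, dotless i, c with cedilla
        String[] samples = {"\u0161", "\u00F8", "\u0131", "\u00E7"};
        for (String s : samples) {
            System.out.printf("U+%04X latin1=%b cp1252=%b%n",
                    (int) s.charAt(0), fitsLatin1(s), fitsCp1252(s));
        }
    }
}
```

A character that fails both checks cannot be stored in an 8-bit string of either encoding and needs a Unicode representation instead.
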
> 
> btw:
> In JXL (JExcelAPI) it is possible to set up an encoding for a generated XLS
> file, which by default is "the default encoding of the hosting VM", but it
> took a while to make that happen.
> 
> 
> Regards
> Christian Gosch
> inovex GmbH
> 
> 
> 
> On Tuesday, November 08, 2005 11:59 PM [GMT+1=CET],
> Olivier Matt <init64@kodee.org> wrote:
> 
> 
>>Hello,
>>
>>I'm reading excel files and I get from a CELL_TYPE_STRING cell a
>>String.
>>
>>That string has some problems with accents (I guess the file is
>>encoded using
>>some latin-characters encoding), they are not seen properly.
>>
>>How can I avoid this behavior? Can I specify the encoding of the cells
>>somewhere?
>>Or is there a method for transforming misinterpreted strings into proper
>>Latin strings?
>>
>>
>>Thanks for help,
>>
>>Olivier
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
>>Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
>>The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/
> 
> 


-- 
Andrew C. Oliver
SuperLink Software, Inc.

Java to Excel using POI
http://www.superlinksoftware.com/services/poi
Commercial support including features added/implemented, bugs fixed.


---------------------------------------------------------------------
To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/

