poi-user mailing list archives

From Nick Burch <n...@torchbox.com>
Subject Re: Query regarding encoding used internally by apache POI libraries
Date Wed, 09 Sep 2009 09:44:40 GMT
On Wed, 9 Sep 2009, Som Satpathy wrote:
> "The microsoft file formats generally store text as either US-ASCII or
> UCS-2. The type of the record/block/etc tells you which it is, so we can
> turn that into java (unicode) strings"
> Thanks for the input Nick. But one thing is still not clear, can I 
> encode the text as UTF_8?

No. By the time you pass a string to POI, it needs to be a valid Java 
(Unicode) string, so do any UTF-8 decoding when you read the text in. 
Data you get out of POI also comes back as a Java Unicode string; it's 
up to you to encode that as UTF-8 if that's what you want to output. 
I'd suggest you go read a tutorial on working 
with unicode and native character sets in java to clear up your 

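As a rough illustration of the decode-on-input, encode-on-output split 
described above, here is a minimal sketch using only the standard 
library. The class name Utf8RoundTrip and its two helper methods are 
made up for this example and are not part of POI; the POI call is only 
mentioned in a comment to show where the decoded string would go.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of POI.
class Utf8RoundTrip {
    // Decode UTF-8 bytes (from a file, socket, etc.) into a Java String.
    public static String decodeUtf8(byte[] input) {
        return new String(input, StandardCharsets.UTF_8);
    }

    // Encode a Java String (for example, one read back from POI) as UTF-8 bytes.
    public static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "café" as UTF-8: the é takes two bytes (0xC3 0xA9).
        byte[] raw = {0x63, 0x61, 0x66, (byte) 0xC3, (byte) 0xA9};
        String text = decodeUtf8(raw);
        // `text` is now an ordinary Java string, which is what you would
        // hand to POI, e.g. cell.setCellValue(text).
        byte[] out = encodeUtf8(text);
        System.out.println(text.length() + " chars, " + out.length + " bytes");
        // prints "4 chars, 5 bytes"
    }
}
```

The point is that the charset only matters at the edges of your program: 
name it explicitly when bytes come in and when bytes go out, and let 
everything in between (including POI) work with plain Java strings.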
