poi-dev mailing list archives

From bugzi...@apache.org
Subject [Bug 54084] Some Unicode chars(e.g chinese chars) are not written corectly in xlsx file.
Date Sun, 30 Jun 2013 22:51:59 GMT
https://issues.apache.org/bugzilla/show_bug.cgi?id=54084

Dominik Stadler <dominik.stadler@gmx.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW

--- Comment #9 from Dominik Stadler <dominik.stadler@gmx.at> ---
I worked on reproducing the reported problems with Greek characters. The
problem seems to occur when the shared strings are loaded from the XLSX file.
The XML file itself is encoded correctly (UTF-8 byte sequences, e.g. as listed
at http://www.fileformat.info/info/unicode/char/1d74a/index.htm), and the
characters display correctly in OpenOffice and when the file is opened in a
text editor.
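
For reference, the expected encoding of the character linked above can be
checked with plain Java. This is only a minimal sketch, not part of POI: the
class and method names are hypothetical, and it assumes U+1D74A is a
representative example of the affected characters. It shows that the code
point lies outside the Basic Multilingual Plane, so it takes four bytes in
UTF-8 and a surrogate pair (two chars) in a Java String.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI: prints how one code point is encoded.
public class SupplementaryCharCheck {

    /** Returns the UTF-8 bytes of a single code point as space-separated hex. */
    public static String utf8Hex(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Returns the UTF-16 code units (Java chars) of a code point as hex. */
    public static String utf16Hex(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char c : Character.toChars(codePoint)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int cp = 0x1D74A; // the code point from the URL above
        System.out.println("UTF-8:  " + utf8Hex(cp));  // F0 9D 9D 8A
        System.out.println("UTF-16: " + utf16Hex(cp)); // D835 DF4A
    }
}
```

If the bytes in the sharedStrings XML match the UTF-8 sequence printed here,
the file on disk is fine and the corruption must be introduced later, while
the data is read or written.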

The initial loading of the Workbook using XSSF also works, and the cell
contains the expected data. However, after writing the data out and reading it
back in, the content no longer matches.
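
A mismatch like this is easier to pin down when the two strings are compared
by code point rather than by char, because a damaged surrogate pair then shows
up as a single differing position. The following is a rough sketch under that
assumption; the class and its methods are hypothetical helpers, not POI API,
and the "actual" value in main() is made up for illustration only.

```java
// Hypothetical diagnostic helper, not part of POI.
public class CodePointDiff {

    /**
     * Returns the index (counted in code points) of the first difference
     * between the two strings, or -1 if they are identical.
     */
    public static int firstMismatch(String expected, String actual) {
        int[] e = expected.codePoints().toArray();
        int[] a = actual.codePoints().toArray();
        int n = Math.min(e.length, a.length);
        for (int i = 0; i < n; i++) {
            if (e[i] != a[i]) return i;
        }
        // Same prefix: equal only if the lengths are equal as well.
        return e.length == a.length ? -1 : n;
    }

    /** Renders a string as a list of U+XXXX code points for log output. */
    public static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int cp : s.codePoints().toArray()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String expected = "ab" + new String(Character.toChars(0x1D74A));
        String actual = "ab??"; // made-up example of a corrupted read-back value
        System.out.println("first mismatch at code point " + firstMismatch(expected, actual));
        System.out.println("expected: " + describe(expected));
        System.out.println("actual:   " + describe(actual));
    }
}
```

In a round-trip test, expected would be the cell value after the first load
and actual the value after writing the workbook out and reading it back in.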

As far as I can see, the shared strings are read incorrectly, which in turn
breaks writing the data back out.

I could debug the code as far as the point where xmlbeans handles the string,
and there it still seems to be fine. As soon as SstDocumentImpl takes over, it
seems to become corrupted. Debugging there is currently not possible for me,
however, because the .class files are stripped... :(

For now I have added a test case, testBug54084Unicode(), to the special test
class TestUnfixedBugs.java. It verifies the problem; no fix is available
yet...

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org

