pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Chinese document: mangled characters, ASCII block code points off by 1
Date Thu, 03 Aug 2017 16:40:54 GMT
On 02.08.2017 at 00:16, Zubiri, Tomas wrote:
> Good afternoon,
> http://www.filedropper.com/1308134649
> The document linked above isn’t being read correctly by PDFBox.
> Characters in the ASCII block appear to be off by 1, for example, 
> numbers appear to be one value higher.
> Should I upload this as a bug in JIRA?

Although you didn't answer, I was able to guess what you're trying to
tell us.

1) You are using a 1.8.* version. Its rendering is poor: it can't
render the Chinese glyphs at all, and the numbers are off by one. Use
2.0.7.
2) Version 2.0.7 renders the numbers correctly. (In 1.8.*, the internal
code is indeed off by one. This is caused by an oddity in the file
combined with a bug in 1.8.*; the PDF itself is not broken.) The Chinese
glyphs do look Chinese, but in poor quality. This is a known, unsolved
problem, described here:


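To confirm that extracted text really shows the uniform off-by-one
symptom rather than random corruption, a small check along these lines
might help. It is only a sketch: ShiftCheck and uniformShift are
hypothetical names, not part of PDFBox, and it assumes you already have
a string you know appears in the document (a date, say) together with
the text your PDFBox version produced for it.

```java
import java.util.OptionalInt;

/** Hypothetical diagnostic helper; not part of PDFBox. */
final class ShiftCheck {

    /**
     * Returns the constant code point offset between the two strings,
     * or empty if they are empty, differ in code point length, or the
     * offset is not the same at every position.
     */
    static OptionalInt uniformShift(String expected, String extracted) {
        int[] a = expected.codePoints().toArray();
        int[] b = extracted.codePoints().toArray();
        if (a.length == 0 || a.length != b.length) {
            return OptionalInt.empty();
        }
        int shift = b[0] - a[0];
        for (int i = 1; i < a.length; i++) {
            if (b[i] - a[i] != shift) {
                return OptionalInt.empty();
            }
        }
        return OptionalInt.of(shift);
    }

    public static void main(String[] args) {
        // "2017" with every digit one code point too high reads "3128".
        System.out.println(uniformShift("2017", "3128")); // prints OptionalInt[1]
        System.out.println(uniformShift("2017", "2017")); // prints OptionalInt[0]
    }
}
```

A result of 1 on the 1.8.* output and 0 on the 2.0.7 output for the same
string would match the behaviour described above.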