pdfbox-users mailing list archives

From "YE ..." <stephe...@hotmail.com>
Subject Re: PDFbox unable to render Chinese font correctly when converting pdf to images
Date Sat, 19 Aug 2017 12:28:54 GMT
Hi Tilman,
I am running the conversion on CentOS, which doesn't have those two fonts installed. I have
installed Google's CJK fonts, and in most cases PDFBox automatically chooses the right
ones for rendering Chinese characters. I will find a way to install ArialUnicodeMS and MicrosoftYaHei
on CentOS to see if that works.
Many thanks,
Fangqiao
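
A quick way to confirm whether the two fallback fonts are present on a machine is to ask the JVM which fonts it can see. The following is only a minimal sketch using plain JDK classes. The class name `FontCheck`, its helper methods, and the sample string are made up for illustration. It also assumes that the AWT font list is a reasonable proxy: PDFBox scans the system font directories on its own, so the two views can differ.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FontCheck {

    // Strip whitespace and case so "Arial Unicode MS" matches "ArialUnicodeMS".
    static String normalize(String name) {
        return name.replaceAll("\\s", "").toLowerCase(Locale.ROOT);
    }

    // Returns the wanted font names that do not appear among the installed names.
    static List<String> missing(List<String> wanted, Iterable<String> installed) {
        Set<String> have = new HashSet<>();
        for (String name : installed) {
            have.add(normalize(name));
        }
        List<String> result = new ArrayList<>();
        for (String name : wanted) {
            if (!have.contains(normalize(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Servers usually have no display, so run AWT in headless mode.
        System.setProperty("java.awt.headless", "true");

        // Two Chinese characters (U+53D1 U+7968) used as a hypothetical sample.
        String sample = "\u53d1\u7968";

        List<String> installed = new ArrayList<>();
        int cjkCapable = 0;
        for (Font f : GraphicsEnvironment.getLocalGraphicsEnvironment().getAllFonts()) {
            installed.add(f.getPSName());
            installed.add(f.getFamily());
            // canDisplayUpTo returns -1 when every character has a glyph.
            if (f.canDisplayUpTo(sample) == -1) {
                cjkCapable++;
            }
        }

        List<String> wanted = Arrays.asList("ArialUnicodeMS", "MicrosoftYaHei");
        System.out.println("Not visible to the JVM: " + missing(wanted, installed));
        System.out.println("Fonts able to display the sample: " + cjkCapable);
    }
}
```

If both names are reported as missing while some fonts can still display the sample, that would match the situation described above: CJK fonts are installed, but not the two that were picked as fallbacks on the other machine.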

Sent from my iPhone

> On Aug 19, 2017, at 5:50 PM, Tilman Hausherr <THausherr@t-online.de> wrote:
> 
> Hello Fangqiao,
> 
> I am able to render that file with PDFBox 2.0.7. You can see it at
> http://imgur.com/a/UOfRl
> 
> In the log I get this:
>  Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeKaitiStd-Regular
>  Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeKaitiStd-Regular
>  Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeKaitiStd-Regular
>  Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeKaitiStd-Regular
>  Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeKaitiStd-Regular
>  Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeSongStd-Light
>  Warning  [PDCIDFontType0] Using fallback MicrosoftYaHei for CID-keyed font STSong-Light
> 
> Your invoice does not have its fonts embedded. The messages indicate that PDFBox has
> chosen ArialUnicodeMS and MicrosoftYaHei as fallback fonts for rendering.
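
The warnings quoted above can be condensed into a table of "font named in the PDF" to "substitute actually used", which makes it easier to compare logs from two machines. This is a minimal sketch with plain JDK classes. The class name `FallbackLogSummary` is hypothetical, and the pattern assumes the message wording stays exactly as in the lines quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FallbackLogSummary {

    // Assumed message shape, copied from the quoted log lines.
    private static final Pattern FALLBACK =
            Pattern.compile("Using fallback (\\S+) for CID-keyed font (\\S+)");

    // Maps each font requested by the PDF to the fallback reported in the log.
    // Repeated warnings for the same font collapse into one entry.
    static Map<String, String> parse(String log) {
        Map<String, String> substitutes = new LinkedHashMap<>();
        Matcher m = FALLBACK.matcher(log);
        while (m.find()) {
            substitutes.put(m.group(2), m.group(1));
        }
        return substitutes;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
                " Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeKaitiStd-Regular",
                " Warning  [PDCIDFontType0] Using fallback ArialUnicodeMS for CID-keyed font AdobeSongStd-Light",
                " Warning  [PDCIDFontType0] Using fallback MicrosoftYaHei for CID-keyed font STSong-Light");

        parse(log).forEach((requested, used) ->
                System.out.println(requested + " -> " + used));
    }
}
```

Running the same summary on the CentOS log would show which substitutes were picked there for the three non-embedded fonts.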
> 
> Either you don't have these fonts installed, or maybe you used an older PDFBox version?
> 
> Tilman
> 
> 
>> On 19.08.2017 at 09:19, YE ... wrote:
>> Hi Tilman,
>> 
>> 
>> Thanks for the quick reply. I will look into the commercial solutions with font hinting
>> that you mentioned.
>> 
>> 
>> I have also included the links to the attachments mentioned in my previous email
>> in case you want to take a closer look.
>> 
>> 
>> PDF file:
>> 
>> 
>> https://shujubiji.cn/uppv/bjPhoto/d5418a966daecda62bcf056ddc1e79c99a4c6546/1503126757317/chinese_invoice.pdf?zid=__itemtoken__fe4bd85ea612752d11b4fdb02ff43c8871f9bc1c
>> 
>> 
>> You need to download it and then open it in a PDF reader so that the Chinese characters
>> are shown correctly.
>> 
>> 
>> Converted image:
>> 
>> https://shujubiji.cn/uppv/bjPhoto/d5418a966daecda62bcf056ddc1e79c99a4c6546/1503126770469/chinese_invoice_1.jpg?zid=__itemtoken__6de80b84edb217bb37a19218bcec1eb78a24bcfc
>> 
>> 
>> Screenshot of the original PDF displayed correctly in a PDF reader:
>> 
>> https://shujubiji.cn/uppv/bjPhoto/d5418a966daecda62bcf056ddc1e79c99a4c6546/1503126791380/screenshot_from_2017_08_17_17_12_03.png?zid=__itemtoken__a7c4267ccd9597273a7d0286cb3353b1541a9ee1
>> 
>> 
>> Best regards,
>> 
>> Fangqiao
>> 
>> ________________________________
>> From: Tilman Hausherr <THausherr@t-online.de>
>> Sent: Friday, August 18, 2017 4:18 PM
>> To: users@pdfbox.apache.org
>> Subject: Re: PDFbox unable to render Chinese font correctly when converting pdf to images
>> 
>> Hello Fangqiao,
>> 
>> Your files didn't get through; you must upload them to a sharehoster.
>> But I suspect that this is a known problem with Chinese fonts. The cause
>> is explained here:
>> https://issues.apache.org/jira/browse/PDFBOX-3293
>> [PDFBOX-3293] Chinese font glyphs with overlapping paths ...
>> Font glyphs with overlapping paths may be rendered incorrectly, especially when
>> the font size is small. Sadly, the Traditional Chinese edition of Windows bundled ...
>> 
>> 
>> 
>> 
>> How to fix it: by implementing font hinting, which we haven't done.
>> There is no workaround, sadly (except, of course, using better fonts when
>> creating the PDF).
>> 
>> There are some commercial Java products (google for them). At least two
>> of them have implemented font hinting (I don't know about the others).
>> 
>> Sorry for not having better news.
>> 
>> Tilman
>> 
>> 
>>> On 18.08.2017 at 11:56, YE ... wrote:
>>> Hi,
>>> 
>>> I am from China and using PDFBox to convert pdf files to images. It
>>> worked excellently in most cases. Thanks a lot for the team's great work.
>>> 
>>> 
>>> However recently I used it to convert some invoices in PDF to images
>>> and then some Chinese characters weren't converted correctly. Attached
>>> is a sample PDF file, converted image and a screen shot of the
>>> original PDF opened in PDF reader, which displayed all Chinese correctly.
>>> 
>>> 
>>> I am seeking help from the community:
>>> 
>>> 
>>> - what's the possible cause for the problem?
>>> 
>>> 
>>> My guess is that the font for some Chinese characters wasn't set
>>> correctly in the original PDF file.
>>> 
>>> 
>>> - how to fix it?
>>> 
>>> 
>>> If the above guess is correct, is there a way to detect the correct font
>>> type and set the correct font for conversion?
>>> 
>>> 
>>> - or is there another solution that can fix the problem?
>>> 
>>> 
>>> Many thanks,
>>> 
>>> Fangqiao
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>> 
>> 
> 
> 