pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: Width information for rendered glyphs is inconsistent
Date Sat, 22 Nov 2014 18:07:58 GMT
Hi Vadimo,

> I also found that the PDF does not have all the information in the font descriptor or
> in the font. At least not what I would expect to see.
> In particular FirstChar, LastChar, FontWeight.

The information available depends on the “Subtype” entry. FirstChar and LastChar are for
simple fonts (Type1, TrueType, and Type3) only. You have a CIDFontType2, so you need to look
at “Table 117 – Entries in a CIDFont dictionary” on p269 of the ISO 32000 spec.

> - How is the value 581.055 calculated?

The font file itself is also embedded in the PDF and contains the widths (i.e. the entries
in “Widths” or “W” are redundant and are provided so that programs consuming PDF don’t
have to be able to parse TTF fonts). So 581.055 comes from the embedded font's own metrics,
not from the “W” array.
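
As a rough sketch of the arithmetic (plain Java, not PDFBox API): a TrueType font stores
advance widths in design units, and PDF glyph space uses 1/1000 of an em, so the width is
scaled by 1000 / unitsPerEm. The advance of 1190 and the unitsPerEm of 2048 below are
assumptions, chosen only because they reproduce 581.055; I have not read them from your
file. The class name, method names, and the tolerance of 1 unit are likewise made up for
illustration.

```java
// Sketch only: the input numbers are assumed for illustration, not read from the sample PDF.
final class GlyphWidthCheck {

    /** Scales a TrueType advance width (font design units) to PDF glyph space (1/1000 em). */
    static double toGlyphSpace(int advanceWidth, int unitsPerEm) {
        return advanceWidth * 1000.0 / unitsPerEm;
    }

    /** True when the width declared in the PDF is within the tolerance of the font's own width. */
    static boolean isConsistent(double declared, double fromFontProgram, double tolerance) {
        return Math.abs(declared - fromFontProgram) <= tolerance;
    }

    public static void main(String[] args) {
        int advanceWidth = 1190; // assumed advance width from the 'hmtx' table
        int unitsPerEm = 2048;   // assumed unitsPerEm from the 'head' table
        double fromFont = toGlyphSpace(advanceWidth, unitsPerEm);

        System.out.printf(java.util.Locale.ROOT, "%.3f%n", fromFont); // prints 581.055
        System.out.println(isConsistent(576, fromFont, 1.0));         // prints false
    }
}
```

With these assumed inputs the declared width of 576 is about 5 units away from the font's
own width, which is the kind of mismatch preflight reports.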

> - Where is the section in the PDF Spec describing widths?

It depends on the font format. For your font, look at “9.7.4.3 Glyph Metrics in CIDFonts”.
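
To give an idea of what that section describes, here is a minimal sketch (plain Java, no
PDFBox) that expands a “W” array into a CID-to-width lookup. The array mixes two forms:
“c [w1 w2 ...]” assigns consecutive widths starting at CID c, and “cFirst cLast w” assigns
one width to a whole range. CIDs not covered fall back to DW, whose default is 1000. The
class and method names are hypothetical, the first four widths are taken from your dump,
and the “10 12 500” range is invented to show the second form.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: models the W array as nested Java lists instead of COSArray objects.
final class CidWidths {

    /** Expands a W array into a CID -> width map, handling both entry forms. */
    static Map<Integer, Double> expand(List<?> w) {
        Map<Integer, Double> widths = new LinkedHashMap<>();
        int i = 0;
        while (i < w.size()) {
            int first = ((Number) w.get(i)).intValue();
            Object next = w.get(i + 1);
            if (next instanceof List) {
                // Form "c [w1 w2 ... wn]": consecutive CIDs starting at c.
                List<?> run = (List<?>) next;
                for (int k = 0; k < run.size(); k++) {
                    widths.put(first + k, ((Number) run.get(k)).doubleValue());
                }
                i += 2;
            } else {
                // Form "cFirst cLast w": one width for the whole CID range.
                int last = ((Number) next).intValue();
                double width = ((Number) w.get(i + 2)).doubleValue();
                for (int cid = first; cid <= last; cid++) {
                    widths.put(cid, width);
                }
                i += 3;
            }
        }
        return widths;
    }

    /** Width for a CID, falling back to the DW value when W has no entry for it. */
    static double widthOf(Map<Integer, Double> widths, int cid, double defaultWidth) {
        return widths.getOrDefault(cid, defaultWidth);
    }

    public static void main(String[] args) {
        List<Object> w = Arrays.asList(0, Arrays.asList(684, 660, 567, 439), 10, 12, 500);
        Map<Integer, Double> widths = expand(w);
        System.out.println(widthOf(widths, 2, 1000));  // CID 2, from the list form
        System.out.println(widthOf(widths, 11, 1000)); // CID 11, from the range form
        System.out.println(widthOf(widths, 99, 1000)); // not listed, falls back to DW
    }
}
```

These are the values that have to agree with the widths stored in the embedded font file.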

> - Is the spec freely available somewhere?

Yes, here: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf

-- John

> On 22 Nov 2014, at 05:13, Vadim Bauer <bauer.vadim@gmail.com> wrote:
> 
> Thank you John, 
> 
> You are right: when looking into the descendant font I found more information. I have now
> worked a bit more with preflight and found the mismatched widths.
> Preflight says, e.g., that the current width is 576 but should be 581.055.
> 
> I also found that the PDF does not have all the information in the font descriptor or
> in the font. At least not what I would expect to see.
> In particular FirstChar, LastChar, FontWeight.
> 
> My hopefully last questions are: 
> 
> - How is the value 581.055 calculated?
> - Where is the section in the PDF Spec describing widths?
> - Is the spec freely available somewhere?
> 
> Here is a sample file I created from a website to show the error output:
> https://www.dropbox.com/s/vyh2ediukfjjcub/sample.pdf?dl=0
> 
> COSDictionary{(COSName{Type}:COSName{Font}) (COSName{Subtype}:COSName{CIDFontType2})
(COSName{BaseFont}:COSName{SourceSansPro-Bold}) (COSName{CIDSystemInfo}:COSDictionary{(COSName{Registry}:COSString{Adobe})
(COSName{Ordering}:COSString{Identity}) (COSName{Supplement}:COSInt{0}) }) (COSName{FontDescriptor}:COSDictionary{(COSName{Type}:COSName{FontDescriptor})
(COSName{FontName}:COSName{QYAAAA+SourceSansPro-Bold}) (COSName{Flags}:COSInt{4}) (COSName{FontBBox}:COSArray{[COSInt{-231},
COSInt{-383}, COSInt{1223}, COSInt{974}]}) (COSName{ItalicAngle}:COSInt{0}) (COSName{Ascent}:COSInt{984})
(COSName{Descent}:COSInt{-273}) (COSName{CapHeight}:COSInt{984}) (COSName{StemV}:COSInt{50})
(COSName{FontFile2}:COSDictionary{(COSName{Length1}:COSInt{20260}) (COSName{Length}:COSInt{8663})
(COSName{Filter}:COSName{FlateDecode}) }) }) (COSName{CIDToGIDMap}:COSName{Identity}) (COSName{W}:COSArray{[COSInt{0},
COSArray{[COSInt{684}, COSInt{660}, COSInt{567}, COSInt{439}, COSInt{514}, COSInt{395}, COSInt{206},
COSInt{569}, COSInt{380}, COSInt{297}, COSInt{660}, COSInt{274}, COSInt{543}, COSInt{284},
COSInt{569}, COSInt{463}, COSInt{566}, COSInt{563}, COSInt{530}, COSInt{756}, COSInt{543},
COSInt{456}, COSInt{569}, COSInt{552}, COSInt{850}, COSInt{600}, COSInt{530}, COSInt{611},
COSInt{850}, COSInt{551}, COSInt{524}, COSInt{524}, COSInt{524}, COSInt{524}]}]}) }
> 
> 
> COSDictionary{(COSName{Type}:COSName{FontDescriptor}) (COSName{FontName}:COSName{QYAAAA+SourceSansPro-Bold})
(COSName{Flags}:COSInt{4}) (COSName{FontBBox}:COSArray{[COSInt{-231}, COSInt{-383}, COSInt{1223},
COSInt{974}]}) (COSName{ItalicAngle}:COSInt{0}) (COSName{Ascent}:COSInt{984}) (COSName{Descent}:COSInt{-273})
(COSName{CapHeight}:COSInt{984}) (COSName{StemV}:COSInt{50}) (COSName{FontFile2}:COSDictionary{(COSName{Length1}:COSInt{20260})
(COSName{Length}:COSInt{8663}) (COSName{Filter}:COSName{FlateDecode}) }) }
> 
> Regards,
> Vadimo
> 
>> Am 22.11.2014 um 01:21 schrieb John Hewson <john@jahewson.com>:
>> 
>> Vadimo,
>> 
>> Type 0 fonts are special: they are wrappers around a child font, which can be found in
>> the first element of the DescendantFonts array. Type 0 fonts don’t have a FontDescriptor;
>> instead it can be found in the child font. Remember, the error message you have is not
>> “Missing FontDescriptor”, nor is it “Missing width”, so you should expect to find both the
>> FontDescriptor and Widths present, but just with incorrect values.
>> 
>> Also note that some kinds of font store widths in the “W” entry, instead of “Widths”.
>> 
>> -- John
>> 
>>> On 20 Nov 2014, at 14:42, Vadim Bauer <bauer.vadim@gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> I looked at the areas suggested by John and found that the font didn't have a
>>> font descriptor.
>>> So after creating and setting a font descriptor, I copied the widths from a
>>> loaded TTF file with the same name.
>>> 
>>> List<PDPage> allPages = doc.getDocumentCatalog().getAllPages();
>>> for (PDPage page : allPages) {
>>> PDResources pageResources = page.findResources();
>>> Map<String, PDFont> fonts = pageResources.getFonts();
>>> for (PDFont font : fonts.values()) {
>>>    assert font.getFontDescriptor() == null; // font descriptor is null
>>>    PDFontDescriptorDictionary fdDictionary = new PDFontDescriptorDictionary();
>>>    font.setFontDescriptor(fdDictionary);
>>>    List<Float> widths = font.getWidths(); // is null
>>> 
>>>    //loading same font and apply widths
>>>    InputStream isNimbus = getClass().getResourceAsStream("/NimbusSanL-Regu.ttf");
>>>    PDTrueTypeFont ttf = PDTrueTypeFont.loadTTF(doc, isNimbus);
>>>    List<Float> newWidths = ttf.getWidths(); // [278.0, 278.0, 355.0, 556.0, 556.0, 889.0, 667.0, 191.0, 333.0, 333.0, 389.0, 584.0,
>>>    font.setWidths(newWidths);
>>> }
>>> }
>>> 
>>> When I opened the PDF, Acrobat complained that the font NimbusSanL-Regu could
>>> not be loaded. All the characters were shown as dots.
>>> 
>>> Then I replaced the font in pageResources with the loaded TTF, added under
>>> the same key. But that didn't work either.
>>> 
>>> Any ideas how I can recalculate the widths from the given font? Was it a problem
>>> that the font in the PDF is marked as Type0 and the font with the same name is of Type3?

>>> 
>>> This is the cos object of the given Font.
>>> 
>>> Font COS object.
>>> COSDictionary{(COSName{Type}:COSName{Font}) (COSName{Subtype}:COSName{Type0})
(COSName{BaseFont}:COSName{NimbusSanL-Regu}) (COSName{Encoding}:COSName{Identity-H}) (COSName{DescendantFonts}:COSArray{[COSObject{26,
0}]}) (COSName{ToUnicode}:COSDictionary{(COSName{Length}:COSInt{791}) }) }
>>> 
>>> Best regards,
>>> Vadimo
>>> 
>>>> Am 16.11.2014 um 20:07 schrieb John Hewson <john@jahewson.com>:
>>>> 
>>>> Hi Vadimo
>>>> 
>>>> This error means that the widths in the embedded font file don’t match
>>>> the widths declared in the PDF font dictionary. You’ll need to update whichever is
>>>> wrong; however, PDFBox can’t edit fonts, so you can only use it to update the widths
>>>> in the PDF, which may or may not be what you want. The Widths array specifies the
>>>> width of each glyph and can be found at:
>>>> 
>>>> Page -> Resources -> Font -> Widths
>>>> 
>>>> The manner in which fonts are embedded in PDF is very complex, and this kind
of repair will require that you have a good understanding of the relevant concepts from the
ISO 32000 PDF specification. PDFBox provides the low-level APIs which you need, but you need
to understand PDF in order to use them.
>>>> 
>>>> Thanks
>>>> 
>>>> -- John
>>>> 
>>>>> On 16 Nov 2014, at 08:02, Vadim Bauer <bauer.vadim@gmail.com> wrote:
>>>>> 
>>>>> Hi, 
>>>>> 
>>>>> I have a PDF/A file for which Adobe preflight says 'Width information for rendered
>>>>> glyphs is inconsistent'.
>>>>> I would like to correct that with PDFBox, as the PDFs in question have
>>>>> only this one error.
>>>>> 
>>>>> As I understand it, I need to get all the text characters (Strings?) in the
>>>>> PDF and set (modify or recalculate?) their widths.
>>>>> 
>>>>> 
>>>>> The question is how I can achieve this with PDFBox. Can someone give me hints,
>>>>> maybe in pseudocode?
>>>>> Currently I am browsing the code, but I am quite lost on where to dig.
>>>>> 
>>>>> 
>>>>> Cheers,
>>>>> Vadimo
>>>> 
>>> 
>> 
> 

