pdfbox-users mailing list archives

From Andreas Lehmkühler <andr...@lehmi.de>
Subject Re: PrintTextLocations 1.8 vs 2.0
Date Wed, 16 Mar 2016 10:42:38 GMT
Hi,

> On 16 March 2016 at 09:52, Peter Prusinowski <peter.prusi@gmx.de> wrote:
> 
> 
> Good morning,
> 
> thank you for the hints. I am now overriding showGlyph() and trying to 
> get the value with
> 
>              PDSimpleFont sf = (PDSimpleFont) font;
>              String name = sf.getEncoding().getName(code);
>              sf.getPath(name).getBounds()
> 
> but I get the same height no matter which font size is set. This 
> happens with Type 1 and TrueType fonts. What am I doing wrong?
The font always provides the same unscaled glyph shapes, regardless of the
font size. To get the size as rendered, you have to take the text
transformation matrix and the font matrix into account. Have a look at
PageDrawer#showFontGlyph to see how to do so.
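
To illustrate the scaling, here is a minimal sketch that uses only
java.awt.geom. The class name GlyphScaleSketch, the rectangle standing in for
a glyph outline, the 0.001 font matrix (typical for fonts with 1000 units per
em) and a text matrix that only scales by the font size are all assumptions
made for the example. In PDFBox the two transforms would come from the text
rendering matrix passed to showFontGlyph and from font.getFontMatrix().

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.GeneralPath;
import java.awt.geom.Rectangle2D;

public class GlyphScaleSketch {

    // Height of the glyph outline after applying the font matrix first and
    // then a text matrix that (by assumption) only scales by fontSize.
    public static double scaledHeight(GeneralPath glyph, double fontSize) {
        AffineTransform at = AffineTransform.getScaleInstance(fontSize, fontSize);
        // Assumed font matrix: 1000 glyph units per em.
        at.concatenate(AffineTransform.getScaleInstance(0.001, 0.001));
        return at.createTransformedShape(glyph).getBounds2D().getHeight();
    }

    // Hypothetical stand-in for font.getPath(name): a 500 x 700 unit box.
    public static GeneralPath sampleGlyph() {
        GeneralPath path = new GeneralPath();
        path.append(new Rectangle2D.Double(0, 0, 500, 700), false);
        return path;
    }

    public static void main(String[] args) {
        GeneralPath glyph = sampleGlyph();
        // The unscaled bounds are identical for every font size.
        System.out.println("unscaled: " + glyph.getBounds2D().getHeight());
        // The transformed bounds grow with the font size.
        System.out.println("size 14:  " + scaledHeight(glyph, 14));
        System.out.println("size 28:  " + scaledHeight(glyph, 28));
    }
}
```

With these assumed values the unscaled height stays at 700 units, while the
transformed height is roughly 9.8 at size 14 and 19.6 at size 28.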

HTH
Andreas
> 
> On 07.03.2016 at 18:16, Tilman Hausherr wrote:
> > On 07.03.2016 at 11:46, Peter Prusinowski wrote:
> >> Okay, thank you for the information. I tried to get the height with 
> >> getPath(). If it's one of the 14 standard fonts, I can get the height 
> >> with PDType1Font.fontName.getPath(text.getUnicode()).getBounds(). 
> >> But I don't know how to get the information from other fonts in a 
> >> generic way. Do you have a hint for me?
> >
> > It is not available for all fonts. It is available for all 
> > PDSimpleFont objects, except for PDType3Font (which doesn't draw just 
> > vectors).
> >
> > The best approach is to look at the source code of PageDrawer.java.
> >
> > createGlyph2D() returns a Glyph2D for the font, on which you can 
> > call glyph2D.getPathForCharacterCode(code).
> >
> > See also showFontGlyph(), which you can override in a subclass.
> >
> > Also have a look at showGlyph(), which distinguishes between Type 3 
> > fonts and others. See also CustomGraphicsStreamEngine.
> >
> > Tilman
> >
> >
> >
> >>
> >> Peter
> >>
> >> On 06.03.2016 at 17:40, Tilman Hausherr wrote:
> >>>
> >>> In 1.8, for Standard 14 fonts (yours is one), it uses the bounding 
> >>> box of each glyph. Within a string, it keeps a running maximum, 
> >>> which results in the weird effect that the "d" is slightly taller. 
> >>> If another glyph is appended to the string, the larger height is 
> >>> kept.
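
The running maximum described above can be sketched as follows. This is an
illustration of the described behaviour, not the 1.8 source code. The class
name RunningMaxHeight is made up, the heights 10.052 and 10.248 are rounded
from the 1.8 output quoted further down, and 7.5 is a hypothetical value for
a shorter glyph.

```java
public class RunningMaxHeight {

    // For each glyph, report the largest glyph height seen so far in the
    // string rather than the glyph's own height.
    public static double[] reportedHeights(double[] glyphHeights) {
        double[] reported = new double[glyphHeights.length];
        double max = 0;
        for (int i = 0; i < glyphHeights.length; i++) {
            max = Math.max(max, glyphHeights[i]);
            reported[i] = max;
        }
        return reported;
    }

    public static void main(String[] args) {
        // Assumed per-glyph heights for "H", a shorter glyph, "l" and "d".
        double[] own = {10.052, 7.5, 10.052, 10.248};
        for (double h : reportedHeights(own)) {
            System.out.println(h);
        }
    }
}
```

Under these assumptions every glyph reports 10.052 until the taller "d"
raises the maximum to 10.248, which matches the pattern in the 1.8 output.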
> >>>
> >>> In 2.0 (and in 1.8 for non-Standard 14 fonts), it uses 1/2 of the 
> >>> bounding box height from the font descriptor. The full (non-halved) 
> >>> bounding box is usually too tall.
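
To put numbers on the 2.0 rule quoted above, here is a small sketch. The class
name HalfBBoxHeight and the assumption of 1000 glyph units per em are mine.
The bounding box height of 1190 units is not stated anywhere in this thread;
it is simply the value that reproduces height=8.33 at fs=14.0 under this
formula.

```java
public class HalfBBoxHeight {

    // Half of the font bounding box height, converted from glyph units
    // (assumed 1000 per em) to text space by the font size.
    public static double heuristicHeight(double bboxHeightUnits, double fontSize) {
        return bboxHeightUnits / 1000.0 * fontSize / 2.0;
    }

    public static void main(String[] args) {
        // 1190 is a back-calculated assumption, see the note above.
        System.out.println(heuristicHeight(1190, 14));
    }
}
```

Because the value depends only on the font bounding box and the font size, it
is the same for every glyph, which is consistent with the constant height in
the 2.0 output.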
> >>>
> >>> Anyway, the 1.8 logic would work for you for standard 14 fonts, but 
> >>> not for all other fonts.
> >>>
> >>> So there is no bug in 1.8, nor in 2.0.
> >>>
> >>> Tilman
> >>>
> >>> On 03.03.2016 at 19:05, Tilman Hausherr wrote:
> >>>> On 03.03.2016 at 09:11, Peter Prusinowski wrote:
> >>>>> Okay, I am trying to replace some words in documents and use 
> >>>>> text.height to "delete" these words. Here is an example document: 
> >>>>> http://workupload.com/file/G8ipDe8j
> >>>>
> >>>> The getHeightDir() is not the best strategy, for the reason I 
> >>>> mentioned yesterday. In your case, you should call getPath() on the 
> >>>> glyphs and get the bounding box. Or just get the font bounding box 
> >>>> (there's a method) height, however that one is often too high, so 
> >>>> there's a risk that you blank the line above.
> >>>>
> >>>> But thanks for the file, I'll try to find out why it is different. 
> >>>> The heights in 1.8 are surprising, usually they are never so 
> >>>> "perfect" (as I said yesterday). And for some reason, in 1.8 the 
> >>>> height of the last glyph is slightly different although it is all 
> >>>> in one string.
> >>>>
> >>>> 1.8:
> >>>> String[100.0,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=10.108002]H
> >>>> String[110.108,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=7.784004]e
> >>>> String[117.892006,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=3.8919983]l
> >>>> String[121.784004,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=3.8919983]l
> >>>> String[125.676,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=8.553993]o
> >>>> String[134.23,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=3.8919983]
> >>>> String[138.122,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=13.216003]W
> >>>> String[151.338,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=8.554001]o
> >>>> String[159.892,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=5.445999]r
> >>>> String[165.338,92.0 fs=14.0 xscale=14.0 height=10.052001 
> >>>> space=3.8920004 width=3.8919983]l
> >>>> String[169.23,92.0 fs=14.0 xscale=14.0 *height=10.248001* 
> >>>> space=3.8920004 width=8.554001]d  <========= ???
> >>>>
> >>>> 2.0:
> >>>> String[100.0,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=10.108002]H
> >>>> String[110.108,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=7.7839966]e
> >>>> String[117.892,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=3.8919983]l
> >>>> String[121.784,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=3.8919983]l
> >>>> String[125.675995,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=8.554001]o
> >>>> String[134.23,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=3.8919983]
> >>>> String[138.122,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=13.216003]W
> >>>> String[151.338,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=8.554001]o
> >>>> String[159.892,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=5.445999]r
> >>>> String[165.338,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=3.8919983]l
> >>>> String[169.23,92.0 fs=14.0 xscale=14.0 height=8.33 
> >>>> space=3.8920004 width=8.554001]d
> >>>>
> >>>>
> >>>>
> >>>> Tilman
> >>>>
> >>>>>
> >>>>> Peter
> >>>>>
> >>>>> On 02.03.2016 at 19:24, Tilman Hausherr wrote:
> >>>>>> On 02.03.2016 at 14:48, Peter Prusinowski wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> I have noticed that the PrintTextLocations example in 1.8 and 
> >>>>>>> 2.0 gives different results for text.getHeightDir(). In 1.8 the 
> >>>>>>> value seems to be right, but in 2.0 it is too small. I tried 
> >>>>>>> with some documents created by PDFBox. Is this a bug?
> >>>>>>
> >>>>>> Maybe, maybe not. The height is a heuristic value to help with 
> >>>>>> text extraction, which is sometimes computed differently in 2.0, 
> >>>>>> and it is usually about the height of an "a". Please upload the PDF.
> >>>>>>
> >>>>>> Tilman
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> >>>>>> For additional commands, e-mail: users-help@pdfbox.apache.org