pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Detect Invisible Text (placed by tools which make searchable PDF)
Date Fri, 03 May 2019 15:07:33 GMT
These answers may help:
https://stackoverflow.com/questions/50044892/pdfbox-invisible-text-from-pdftextstripper-not-clip-path-or-color-issue
https://stackoverflow.com/questions/50487520/pdfbox-2-0-invisible-text-from-pdftextstripper

Tilman

On 03.05.2019 at 17:02, Luca Loiodice wrote:
> Hello,
>
> I need to remove (often low-quality) invisible text placed on images by
> tools which use OCR to make searchable PDFs.
>
> We use PDFBox ourselves to make searchable PDFs, and we call
> setRenderingMode(RenderingMode.NEITHER) when we place the text to
> make it invisible. We also use PDFBox's text stripper to remove text from
> PDFs.
>
> What I am not sure about is whether there is a way for the text stripper
> to identify the characters that have been placed as invisible, so that
> only those are removed in some cases.
>
> Thanks for your help,
> Luca
>
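
For illustration only: RenderingMode.NEITHER corresponds to text rendering
mode 3 in a PDF content stream (the "3 Tr" operator), which is what makes
the OCR layer invisible. The sketch below is not PDFBox code and not the
approach from the linked answers. It is a toy scanner, using only the Java
standard library, that tracks the current Tr value and collects the strings
shown with Tj while the mode is 3. The class and method names are
hypothetical, and it assumes a simplified stream: only literal strings shown
with Tj, no nested parentheses, no q/Q graphics state save and restore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
class InvisibleTextScan {

    // Alternative 1: "<int> Tr" sets the text rendering mode.
    // Alternative 2: "(<literal string>) Tj" shows a string.
    private static final Pattern OPS = Pattern.compile(
            "(\\d+)\\s+Tr\\b|\\(((?:\\\\.|[^\\\\()])*)\\)\\s*Tj\\b");

    // Returns the literal strings shown while rendering mode 3 is active.
    static List<String> invisibleStrings(String contentStream) {
        List<String> hidden = new ArrayList<>();
        int mode = 0; // PDF default rendering mode: fill (visible)
        Matcher m = OPS.matcher(contentStream);
        while (m.find()) {
            if (m.group(1) != null) {
                // A Tr operator: remember the new rendering mode.
                mode = Integer.parseInt(m.group(1));
            } else if (mode == 3) {
                // A Tj operator under mode 3: neither filled nor stroked.
                hidden.add(m.group(2));
            }
        }
        return hidden;
    }

    public static void main(String[] args) {
        String stream = "BT /F1 12 Tf 0 Tr (Visible) Tj 3 Tr (hidden ocr) Tj ET "
                + "BT 0 Tr (also visible) Tj ET";
        System.out.println(invisibleStrings(stream));
    }
}
```

With PDFBox itself, the equivalent check would presumably read the rendering
mode from the text state of the current graphics state while the stripper
processes each piece of text, rather than parsing the stream by hand. Note
that mode 3 is only one way to hide text; white text, text behind images and
clipped text are not covered by this check.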



