pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Detecting if PDF contains only/mostly images.
Date Mon, 30 Oct 2017 13:53:10 GMT
On 30.10.2017 at 14:04, Lachezar Dobrev wrote:
>    I have to process PDF files that (supposedly) contain one big image
> per page, produced by a document scanner. I'd like to avoid
> performing PDF-to-image rendering in these cases and use the underlying
> image instead.
>    I am not well versed in all things PDF and have no idea how to
> detect whether a page has content other than a single image.
>    Please advise.

Please have a look at the ExtractImages.java source code. You can adapt
it to your needs.
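
As a rough illustration of the kind of check being asked about (not taken from ExtractImages.java), the decision can be sketched in plain Java without any PDFBox dependency: treat a page as a list of content-stream operators and accept it only if exactly one image XObject is drawn and nothing else paints. The class name SingleImagePageCheck, the Op holder, and the imageXObjectNames parameter are hypothetical names chosen for this sketch. Obtaining the operator list and the set of image XObject names from a real page is assumed to happen elsewhere, for example with PDFBox's parser and the page resources.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of PDFBox.
final class SingleImagePageCheck {

    // Content-stream operators that put marks on the page besides "Do".
    private static final Set<String> PAINTING_OPS = new HashSet<>(Arrays.asList(
            "Tj", "TJ", "'", "\"",                          // show text
            "S", "s", "f", "F", "f*", "B", "B*", "b", "b*", // stroke or fill a path
            "sh",                                           // paint a shading
            "BI"));                                         // begin an inline image

    // One operator; operand is only used for "Do" (the XObject name).
    static final class Op {
        final String name;
        final String operand;

        Op(String name, String operand) {
            this.name = name;
            this.operand = operand;
        }
    }

    // Returns true if the operators draw exactly one image XObject
    // and contain no other painting operator.
    static boolean isSingleImagePage(List<Op> ops, Set<String> imageXObjectNames) {
        int imagesDrawn = 0;
        for (Op op : ops) {
            if (PAINTING_OPS.contains(op.name)) {
                return false; // text, vector graphics, shading, or inline image
            }
            if ("Do".equals(op.name)) {
                if (!imageXObjectNames.contains(op.operand)) {
                    return false; // e.g. a form XObject with unknown contents
                }
                imagesDrawn++;
            }
        }
        return imagesDrawn == 1;
    }
}
```

The sketch is deliberately conservative: a scanned page that carries an invisible OCR text layer contains text-showing operators and is therefore rejected, and a page that draws its image through a form XObject is rejected as well. It also does not check whether the image covers the whole page.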

Tilman
