pdfbox-users mailing list archives

From John Hewson <j...@jahewson.com>
Subject Re: Question about PDFBox parsing
Date Tue, 27 Sep 2016 22:17:27 GMT

> On 26 Sep 2016, at 16:44, Ali Husain <smahusain@gmail.com> wrote:
> 
> Hello!
> 
> I'm new to PDFBox and I'm trying to extract inline images from a PDF document.
> 
> I'm having trouble with an image that has many parts - here's the breakdown. (The image is also attached.)
> 
> <image.png>
> 
> The XObject with 13 elements is actually one image; the elements are all different components of the picture. I'm not able to maintain the order; instead, I get each image individually.
> 
> Has anyone had a similar problem? Is there a known solution?

Take a look at the CustomGraphicsStreamEngine example:

https://github.com/apache/pdfbox/blob/trunk/examples/src/main/java/org/apache/pdfbox/examples/rendering/CustomGraphicsStreamEngine.java

You can subclass the engine and override its image drawing methods if you want to know
when and where specific images are drawn on the page. See the PageDrawer source code for how
to calculate the image position via the current transformation matrix (CTM).
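
As a rough illustration of the position calculation, here is a minimal sketch that uses only java.awt.geom and does not call PDFBox itself. It relies on the fact that PDF paints every image into the unit square (0,0)-(1,1), so the CTM alone gives the placement. The class name ImagePlacement, its two helper methods, and the sample matrices are hypothetical; in a real subclass you would take the matrix from the graphics state inside your overridden image drawing method, and record the boxes in the order the images are drawn.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.util.List;

public class ImagePlacement {

    /**
     * Page-space bounding box of one image drawn under the given CTM.
     * The image occupies the unit square before the CTM is applied.
     */
    static Rectangle2D boundsOnPage(AffineTransform ctm) {
        Rectangle2D unitSquare = new Rectangle2D.Double(0, 0, 1, 1);
        return ctm.createTransformedShape(unitSquare).getBounds2D();
    }

    /**
     * Bounding box covering several image parts, e.g. the tiles of one
     * picture that was split into many XObjects. Expects a non-empty list.
     */
    static Rectangle2D boundsOfAll(List<AffineTransform> ctms) {
        Rectangle2D all = boundsOnPage(ctms.get(0));
        for (AffineTransform ctm : ctms.subList(1, ctms.size())) {
            all = all.createUnion(boundsOnPage(ctm));
        }
        return all;
    }

    public static void main(String[] args) {
        // Hypothetical CTMs in PDF order [a b c d e f]; AffineTransform's
        // six-argument constructor takes the values in the same order.
        AffineTransform left = new AffineTransform(200, 0, 0, 100, 50, 300);
        AffineTransform right = new AffineTransform(200, 0, 0, 100, 250, 300);

        System.out.println("left part:  " + boundsOnPage(left));
        System.out.println("both parts: " + boundsOfAll(List.of(left, right)));
    }
}
```

Keeping each part's box together with its index in the drawing sequence should be enough to put the 13 parts back into one picture, assuming they are plain, non-overlapping tiles.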

— John

> Thank you,
> Ali
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
> For additional commands, e-mail: users-help@pdfbox.apache.org

