pdfbox-users mailing list archives

From Tilman Hausherr <THaush...@t-online.de>
Subject Re: Dump all objects on page with coordinates (images, text, color boxes, lines)
Date Sat, 08 Oct 2016 05:06:32 GMT
On 07.10.2016 at 17:35, Christopher Begley wrote:
> Hello All!
> New to PDFBox. My task is to basically map ALL elements on a page of a PDF document.
> This includes text, color boxes, highlights, underlines, lines, curves, images, etc.
> Does there exist a way to dump all objects on a page and then retrieve information about
> each object? (Specifically, coordinates that can then be mapped to page coordinates in
> another file format.)
> From my limited perusal of the documentation, I don't see any obvious/intuitive way
> to do this. Can someone point me in the right direction on how to approach this problem?

If you want "all", then our ready-made tools won't help, because each of
them is too narrow. Download the sources, run PDFDebugger, and trace
through PDFGraphicsStreamEngine and its parent class, PDFStreamEngine.
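
To give an idea of what you would end up collecting, here is a minimal sketch in plain Java. It is not PDFBox code: PageElementCollector and Element are hypothetical names, and the methods only loosely mirror the path callbacks of PDFGraphicsStreamEngine (moveTo, lineTo, curveTo, closePath, strokePath, fillPath, endPath) with simplified signatures. It assumes the coordinates it receives are already in page user space (origin bottom-left, y pointing up) and that the target format uses a top-left origin; page rotation and crop box offsets are ignored.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical collector, not part of PDFBox. A subclass of
 * PDFGraphicsStreamEngine could forward its path callbacks to it.
 */
final class PageElementCollector {

    /** One painted path with its bounding box in top-left-origin coordinates. */
    static final class Element {
        final String kind;
        final Rectangle2D bounds;

        Element(String kind, Rectangle2D bounds) {
            this.kind = kind;
            this.bounds = bounds;
        }

        @Override
        public String toString() {
            return kind + " " + bounds;
        }
    }

    private final Path2D.Double path = new Path2D.Double();
    private final AffineTransform toTopLeft;
    private final List<Element> elements = new ArrayList<>();

    /** @param pageHeight page height in points (assumed: no rotation, no crop offset) */
    PageElementCollector(double pageHeight) {
        // Flip the y axis: x' = x, y' = pageHeight - y.
        toTopLeft = new AffineTransform(1, 0, 0, -1, 0, pageHeight);
    }

    // Path construction: accumulate segments until a painting operator arrives.
    void moveTo(float x, float y) { path.moveTo(x, y); }
    void lineTo(float x, float y) { path.lineTo(x, y); }
    void curveTo(float x1, float y1, float x2, float y2, float x3, float y3) {
        path.curveTo(x1, y1, x2, y2, x3, y3);
    }
    void closePath() { path.closePath(); }

    // Painting operators: record the current path, then start a fresh one.
    void strokePath() { record("stroke"); }
    void fillPath() { record("fill"); }

    // A path that ends without being painted is simply discarded.
    void endPath() { path.reset(); }

    List<Element> elements() { return elements; }

    private void record(String kind) {
        Rectangle2D bounds = toTopLeft.createTransformedShape(path).getBounds2D();
        elements.add(new Element(kind, bounds));
        path.reset();
    }

    public static void main(String[] args) {
        // 792 pt is the height of a US Letter page.
        PageElementCollector collector = new PageElementCollector(792);
        // A filled 100 x 20 pt box, e.g. a highlight, given in PDF coordinates.
        collector.moveTo(72, 700);
        collector.lineTo(172, 700);
        collector.lineTo(172, 720);
        collector.lineTo(72, 720);
        collector.closePath();
        collector.fillPath();
        collector.elements().forEach(System.out::println);
    }
}
```

Images and text would need their own callbacks in the engine subclass (drawImage and showGlyph, going by the class you are asked to trace); the same flip transform would apply to their positions.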

