pdfbox-users mailing list archives

From Andreas Lehmkühler <andr...@lehmi.de>
Subject Re: Extract vectors
Date Tue, 03 Feb 2009 17:40:25 GMT
Jeremias Maerki wrote:
> On 03.02.2009 18:05:14 Andreas Lehmkühler wrote:
>>> Well, Adobe Acrobat was able to detect the images with its "Export images" functionality, so I assume they are embedded somehow by an XObject.
>>>  
>>> I noticed you had an ExtractImages class, would I be able to modify this to extract vectors?
>>> Would I need it to give me a list of Fill/Stroke/Path data points in order for it to extract correctly?
>> I suggest giving it a try. If the images are embedded as XObjects
>> ExtractImages should do it.
> 
> No, I've just checked: ExtractImages can only handle PDXObjectImage (i.e.
> bitmap images), not PDXObject of which PDXObjectForm is a subclass.
Sorry, my fault, I didn't realize that little detail...

But an alternative would be to modify ExtractImages as follows:

- use resources.getXObjects() instead of resources.getImages()
- iterate through the XObjects filtering with the subtype "Form"
- create PDXObjectForm-objects
- save the stream of the XObject to a file
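
The steps above can be sketched roughly as follows. This is only an illustration of the filter-and-save loop, not PDFBox code: the XObject class and the saveFormXObjects method are hypothetical stand-ins, and in a real modification the map would come from resources.getXObjects() and the bytes from the stream of each PDXObjectForm. The ".txt" suffix and the sample stream contents are assumptions as well.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FormXObjectSketch {

    // Hypothetical stand-in for an XObject: its subtype name and raw stream bytes.
    // In PDFBox these would be PDXObject instances taken from the page resources.
    static final class XObject {
        final String subtype;
        final byte[] stream;

        XObject(String subtype, byte[] stream) {
            this.subtype = subtype;
            this.stream = stream;
        }
    }

    // Writes the stream of every XObject whose subtype is "Form" to
    // <outDir>/<name>.txt and returns the paths written, in map iteration order.
    static List<Path> saveFormXObjects(Map<String, XObject> xobjects, Path outDir)
            throws IOException {
        Files.createDirectories(outDir);
        List<Path> written = new ArrayList<>();
        for (Map.Entry<String, XObject> entry : xobjects.entrySet()) {
            XObject xobject = entry.getValue();
            if (!"Form".equals(xobject.subtype)) {
                continue; // skip bitmap images ("Image") and anything else
            }
            Path target = outDir.resolve(entry.getKey() + ".txt");
            Files.write(target, xobject.stream);
            written.add(target);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Sample data only: one bitmap image and one form with a short path.
        Map<String, XObject> xobjects = new LinkedHashMap<>();
        xobjects.put("Im0", new XObject("Image", new byte[] {1, 2, 3}));
        xobjects.put("Fm0", new XObject("Form",
                "0 0 m 100 100 l S".getBytes(StandardCharsets.US_ASCII)));

        Path outDir = Files.createTempDirectory("form-xobjects");
        for (Path p : saveFormXObjects(xobjects, outDir)) {
            System.out.println("wrote " + p);
        }
    }
}
```

Note that the saved file would contain the form's content stream (path, fill and stroke operators), not a ready-made vector image, so it would still need further processing.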

Andreas
