pdfbox-users mailing list archives

From Andreas Lehmkuehler <andr...@lehmi.de>
Subject Re: Extract Embedded files from pdf using pdfbox in .NET application
Date Tue, 11 Jun 2013 11:44:28 GMT
On 11.06.2013 07:06, Ramesh Shrestha wrote:
> Thanks,
>
> The Java example link I provided should have been:
> http://svn.apache.org/repos/asf/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/pdmodel/ExtractEmbeddedFiles.java
>
> But your suggestion WORKS.
>
> Now I am able to extract the attached file located in the *attachments tab*, but
> I *haven't been able to extract the attached file located in a page*. I am
> getting a null efTree in this case.
>
>          PDDocumentNameDictionary namesDictionary =
>              new PDDocumentNameDictionary(pdfDoc.getDocumentCatalog());
>          PDEmbeddedFilesNameTreeNode efTree = namesDictionary.getEmbeddedFiles();
>
> So I am now working on it.
Embedded files are always document-related. If an embedded file is referenced
on a single page, a file attachment annotation is used. Try something like this
to get all annotations of a single page:

List annotations = page.getAnnotations();

The one you are looking for has to be an instance of the class

org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationFileAttachment.
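
The approach above could be sketched roughly as follows. This is only an illustration, not a tested solution: the class name PageAttachmentExtractor, the method extractFromPage and the outputDir parameter are made up for this example, and it assumes the annotation's file specification is a PDComplexFileSpecification that carries an embedded file stream. Iterating over Object and testing with instanceof avoids generics entirely, which may also help when calling the library from .NET.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.filespecification.PDComplexFileSpecification;
import org.apache.pdfbox.pdmodel.common.filespecification.PDEmbeddedFile;
import org.apache.pdfbox.pdmodel.common.filespecification.PDFileSpecification;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationFileAttachment;

// Hypothetical helper class; the name is not part of PDFBox.
public class PageAttachmentExtractor {

    // Writes every file attached to the given page into outputDir
    // and returns the number of files written.
    public static int extractFromPage(PDPage page, File outputDir) throws IOException {
        int count = 0;
        // Iterate as Object so no generic type arguments are needed.
        for (Object annotation : page.getAnnotations()) {
            if (!(annotation instanceof PDAnnotationFileAttachment)) {
                continue; // some other annotation type (link, text note, ...)
            }
            PDFileSpecification spec = ((PDAnnotationFileAttachment) annotation).getFile();
            if (!(spec instanceof PDComplexFileSpecification)) {
                continue; // assumption: only complex specs carry embedded data
            }
            PDComplexFileSpecification complexSpec = (PDComplexFileSpecification) spec;
            PDEmbeddedFile embedded = complexSpec.getEmbeddedFile();
            if (embedded == null) {
                continue; // the specification only points to an external file
            }
            String name = complexSpec.getFile();
            if (name == null) {
                name = "attachment-" + count; // fallback name, chosen for this sketch
            }
            // Keep only the last path segment so the file stays inside outputDir.
            File target = new File(outputDir, new File(name).getName());
            try (InputStream in = embedded.createInputStream();
                 OutputStream out = new FileOutputStream(target)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
            count++;
        }
        return count;
    }
}
```

Calling extractFromPage once for each page of the document should cover the attachments that do not show up in the embedded files name tree.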

> On Mon, Jun 10, 2013 at 7:38 PM, Andreas Lehmkuehler <andreas@lehmi.de> wrote:
>
>> Hi,
>>
>> On 10.06.2013 11:22, Ramesh Shrestha wrote:
>>
>>   Hi,
>>>
>>>
>>>     I am developing a .NET application using pdfbox to extract metadata,
>>> content and attached files from PDFs.
>>>
>>> I was able to extract metadata and content, but stuck while extracting
>>> attached/embedded files.
>>>
>>> I have a pdf with embedded/attached doc file and want to retrieve that
>>> file. I have gone through the java example -
>>> http://www.docjar.com/html/api/org/apache/pdfbox/examples/pdmodel/EmbeddedFiles.java.html
>>>
>>> But while trying to use it in .NET, I got "non generic type
>>> 'java.util.Map' cannot be used with type arguments" in the following
>>> code snippet:
>>>
>>> java.util.Map<String, COSObjectable> names = efTree.getNames();
>>>
>>> So I will be grateful if anybody can help me extract the file from the PDF.
>>>
>> I'm not a .NET expert and don't know what may cause that issue. But maybe
>> it is a good idea to just omit the generics and try something like this:
>>
>> java.util.Map names = efTree.getNames();
>>
>>   Thanks in advance.
>>>
>>
>> HTH
>> Andreas Lehmkühler

BR
Andreas Lehmkühler

