pdfbox-users mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: embedded files in PDF
Date Wed, 13 Jun 2012 13:04:19 GMT
Have a look at the sources for ExtractText in the recent 1.7.0
release: it now extracts embedded PDFs as well.

http://svn.apache.org/repos/asf/pdfbox/branches/1.7/pdfbox/src/main/java/org/apache/pdfbox/ExtractText.java
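
As a rough sketch of the general approach (this is not the ExtractText code itself), the embedded-files name tree can be walked recursively: a node may hold its entries directly in getNames(), in child nodes returned by getKids(), or both, and either call can return null. The class name EmbeddedFileLister and its two methods are hypothetical names made up for this illustration; the sketch assumes the PDFBox 1.x classes are on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentNameDictionary;
import org.apache.pdfbox.pdmodel.PDEmbeddedFilesNameTreeNode;
import org.apache.pdfbox.pdmodel.common.PDNameTreeNode;

// Hypothetical helper for illustration only; not part of PDFBox.
public class EmbeddedFileLister {

    // Collect the name-tree keys of this node and of all nodes below it.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void collect(PDNameTreeNode node, List<String> out) throws IOException {
        // Entries stored directly on this node (null if there are none).
        Map<String, ?> names = node.getNames();
        if (names != null) {
            out.addAll(names.keySet());
        }
        // Child nodes (null if this node is a leaf).
        List<?> kids = node.getKids();
        if (kids != null) {
            for (Object kid : kids) {
                collect((PDNameTreeNode) kid, out);
            }
        }
    }

    // Return the keys of all embedded files, or an empty list if the
    // document has no Names dictionary or no EmbeddedFiles tree.
    static List<String> listEmbeddedFileNames(PDDocument document) throws IOException {
        List<String> result = new ArrayList<String>();
        PDDocumentNameDictionary names = document.getDocumentCatalog().getNames();
        if (names == null) {
            return result;
        }
        PDEmbeddedFilesNameTreeNode root = names.getEmbeddedFiles();
        if (root != null) {
            collect(root, result);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        PDDocument document = PDDocument.load(new File(args[0]));
        try {
            for (String name : listEmbeddedFileNames(document)) {
                System.out.println("Embedded file: " + name);
            }
        } finally {
            document.close();
        }
    }
}
```

To get at the bytes rather than just the keys, each value in the getNames() map should be a PDComplexFileSpecification, and its getEmbeddedFile().createInputStream() should give the file content; check this against the ExtractText source linked above before relying on it.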

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jun 13, 2012 at 8:58 AM, Czech, Christian <c.czech@elo.com> wrote:
> Hello,
>
> How can I extract embedded files from a PDF?
>
> Here is my source:
>
> document = PDDocument.load(inputFile);
>
> PDDocumentNameDictionary names = new PDDocumentNameDictionary( document.getDocumentCatalog() );
>
> PDEmbeddedFilesNameTreeNode embeddedTree = names.getEmbeddedFiles();
>
> if (embeddedTree == null) {
>      System.out.println("Embedded files doesn't exist");
> } else {
>      System.out.println("Size: " + embeddedTree.getKids().size());
> }
>
> Thanks
>
> Christian
>
>
> ________________________________
>
> ELO Digital Office GmbH
> Firmensitz: Heilbronner Strasse 150, 70191 Stuttgart
> Fon: +49 711 806089-0, Fax: +49 711 806089-19, Web: www.elo.com
> Geschäftsführer: Karl Heinz Mosbach, Matthias Thiele
> BW-Bank, Konto-Nr. 2089782, BLZ 600 501 01
> Registergericht Stuttgart HRB 15059 - USt-IdNr.: DE812471516
