pdfbox-users mailing list archives

From Gilad Denneboom <gilad.denneb...@gmail.com>
Subject Re: Merging a lot of small pdf documents (1/2 pages) into one pdf document
Date Mon, 03 Jun 2013 13:01:03 GMT
Try loading the file using a scratch file:
http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/PDDocument.html#load(java.lang.String,%20org.apache.pdfbox.io.RandomAccess)

This should help reduce memory use.
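
To see why this helps: as far as I understand, with a scratch file the stream data is buffered on disk rather than on the heap, while the document's object structure still lives in memory. The sketch below is not PDFBox code. `ScratchBuffer` and its methods are hypothetical names built only on `java.io.RandomAccessFile`, and they only illustrate the idea of appending chunks to a temporary file and reading them back by offset.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical illustration (not a PDFBox class): keeps byte chunks
 * in a temporary file instead of holding them all on the heap.
 */
public class ScratchBuffer implements AutoCloseable {
    private final File file;
    private final RandomAccessFile raf;

    public ScratchBuffer() throws IOException {
        // The temp file plays the role of the scratch file.
        file = File.createTempFile("scratch", ".tmp");
        file.deleteOnExit();
        raf = new RandomAccessFile(file, "rw");
    }

    /** Appends a chunk at the end of the file and returns its starting offset. */
    public long append(byte[] chunk) throws IOException {
        long offset = raf.length();
        raf.seek(offset);
        raf.write(chunk);
        return offset;
    }

    /** Reads {@code length} bytes starting at {@code offset} back into memory. */
    public byte[] read(long offset, int length) throws IOException {
        byte[] out = new byte[length];
        raf.seek(offset);
        raf.readFully(out);
        return out;
    }

    /** Number of bytes currently stored on disk. */
    public long size() throws IOException {
        return raf.length();
    }

    @Override
    public void close() throws IOException {
        raf.close();
        file.delete();
    }

    public static void main(String[] args) throws IOException {
        try (ScratchBuffer scratch = new ScratchBuffer()) {
            long first = scratch.append("page-1-content".getBytes(StandardCharsets.US_ASCII));
            long second = scratch.append("page-2-content".getBytes(StandardCharsets.US_ASCII));
            // Only the chunk that is asked for is pulled back into memory.
            String back = new String(scratch.read(second, 14), StandardCharsets.US_ASCII);
            System.out.println(first + " " + second + " " + back);
        }
    }
}
```

In PDFBox itself you would not write this class. You pass an `org.apache.pdfbox.io.RandomAccess` implementation as the second argument to the `load` method linked above.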


On Mon, Jun 3, 2013 at 2:50 PM, mihaela olteanu <mihaela_ol@yahoo.com> wrote:

> Hello,
>
> I have a use case where I need to merge a large number of small PDF
> documents (hundreds of thousands) into one PDF document.
> Currently I call
> org.apache.pdfbox.util.PDFMergerUtility.appendDocument(destination, source)
> for each source document, rather than calling mergeDocuments() in the same
> class directly, because I also need to add some bookmarks. Finally, I save
> the document.
>
> Is there a better way of doing this with a lower memory footprint? I tried
> importing each page from the source documents with
> PDDocument.importPage(), but it still throws errors in version 1.8.2.
>
> When I call PDDocument.load(File), is the whole document loaded into memory?
> If so, saving the generated PDF after merging a subset of documents and
> then reloading it would not decrease memory use anyway ...
>
> Could somebody point me to the right way of doing this?
>
> Thanks,
> Mihaela
