pdfbox-users mailing list archives

From Maruan Sahyoun <sahy...@fileaffairs.de>
Subject Re: Questions about Streaming PDFs and Form fields
Date Fri, 17 Jan 2014 07:12:05 GMT
Hi Tom,

PDF is not a format that is built sequentially; it is a random access format. To lower
memory consumption you can pass a temp file, which will be used to store intermediate data.
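
To illustrate the random access point: a PDF ends with a "startxref" keyword followed by the byte offset of its cross-reference data, so a reader starts at the end of the file and jumps around from there. The following is a minimal sketch using only the JDK, not PDFBox code. The class name StartXrefReader and the 1024-byte tail size are assumptions made for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of PDFBox.
final class StartXrefReader {

    /** Returns the byte offset written after the last "startxref" keyword. */
    static long readStartXref(File pdf) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(pdf, "r")) {
            // Read only the tail of the file, however large the PDF is.
            // 1024 bytes is an assumed window size for this sketch.
            int tailLength = (int) Math.min(1024L, raf.length());
            byte[] tail = new byte[tailLength];
            raf.seek(raf.length() - tailLength);
            raf.readFully(tail);

            // ISO-8859-1 maps every byte to exactly one char.
            String text = new String(tail, StandardCharsets.ISO_8859_1);
            int keyword = text.lastIndexOf("startxref");
            if (keyword < 0) {
                throw new IOException("No startxref keyword in the last "
                        + tailLength + " bytes");
            }

            // The offset is the decimal number that follows the keyword.
            int pos = keyword + "startxref".length();
            while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            int end = pos;
            while (end < text.length() && Character.isDigit(text.charAt(end))) {
                end++;
            }
            if (end == pos) {
                throw new IOException("startxref is not followed by an offset");
            }
            return Long.parseLong(text.substring(pos, end));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java StartXrefReader some.pdf
        System.out.println(readStartXref(new File(args[0])));
    }
}
```

Because the objects in the file are located through offsets like this one, a form field cannot simply be handled "as it is encountered" while reading from the front.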

Take a look at http://pdfbox.apache.org/docs/1.8.3/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html,
especially the descriptions of load and loadNonSeq (which is the preferred method).

PDFStreamParser is used internally to parse PDF streams (an internal PDF structure); it does
not read the document itself as a stream.


Maruan Sahyoun

On 17.01.2014 at 04:39, Tom Kesling <kesling@gmail.com> wrote:

> Hello,
> I would like to ask a few questions about Streaming with PDFBox.
> I use the term Streaming for lack of a better term.  My code will
> execute in a JEE Container so I need to conserve memory as much as
> possible.
> Goals:
> I want to be able to set form fields in a PDF without loading the PDF into
> memory.
> I would like to stream in the PDF and set the fields as they are
> encountered.
> A new PDF will be streamed to disk with the populated form fields.
> I would also like to be able to read form fields from a PDF without loading
> it into memory.
> I would like to stream in the PDF and read the fields as they are
> encountered.
> I've messed around with the PDFStreamingParser but I haven't figured out
> how to locate form fields.
> If anyone can give me any guidance or examples of how to do this that would
> help a lot.
> Any help is appreciated.
> Thanks,
> T
