pdfbox-users mailing list archives

From Maruan Sahyoun <sahy...@fileaffairs.de>
Subject Re: Questions about Streaming PDFs and Form fields
Date Fri, 17 Jan 2014 07:12:05 GMT
Hi Tom,

PDF is not a format that is built sequentially; it is a random-access format, so a reader
has to jump around in the file rather than process it in a single pass. To lower memory
consumption you can pass a temp file when loading, which will be used to store intermediate data.
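
To illustrate the random-access point: a PDF ends with a "startxref" keyword followed by the byte offset of the cross-reference section, so a reader typically begins at the end of the file and seeks from there. Below is a minimal sketch in plain Java. It does not use the PDFBox API; the class name StartXrefLocator, the method findStartXref, and the tailSize parameter are made up for illustration, and it assumes the keyword sits within the last tailSize bytes.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox: shows why PDF needs random access.
public class StartXrefLocator {

    private static final String KEYWORD = "startxref";

    /**
     * Returns the byte offset written after the last "startxref" keyword,
     * or -1 if no keyword with a number is found in the final tailSize bytes.
     */
    public static long findStartXref(File pdf, int tailSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(pdf, "r")) {
            long length = raf.length();
            int toRead = (int) Math.min(length, (long) tailSize);
            byte[] tail = new byte[toRead];

            // Jump straight to the end of the file; nothing before it is read.
            raf.seek(length - toRead);
            raf.readFully(tail);

            // ISO-8859-1 maps every byte to one char, so indexes stay aligned.
            String text = new String(tail, StandardCharsets.ISO_8859_1);
            int idx = text.lastIndexOf(KEYWORD);
            if (idx < 0) {
                return -1;
            }

            // Skip whitespace after the keyword, then collect the digits.
            int i = idx + KEYWORD.length();
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            while (i < text.length() && text.charAt(i) >= '0' && text.charAt(i) <= '9') {
                i++;
            }
            if (i == start) {
                return -1;
            }
            return Long.parseLong(text.substring(start, i));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java StartXrefLocator <file.pdf>");
            return;
        }
        System.out.println(findStartXref(new File(args[0]), 1024));
    }
}
```

The offset returned here only points at the cross-reference data, which in turn points at the individual objects. That is why fields cannot simply be handled "as they are encountered" while reading front to back.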


Take a look at the PDDocument Javadoc at http://pdfbox.apache.org/docs/1.8.3/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html,
especially the descriptions of load and loadNonSeq (the latter is the preferred method).

PDFStreamParser is used internally to parse PDF streams (an internal PDF structure).

BR

Maruan Sahyoun

On 17.01.2014 at 04:39, Tom Kesling <kesling@gmail.com> wrote:

> Hello,
> I would like to ask a few questions about Streaming with PDFBox.
> 
> I use the term Streaming for lack of a better term.  My code will
> execute in a JEE Container so I need to conserve memory as much as
> possible.
> 
> Goals:
> I want to be able to set form fields in a PDF without loading the PDF into
> memory.
> I would like to stream in the PDF and set the fields as they are
> encountered.
> A new PDF will be streamed to disk with the populated form fields.
> 
> I would also like to be able to read form fields from a PDF without loading
> it into memory.
> I would like to stream in the PDF and read the fields as they are
> encountered.
> 
> I've messed around with the PDFStreamParser but I haven't figured out
> how to locate form fields.
> 
> If anyone can give me any guidance or examples of how to do this, that would
> help a lot.
> 
> Any help is appreciated.
> 
> Thanks,
> T

