pdfbox-users mailing list archives

From Andreas Lehmkühler <andr...@lehmi.de>
Subject Re: Scratch files - too many files open
Date Wed, 03 Jun 2015 10:46:14 GMT
Hi,

> Jesse Long <jesse.long.za@gmail.com> wrote on 3 June 2015 at 08:45:
> 
> 
> On 02/06/2015 17:48, Andreas Lehmkuehler wrote:
> > Hi,
> >
> > On 02.06.2015 at 16:15, Jesse Long wrote:
> >> Hi All,
> >>
> >> Regarding PDFBOX-2301, and the use of scratch files: right now, each
> >> COSStream uses one or two scratch files.
> >>
> >> I recently ran into the problem on Linux where the max number of open
> >> files allowed to the JVM by the OS was reached because of this.
> >>
> >> Is there a plan around this?
> >>
> >> Is it maybe that my use case is not expected?
> > I'm aware of that. The refactoring is still in progress. I expect to
> > reduce the number of open files.
> >
> >> My use case is:
> >> Open PDDocument 1
> >> Open PDDocument 2
> >> for a few hundred times
> >>          import page 1 of PDDocument 1 into PDDocument 2 and overlay
> >>          some stuff on top.
> >> save PDDocument 2.
> >>
> >> I have written a patch to use one single java.io.RandomAccessFile as a
> >> scratch file per COSDocument, using pages in a doubly linked list to
> >> separate streams in the same file. Would you be interested in adding
> >> this to PDFBox?
> > Using only one file led to problems when creating PDFs from scratch.
> > It is possible to write to two COSStreams at the same time, which
> > corrupts the PDF.
> 
> Hi Andreas,
> 
> Do you mean at the same time, as in multiple threads, or single thread 
> writing a bit to this stream and then a bit to another stream back and 
> forth?
It's the second case. You can't add fonts and/or images to a page while
adding content to a content stream at the same time. You have to add them
before opening the stream, or you have to close the stream first.

> For the single thread use case, I have solved this in my patch.
> Actually, even multiple threads should be easy to support with
> synchronization. I'll work on some docs and submit and you can see if
> you like it.
At least it sounds interesting and I'm happy to look at it.


> Thanks,
> Jesse
Thanks for the offer

BR
Andreas
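
For readers following the thread, the single-scratch-file idea Jesse
describes above can be sketched roughly as follows. This is not his patch
and not PDFBox code: the class names PagedScratchFile and ScratchStream,
the 4096-byte page size, and the 8-byte prev/next page header are all
assumptions made for illustration. It only shows how fixed-size pages
chained in a doubly linked list let one thread write to two logical streams
back and forth through a single java.io.RandomAccessFile without the
streams overwriting each other.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical sketch: many logical streams sharing one scratch file. Not thread-safe. */
class PagedScratchFile implements AutoCloseable {
    static final int PAGE_SIZE = 4096;             // assumed page size, header included
    static final int HEADER = 8;                   // int prev page + int next page
    static final int PAYLOAD = PAGE_SIZE - HEADER; // data bytes per page
    static final int NONE = -1;                    // marks "no neighbouring page"

    private final RandomAccessFile file;           // the only open file handle
    private int pageCount = 0;                     // pages allocated so far

    PagedScratchFile(File f) throws IOException {
        file = new RandomAccessFile(f, "rw");
    }

    /** One logical stream: a chain of pages inside the shared file. */
    final class ScratchStream {
        private int head = NONE;
        private int tail = NONE;
        private int usedInTail = PAYLOAD;          // "full" so the first write allocates a page
        private long length = 0;

        void write(byte[] data) throws IOException {
            int off = 0;
            while (off < data.length) {
                if (usedInTail == PAYLOAD) {
                    appendPage();
                }
                int n = Math.min(PAYLOAD - usedInTail, data.length - off);
                file.seek((long) tail * PAGE_SIZE + HEADER + usedInTail);
                file.write(data, off, n);
                usedInTail += n;
                off += n;
                length += n;
            }
        }

        /** Allocates a fresh page at the end of the file and links it after the tail. */
        private void appendPage() throws IOException {
            int page = pageCount++;
            file.seek((long) page * PAGE_SIZE);
            file.writeInt(tail);                   // prev pointer of the new page
            file.writeInt(NONE);                   // next pointer of the new page
            if (tail == NONE) {
                head = page;
            } else {
                file.seek((long) tail * PAGE_SIZE + 4);
                file.writeInt(page);               // old tail's next pointer
            }
            tail = page;
            usedInTail = 0;
        }

        /** Walks the chain from the head via the next pointers and returns all bytes. */
        byte[] readAll() throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[PAYLOAD];
            long remaining = length;
            int page = head;
            while (page != NONE && remaining > 0) {
                file.seek((long) page * PAGE_SIZE + 4);
                int next = file.readInt();         // pointer is now at the payload start
                int n = (int) Math.min(PAYLOAD, remaining);
                file.readFully(buf, 0, n);
                out.write(buf, 0, n);
                remaining -= n;
                page = next;
            }
            return out.toByteArray();
        }
    }

    ScratchStream newStream() {
        return new ScratchStream();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Because every write seeks to its own stream's tail page first, a caller can
alternate between streams freely. The file holds one descriptor no matter
how many streams exist, which is the property relevant to the open-files
limit. Support for multiple threads, as mentioned in the thread, would need
synchronization around the seek-and-write pairs; this sketch omits it.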


