pdfbox-users mailing list archives

From Jesse Long <jesse.long...@gmail.com>
Subject Re: Scratch files - too many files open
Date Wed, 03 Jun 2015 11:20:52 GMT
On 03/06/2015 12:46, Andreas Lehmkühler wrote:
> Hi,
>
>> Jesse Long <jesse.long.za@gmail.com> wrote on 3 June 2015 at 08:45:
>>
>>
>> On 02/06/2015 17:48, Andreas Lehmkuehler wrote:
>>> Hi,
>>>
>>> On 02.06.2015 at 16:15, Jesse Long wrote:
>>>> Hi All,
>>>>
>>>> Regarding PDFBOX-2301, and the use of scratch files: right now, each
>>>> COSStream uses one or two scratch files.
>>>>
>>>> I recently ran into the problem on Linux where the max number of open
>>>> files allowed to the JVM by the OS was reached because of this.
>>>>
>>>> Is there a plan around this?
>>>>
>>>> Is it maybe that my use case is not expected?
>>> I'm aware of that. The refactoring is still in progress. I expect to
>>> reduce the number of open files.
>>>
>>>> My use case is:
>>>> Open PDDocument 1
>>>> Open PDDocument 2
>>>> for a few hundred times
>>>>           import page 1 of PDDocument 1 into PDDocument 2 and overlay
>>>>           some stuff on top.
>>>> save PDDocument 2.
>>>>
>>>> I have written a patch to use one single java.io.RandomAccessFile as
>>>> a scratch file per COSDocument, using pages in a doubly linked list to
>>>> separate streams in the same file. Would you be interested in adding
>>>> this to PDFBox?
>>> Using only one file led to problems when creating PDFs from scratch.
>>> It is possible to write to two COSStreams at the same time, which
>>> corrupts the PDF.
>> Hi Andreas,
>>
>> Do you mean at the same time, as in multiple threads, or a single thread
>> writing a bit to this stream and then a bit to another stream, back and
>> forth?
> It's about the second case. You can't add fonts and/or images to a page while
> adding content to a content stream at the same time. You have to add those
> before opening a stream, or you have to close the stream before adding them.
>
>> For the single-thread use case, I have solved this in my patch.
>> Actually, even multiple threads should be easy to support with
>> synchronization. I'll work on some docs and submit it, and you can see if
>> you like it.
> At least it sounds interesting and I'm happy to look at it.
>

Please see patch attached.

Thanks,
Jesse
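
The single-scratch-file idea described in this thread can be sketched roughly
as follows. This is not the patch itself and not PDFBox code: the names
PagedScratchFile, Stream, Page, PAGE_SIZE and demo are hypothetical, the page
links are kept in memory rather than in the file, and the page size is tiny
only so that the example spans several pages. It assumes that each logical
stream owns its own chain of fixed-size pages inside one
java.io.RandomAccessFile, so a single thread can write a bit to one stream and
then a bit to another without the data mixing.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: many logical streams sharing one scratch file. */
public class PagedScratchFile implements Closeable {
    static final int PAGE_SIZE = 16; // assumption: tiny, so the demo spans pages

    /** One fixed-size page; the pages of a stream form a doubly linked list. */
    static final class Page {
        final long offset; // where this page starts in the shared file
        int used;          // bytes written into this page so far
        Page prev, next;   // prev would allow unlinking/recycling; unused here
        Page(long offset) { this.offset = offset; }
    }

    /** A logical stream: the head and tail of its own page list. */
    public final class Stream {
        private Page head, tail;

        public void write(byte[] data) throws IOException {
            synchronized (file) { // one lock per scratch file
                int pos = 0;
                while (pos < data.length) {
                    if (tail == null || tail.used == PAGE_SIZE) {
                        appendPage();
                    }
                    int n = Math.min(PAGE_SIZE - tail.used, data.length - pos);
                    file.seek(tail.offset + tail.used);
                    file.write(data, pos, n);
                    tail.used += n;
                    pos += n;
                }
            }
        }

        public byte[] readAll() throws IOException {
            synchronized (file) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[PAGE_SIZE];
                for (Page p = head; p != null; p = p.next) {
                    file.seek(p.offset);
                    file.readFully(buf, 0, p.used);
                    out.write(buf, 0, p.used);
                }
                return out.toByteArray();
            }
        }

        /** Claims the next free page of the file and links it at the tail. */
        private void appendPage() {
            Page p = new Page(nextFreeOffset);
            nextFreeOffset += PAGE_SIZE;
            p.prev = tail;
            if (tail == null) { head = p; } else { tail.next = p; }
            tail = p;
        }
    }

    private final File path;
    private final RandomAccessFile file;
    private long nextFreeOffset = 0;

    public PagedScratchFile() throws IOException {
        path = File.createTempFile("scratch", ".tmp");
        file = new RandomAccessFile(path, "rw");
    }

    public Stream newStream() { return new Stream(); }

    @Override
    public void close() throws IOException {
        file.close();
        path.delete();
    }

    /** Interleaves writes to two streams and returns what each reads back. */
    public static String[] demo() throws IOException {
        try (PagedScratchFile scratch = new PagedScratchFile()) {
            Stream a = scratch.newStream();
            Stream b = scratch.newStream();
            a.write("stream A, part 1; ".getBytes(StandardCharsets.US_ASCII));
            b.write("stream B, part 1; ".getBytes(StandardCharsets.US_ASCII));
            a.write("stream A, part 2".getBytes(StandardCharsets.US_ASCII));
            b.write("stream B, part 2".getBytes(StandardCharsets.US_ASCII));
            return new String[] {
                new String(a.readAll(), StandardCharsets.US_ASCII),
                new String(b.readAll(), StandardCharsets.US_ASCII)
            };
        }
    }
}
```

Calling PagedScratchFile.demo() opens exactly one file regardless of how many
streams are created, which is the point with respect to the open-files limit.
The synchronized blocks are the simplest possible take on the multi-thread
case mentioned above; a real implementation would also need to recycle the
pages of closed streams.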
