Message-ID: <556EE314.1010204@gmail.com>
Date: Wed, 03 Jun 2015 13:20:52 +0200
From: Jesse Long
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: users@pdfbox.apache.org
Reply-To: users@pdfbox.apache.org
Subject: Re: Scratch files - too many files open
References: <556DBA6F.103@gmail.com> <556DD05D.1000202@lehmi.de> <556EA29A.5030403@gmail.com> <1591940318.908920.1433328374439.JavaMail.open-xchange@ptangptang.store>
In-Reply-To: <1591940318.908920.1433328374439.JavaMail.open-xchange@ptangptang.store>
Content-Type: multipart/mixed; boundary="------------060501090907020201050108"

--------------060501090907020201050108
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

On 03/06/2015 12:46, Andreas Lehmkühler wrote:
> Hi,
>
>> Jesse Long wrote on 3 June 2015 at 08:45:
>>
>> On 02/06/2015 17:48, Andreas Lehmkuehler wrote:
>>> Hi,
>>>
>>> On 02.06.2015 at 16:15, Jesse Long wrote:
>>>> Hi All,
>>>>
>>>> Regarding PDFBOX-2301, and the use of scratch files: right now, each
>>>> COSStream uses one or two scratch files.
>>>>
>>>> I recently ran into the problem on Linux where the max number of open
>>>> files allowed to the JVM by the OS was reached because of this.
>>>>
>>>> Is there a plan around this?
>>>>
>>>> Is it maybe that my use case is not expected?
>>> I'm aware of that. The refactoring is still in progress. I expect to
>>> reduce the number of open files.
>>>
>>>> My use case is:
>>>> Open PDDocument 1
>>>> Open PDDocument 2
>>>> for a few hundred times
>>>>     import page 1 of PDDocument 1 into PDDocument 2 and overlay
>>>>     some stuff on top.
>>>> save PDDocument 2.
>>>>
>>>> I have written a patch to use one single java.io.RandomAccessFile as
>>>> a scratch file per COSDocument, using pages in a doubly linked list
>>>> to separate streams in the same file. Would you be interested in
>>>> adding this to PDFBox?
>>> Using only one file led to problems when creating PDFs from scratch.
>>> It is possible to write to 2 COSStreams at the same time, which
>>> corrupts the PDF.
>> Hi Andreas,
>>
>> Do you mean at the same time, as in multiple threads, or a single thread
>> writing a bit to this stream and then a bit to another stream back and
>> forth?
> It's about the second case. You can't add fonts and/or images to a page while
> adding content to a content stream at the same time. You have to add those before
> opening a stream or you have to close the stream before
>
>> For the single thread use case, I have solved this in my patch.
>> Actually, even multiple threads should be easy to support with
>> synchronization. I'll work on some docs and submit and you can see if
>> you like it.
> At least it sounds interesting and I'm happy to look at it.
>

Please see patch attached.
Thanks, Jesse --------------060501090907020201050108 Content-Type: text/x-patch; name="pdfbox-scratchfile.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="pdfbox-scratchfile.patch" diff --git a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSDocument.java b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSDocument.java index 2317ee1..a1048e0 100644 --- a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSDocument.java +++ b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSDocument.java @@ -25,6 +25,7 @@ import java.util.List; import java.util.Map; import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; +import org.apache.pdfbox.io.ScratchFile; import org.apache.pdfbox.pdfparser.PDFObjectStreamParser; /** @@ -74,10 +75,8 @@ public class COSDocument extends COSBase implements Closeable private boolean closed = false; private boolean isXRefStream; - - private final File scratchDirectory; - - private final boolean useScratchFile; + + private ScratchFile scratchFile; /** * Constructor. 
@@ -102,8 +101,14 @@ public class COSDocument extends COSBase implements Closeable */ public COSDocument(File scratchDir, boolean useScratchFiles) { - scratchDirectory = scratchDir; - useScratchFile = useScratchFiles; + if (useScratchFiles) + { + try { + scratchFile = new ScratchFile(scratchDir); + }catch (IOException e){ + LOG.error("Can't create temp file, using memory buffer instead", e); + } + } } /** @@ -121,7 +126,7 @@ public class COSDocument extends COSBase implements Closeable */ public COSStream createCOSStream() { - return new COSStream( useScratchFile, scratchDirectory); + return new COSStream(scratchFile); } /** @@ -133,7 +138,7 @@ public class COSDocument extends COSBase implements Closeable */ public COSStream createCOSStream(COSDictionary dictionary) { - return new COSStream( dictionary, useScratchFile, scratchDirectory ); + return new COSStream( dictionary, scratchFile ); } /** @@ -424,6 +429,11 @@ public class COSDocument extends COSBase implements Closeable } } } + + if (scratchFile != null){ + scratchFile.close(); + } + closed = true; } } diff --git a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java index a5d6b46..7f73329 100644 --- a/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java +++ b/pdfbox/src/main/java/org/apache/pdfbox/cos/COSStream.java @@ -21,7 +21,6 @@ import java.io.BufferedOutputStream; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; import java.io.Closeable; -import java.io.File; import java.io.IOException; import java.io.InputStream; import java.io.OutputStream; @@ -34,9 +33,9 @@ import org.apache.pdfbox.filter.FilterFactory; import org.apache.pdfbox.io.IOUtils; import org.apache.pdfbox.io.RandomAccess; import org.apache.pdfbox.io.RandomAccessBuffer; -import org.apache.pdfbox.io.RandomAccessFile; import org.apache.pdfbox.io.RandomAccessFileInputStream; import org.apache.pdfbox.io.RandomAccessFileOutputStream; +import 
org.apache.pdfbox.io.ScratchFile; /** * This class represents a stream object in a PDF document. @@ -70,11 +69,7 @@ public class COSStream extends COSDictionary implements Closeable private RandomAccessFileOutputStream unFilteredStream; private DecodeResult decodeResult; - private File scratchFileFiltered; - private File scratchFileUnfiltered; - - private final boolean scratchFiles; - private final File scratchFileDirectory; + private final ScratchFile scratchFile; /** * Constructor. Creates a new stream with an empty dictionary. @@ -82,7 +77,7 @@ public class COSStream extends COSDictionary implements Closeable */ public COSStream( ) { - this(false, null); + this((ScratchFile)null); } /** @@ -93,43 +88,39 @@ public class COSStream extends COSDictionary implements Closeable */ public COSStream( COSDictionary dictionary ) { - this(dictionary, false, null); + this(dictionary, null); } /** * Constructor. Creates a new stream with an empty dictionary. * - * @param useScratchFiles enables the usage of a scratch file if set to true - * @param scratchDirectory directory to be used to create the scratch file. If null java.io.temp is used instead. + * @param scratchFile scratch file to use. * */ - public COSStream( boolean useScratchFiles, File scratchDirectory ) + public COSStream( ScratchFile scratchFile ) { super(); - scratchFiles= useScratchFiles; - scratchFileDirectory = scratchDirectory; + this.scratchFile = scratchFile; } /** * Constructor. * * @param dictionary The dictionary that is associated with this stream. - * @param useScratchFiles enables the usage of a scratch file if set to true - * @param scratchDirectory directory to be used to create the scratch file. If null java.io.temp is used instead. + * @param scratchFile The scratch file to use. 
* */ - public COSStream( COSDictionary dictionary, boolean useScratchFiles, File scratchDirectory ) + public COSStream( COSDictionary dictionary, ScratchFile scratchFile ) { super( dictionary ); - scratchFiles= useScratchFiles; - scratchFileDirectory = scratchDirectory; + this.scratchFile = scratchFile; } - private RandomAccess createBuffer(boolean filtered) throws IOException + private RandomAccess createBuffer() throws IOException { - if (scratchFiles) + if (scratchFile != null) { - return createScratchFile(filtered); + return scratchFile.createBuffer(); } else { @@ -138,33 +129,6 @@ public class COSStream extends COSDictionary implements Closeable } /** - * Create a scratch file to be used as buffer. - */ - private RandomAccess createScratchFile(boolean filtered) - { - try - { - if (filtered) - { - deleteFile(scratchFileFiltered); - scratchFileFiltered = File.createTempFile("PDFBox_streamf_", null, scratchFileDirectory); - return new RandomAccessFile(scratchFileFiltered, "rw"); - } - else - { - deleteFile(scratchFileUnfiltered); - scratchFileUnfiltered = File.createTempFile("PDFBox_streamu_", null, scratchFileDirectory); - return new RandomAccessFile(scratchFileUnfiltered, "rw"); - } - } - catch (IOException exception) - { - LOG.error("Can't create temp file, using memory buffer instead", exception); - return new RandomAccessBuffer(); - } - } - - /** * This will get the stream with all of the filters applied. 
* * @return the bytes of the physical (encoded) stream @@ -374,7 +338,7 @@ public class COSStream extends COSDictionary implements Closeable { if (result == null) { - result = createBuffer(false); + result = createBuffer(); } } else @@ -396,7 +360,7 @@ public class COSStream extends COSDictionary implements Closeable IOUtils.closeQuietly(unFilteredStream); if (destBuffer == null) { - result = createBuffer(false); + result = createBuffer(); } else { @@ -468,7 +432,7 @@ public class COSStream extends COSDictionary implements Closeable IOUtils.closeQuietly(filteredStream); if (destBuffer == null) { - result = createBuffer(true); + result = createBuffer(); } else { @@ -599,7 +563,7 @@ public class COSStream extends COSDictionary implements Closeable { if (filteredBuffer == null) { - filteredBuffer = createBuffer(true); + filteredBuffer = createBuffer(); } else if (clear) { @@ -617,7 +581,7 @@ public class COSStream extends COSDictionary implements Closeable { if (unfilteredBuffer == null) { - unfilteredBuffer = createBuffer(false); + unfilteredBuffer = createBuffer(); } else if (clear) { @@ -647,18 +611,5 @@ public class COSStream extends COSDictionary implements Closeable { filteredBuffer.close(); } - deleteFile(scratchFileFiltered); - deleteFile(scratchFileUnfiltered); - } - - private void deleteFile(File file) throws IOException - { - if (file != null && file.exists()) - { - if (!file.delete()) - { - throw new IOException("Can't delete the temporary scratch file "+file.getAbsolutePath()); - } - } } } diff --git a/pdfbox/src/main/java/org/apache/pdfbox/io/ScratchFile.java b/pdfbox/src/main/java/org/apache/pdfbox/io/ScratchFile.java new file mode 100644 index 0000000..3434c85 --- /dev/null +++ b/pdfbox/src/main/java/org/apache/pdfbox/io/ScratchFile.java @@ -0,0 +1,104 @@ +package org.apache.pdfbox.io; + +import java.io.Closeable; +import java.io.File; +import java.io.IOException; +import org.apache.commons.logging.Log; +import org.apache.commons.logging.LogFactory; 
+import org.apache.pdfbox.cos.COSStream; + +/** + * A temporary file which can hold multiple buffers of temporary data. A new temporary file is created + * for each new {@link ScratchFile} instance, and is deleted when the {@link ScratchFile} is closed. + *

+ * Multiple buffers can be created by calling the {@link #createBuffer()} method. + *

+ * The file is split into pages, each page containing a pointer to the previous and next pages. This allows + * for multiple, separate streams in the same file. + * + * @author Jesse Long + */ +public class ScratchFile + implements Closeable +{ + private static final Log LOG = LogFactory.getLog(ScratchFile.class); + private File file; + private java.io.RandomAccessFile raf; + + /** + * Creates a new scratch file. If a {@code scratchFileDirectory} is supplied, then the scratch file is created + * in that directory. + * @param scratchFileDirectory The directory in which to create the scratch file, or {@code null} if the scratch file + * should be created in the default temporary directory. + * @throws IOException If there was a problem creating a temporary file. + */ + public ScratchFile(File scratchFileDirectory) + throws IOException + { + file = File.createTempFile("PDFBox", ".tmp", scratchFileDirectory); + try { + raf = new java.io.RandomAccessFile(file, "rw"); + }catch (IOException e){ + if (!file.delete()){ + LOG.warn("Error deleting scratch file: " + file.getAbsolutePath()); + } + throw e; + } + } + + /** + * Returns the underlying {@link java.io.RandomAccessFile}. + * @return The underlying {@link java.io.RandomAccessFile}. + */ + java.io.RandomAccessFile getRandomAccessFile() + { + return raf; + } + + /** + * Checks if this scratch file has already been closed. + * If the file has been closed, an {@link IOException} is thrown. + * @throws IOException If the file has already been closed. + */ + void checkClosed() + throws IOException + { + if (raf == null){ + throw new IOException("Scratch file already closed"); + } + } + + /** + * Creates a new buffer in the scratch file. + * @return A new buffer. + * @throws IOException If an error occurred. + */ + public RandomAccess createBuffer() + throws IOException + { + return new ScratchFileBuffer(this); + } + + /** + * Closes and deletes the temporary file. 
No further interaction with the scratch file or associated buffers + * can happen after this method is called. + * @throws IOException If there was a problem closing or deleting the temporary file. + */ + @Override + public void close() + throws IOException + { + if (raf != null){ + raf.close(); + raf = null; + } + + if (file != null){ + if (file.delete()){ + file = null; + }else{ + throw new IOException("Error deleting scratch file: " + file.getAbsolutePath()); + } + } + } +} diff --git a/pdfbox/src/main/java/org/apache/pdfbox/io/ScratchFileBuffer.java b/pdfbox/src/main/java/org/apache/pdfbox/io/ScratchFileBuffer.java new file mode 100644 index 0000000..c76c3f2 --- /dev/null +++ b/pdfbox/src/main/java/org/apache/pdfbox/io/ScratchFileBuffer.java @@ -0,0 +1,502 @@ +package org.apache.pdfbox.io; + +import java.io.EOFException; +import java.io.IOException; + +/** + * A {@link RandomAccess} implemented as a doubly linked list over multiple pages in a {@link java.io.RandomAccessFile}. + *

+ * Each page is {@link #PAGE_SIZE} bytes, with the first 8 bytes being a pointer to the page index ({@code pageOffset / PAGE_SIZE}) + * of the previous page in the buffer, and the last 8 bytes being a pointer to the page index of the next page in the buffer. + * @author Jesse Long + */ +class ScratchFileBuffer + implements RandomAccess +{ + /** + * The size of each page. + */ + private static final int PAGE_SIZE = 4096; + /** + * The underlying scratch file. + */ + private ScratchFile scratchFile; + /** + * The first page in this buffer. + */ + private final long firstPage; + /** + * The number of bytes of content in this buffer. + */ + private long length = 0; + /** + * The index of the page that contains the current position of this buffer. + */ + private long currentPage; + /** + * The current position of the buffer as an offset in the current page. + */ + private int positionInPage; + /** + * The current position in the space of the whole buffer. + */ + private long positionInBuffer; + + /** + * Creates a new buffer in the provided {@link ScratchFile}. + * @param scratchFile The {@link ScratchFile} in which to create the new buffer. + * @throws IOException If there was an error writing to the file. + */ + ScratchFileBuffer(ScratchFile scratchFile) + throws IOException + { + scratchFile.checkClosed(); + + this.scratchFile = scratchFile; + + java.io.RandomAccessFile raf = scratchFile.getRandomAccessFile(); + + /* + * We must allocate a new first page for each new buffer; otherwise buffers created at + * the same time would use the same space. + */ + firstPage = createNewPage(raf); + + /* + * Set the first page's back pointer to -1 to mark the start of the buffer. + */ + raf.seek(firstPage * PAGE_SIZE); + raf.writeLong(-1L); + + /* + * Reset variables to beginning of empty buffer. + */ + clear(); + } + + /** + * Checks whether this buffer or the underlying {@link ScratchFile} has been closed, throwing {@link IOException} if so. 
+ * @throws IOException If either this buffer or the underlying {@link ScratchFile} has been closed. + */ + private void checkClosed() + throws IOException + { + if (scratchFile == null){ + throw new IOException("Scratch file buffer already closed"); + } + + scratchFile.checkClosed(); + } + + /** + * {@inheritDoc} + */ + @Override + public long length() + throws IOException + { + checkClosed(); + return length; + } + + /** + * Allocates a new page, and links the current and the new page. + * @param raf The underlying {@link java.io.RandomAccessFile}. + * @throws IOException If there was an error writing to the file. + */ + private void growToNewPage(java.io.RandomAccessFile raf) + throws IOException + { + long newPage = createNewPage(raf); + + /* + * We should only grow to a new page when previous pages are full. If not, + * the links won't work. + */ + if (positionInPage != PAGE_SIZE - 8){ + throw new IOException("Corruption detected in scratch file"); + } + + seekToCurrentPositionInFile(raf); + + raf.writeLong(newPage); + + long previousPage = currentPage; + currentPage = newPage; + positionInPage = 0; + + /* + * Write the back link to the previous page. 
+ */ + seekToCurrentPositionInFile(raf); + raf.writeLong(previousPage); + positionInPage = 8; + } + + /** + * {@inheritDoc} + */ + @Override + public void write(int b) + throws IOException + { + checkClosed(); + + java.io.RandomAccessFile raf = scratchFile.getRandomAccessFile(); + + seekToCurrentPositionInFile(raf); + + if (positionInPage == PAGE_SIZE - 8){ + growToNewPage(raf); + } + + raf.write(b); + + positionInPage++; + positionInBuffer++; + if (positionInBuffer > length){ + length = positionInBuffer; + } + } + + /** + * {@inheritDoc} + */ + @Override + public void write(byte[] b, int off, int len) + throws IOException + { + checkClosed(); + + java.io.RandomAccessFile raf = scratchFile.getRandomAccessFile(); + + seekToCurrentPositionInFile(raf); + + while (len > 0){ + if (positionInPage == PAGE_SIZE - 8){ + growToNewPage(raf); + } + + int availableSpaceInCurrentPage = (PAGE_SIZE - 8) - positionInPage; + + int bytesToWrite = Math.min(len, availableSpaceInCurrentPage); + + raf.write(b, off, bytesToWrite); + + off += bytesToWrite; + len -= bytesToWrite; + positionInPage += bytesToWrite; + positionInBuffer += bytesToWrite; + if (positionInBuffer > length){ + length = positionInBuffer; + } + } + } + + /** + * {@inheritDoc} + */ + @Override + public final void clear() + throws IOException + { + checkClosed(); + length = 0; + currentPage = firstPage; + positionInBuffer = 0; + positionInPage = 8; + } + + /** + * {@inheritDoc} + */ + @Override + public long getPosition() + throws IOException + { + checkClosed(); + return positionInBuffer; + } + + /** + * {@inheritDoc} + */ + @Override + public void seek(long seekToPosition) + throws IOException + { + checkClosed(); + + /* + * Can't seek past end of file. To change this behaviour, seek to end of file and + * write zero bytes for the remaining seek distance. 
+ */ + if (seekToPosition > length){ + throw new EOFException(); + } + + java.io.RandomAccessFile raf = scratchFile.getRandomAccessFile(); + + if (seekToPosition < positionInBuffer){ + if (currentPage != firstPage && seekToPosition < (positionInBuffer / 2)){ + /* + * If we are seeking backwards, and the seek to position is closer to the beginning + * of the buffer than our current position, just go to the start of the buffer and seek + * forward from there. Recurse exactly once. + */ + currentPage = firstPage; + positionInPage = 8; + positionInBuffer = 0; + seek(seekToPosition); + }else{ + while (positionInBuffer - seekToPosition > positionInPage - 8){ + raf.seek(currentPage * PAGE_SIZE); + long previousPage = raf.readLong(); + currentPage = previousPage; + positionInBuffer -= (positionInPage - 8); + positionInPage = PAGE_SIZE - 8; + } + + positionInPage -= (positionInBuffer - seekToPosition); + positionInBuffer = seekToPosition; + } + }else{ + while (seekToPosition - positionInBuffer > (PAGE_SIZE - 8) - positionInPage){ + // seek to 8 bytes from end of current page, to read next page pointer. 
+ raf.seek(((currentPage + 1) * PAGE_SIZE) - 8); + long nextPage = raf.readLong(); + positionInBuffer += (PAGE_SIZE - 8) - positionInPage; + currentPage = nextPage; + positionInPage = 8; + } + + positionInPage += seekToPosition - positionInBuffer; + positionInBuffer = seekToPosition; + } + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isClosed() + { + return scratchFile == null; + } + + /** + * {@inheritDoc} + */ + @Override + public int peek() + throws IOException + { + int result = read(); + if (result != -1){ + rewind(1); + } + return result; + } + + /** + * {@inheritDoc} + */ + @Override + public void rewind(int bytes) + throws IOException + { + seek(positionInBuffer - bytes); + } + + /** + * {@inheritDoc} + */ + @Override + public byte[] readFully(int len) + throws IOException + { + byte[] b = new byte[len]; + + int n = 0; + do { + int count = read(b, n, len - n); + if (count < 0){ + throw new EOFException(); + } + n += count; + } while (n < len); + + return b; + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isEOF() + throws IOException + { + checkClosed(); + return positionInBuffer == length; + } + + /** + * {@inheritDoc} + */ + @Override + public int available() + throws IOException + { + checkClosed(); + return (int)Math.min(length - positionInBuffer, Integer.MAX_VALUE); + } + + private String getDebugDetails() + throws IOException + { + return "page=" + currentPage + ", pagePos=" + positionInPage + ", pos=" + positionInBuffer + ", len=" + length + ", raf=" + scratchFile.getRandomAccessFile().getFilePointer(); + } + + /** + * {@inheritDoc} + */ + @Override + public int read() + throws IOException + { + checkClosed(); + + if (positionInBuffer >= length){ + return -1; + } + + java.io.RandomAccessFile raf = scratchFile.getRandomAccessFile(); + + seekToCurrentPositionInFile(raf); + + if (positionInPage == PAGE_SIZE - 8){ + currentPage = raf.readLong(); + positionInPage = 8; + seekToCurrentPositionInFile(raf); + } + + int retv = 
raf.read(); + + if (retv >= 0){ + positionInPage++; + positionInBuffer++; + } + + return retv; + } + + /** + * {@inheritDoc} + */ + @Override + public int read(byte[] b) + throws IOException + { + return read(b, 0, b.length); + } + + /** + * {@inheritDoc} + */ + @Override + public int read(byte[] b, int off, int len) + throws IOException + { + checkClosed(); + + if (positionInBuffer >= length){ + return -1; + } + + java.io.RandomAccessFile raf = scratchFile.getRandomAccessFile(); + + seekToCurrentPositionInFile(raf); + + if (positionInPage == PAGE_SIZE - 8){ + currentPage = raf.readLong(); + positionInPage = 8; + seekToCurrentPositionInFile(raf); + } + + len = (int)Math.min(len, length - positionInBuffer); + + int totalBytesRead = 0; + + while (len > 0){ + int availableInThisPage = (PAGE_SIZE - 8) - positionInPage; + + int rdbytes = raf.read(b, off, Math.min(len, availableInThisPage)); + + if (rdbytes < 0){ + throw new IOException("EOF reached before end of scratch file stream"); + } + + if (rdbytes == availableInThisPage){ + currentPage = raf.readLong(); + positionInPage = 8; + seekToCurrentPositionInFile(raf); + }else{ + positionInPage += rdbytes; + } + + totalBytesRead += rdbytes; + positionInBuffer += rdbytes; + off += rdbytes; + len -= rdbytes; + } + + return totalBytesRead; + } + + /** + * {@inheritDoc} + */ + @Override + public void close() + throws IOException + { + scratchFile = null; + } + + /** + * Positions the underlying {@link java.io.RandomAccessFile} to the correct position for use by this buffer. + * @param raf The underlying {@link java.io.RandomAccessFile}. + * @throws IOException If there was a problem seeking in the {@link java.io.RandomAccessFile}. 
+ */ + private void seekToCurrentPositionInFile(java.io.RandomAccessFile raf) + throws IOException + { + long positionInFile = (currentPage * PAGE_SIZE) + positionInPage; + + if (raf.getFilePointer() != positionInFile){ + raf.seek(positionInFile); + } + } + + /** + * Allocates a new page in the temporary file by growing the file, returning the page index of the new page. + * @param raf The underlying {@link java.io.RandomAccessFile}. + * @return The index of the new page. + * @throws IOException If there was an error growing the file. + */ + private static long createNewPage(java.io.RandomAccessFile raf) + throws IOException + { + long fileLen = raf.length(); + + fileLen += PAGE_SIZE; + + if (fileLen % PAGE_SIZE > 0){ + fileLen += PAGE_SIZE - (fileLen % PAGE_SIZE); + } + + raf.setLength(fileLen); + + return (fileLen / PAGE_SIZE) - 1; + } +} --------------060501090907020201050108--
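
The paged layout described in the ScratchFileBuffer Javadoc (8-byte back pointer at the start of each page, 8-byte forward pointer at the end, payload in between) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code from the patch: a byte array stands in for the shared java.io.RandomAccessFile, PAGE_SIZE is shrunk from 4096 to 64 so page boundaries are hit quickly, and the names PagedScratchSketch, Buffer, newPage and readAll are hypothetical. It shows why a single thread can write to two streams back and forth without the data getting mixed up.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical in-memory model of the paged layout; not the patch's code. */
class PagedScratchSketch
{
    static final int PAGE_SIZE = 64;            // assumption: the patch uses 4096
    static final int DATA_END = PAGE_SIZE - 8;  // last 8 bytes hold the next-page index

    private byte[] file = new byte[0];          // stands in for the shared scratch file

    /** Grows the "file" by one page and returns the index of the new page. */
    long newPage()
    {
        file = Arrays.copyOf(file, file.length + PAGE_SIZE);
        return file.length / PAGE_SIZE - 1;
    }

    private void putLong(long offset, long value)
    {
        ByteBuffer.wrap(file).putLong((int) offset, value);
    }

    private long getLong(long offset)
    {
        return ByteBuffer.wrap(file).getLong((int) offset);
    }

    Buffer createBuffer()
    {
        return new Buffer();
    }

    /** One logical stream: a chain of linked pages inside the shared file. */
    class Buffer
    {
        final long firstPage = newPage();       // every buffer owns its first page
        long page = firstPage;
        int posInPage = 8;                      // payload starts after the back pointer
        long length = 0;

        Buffer()
        {
            putLong(firstPage * PAGE_SIZE, -1L); // back pointer -1 marks the first page
        }

        void write(byte[] data)
        {
            for (byte b : data)
            {
                if (posInPage == DATA_END)
                {
                    // Current page is full: allocate a page and link both ways.
                    long next = newPage();
                    putLong(page * PAGE_SIZE + DATA_END, next);
                    putLong(next * PAGE_SIZE, page);
                    page = next;
                    posInPage = 8;
                }
                file[(int) (page * PAGE_SIZE) + posInPage++] = b;
                length++;
            }
        }

        /** Follows the forward links from the first page and collects the payload. */
        byte[] readAll()
        {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            long p = firstPage;
            long remaining = length;
            while (remaining > 0)
            {
                int n = (int) Math.min(remaining, DATA_END - 8);
                out.write(file, (int) (p * PAGE_SIZE) + 8, n);
                remaining -= n;
                if (remaining > 0)
                {
                    p = getLong(p * PAGE_SIZE + DATA_END);
                }
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args)
    {
        PagedScratchSketch scratch = new PagedScratchSketch();
        Buffer a = scratch.createBuffer();
        Buffer b = scratch.createBuffer();
        byte[] chunk = new byte[30];
        // Interleave writes to the two buffers, as in the "back and forth" case.
        a.write(chunk); b.write(chunk); a.write(chunk); b.write(chunk);
        System.out.println(a.readAll().length + " " + b.readAll().length);
    }
}
```

Because each buffer tracks its own current page and offset, and a new page is always appended to the end of the file, the pages of the two buffers interleave in the file while each chain stays intact. The sketch leaves out seeking, overwriting and closing, which the patch handles.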