pdfbox-users mailing list archives

From Brent Pathakis <bpatha...@utah.gov>
Subject Re: Problem loading large pdf files
Date Wed, 30 Oct 2013 18:02:18 GMT
I tried this:

RandomAccess scratchFile=null;


if (tmpFile.exists()) {
    tmpFile.delete();
    scratchFile = new RandomAccessFile(tmpFile, "rw");
} else {
    scratchFile = new RandomAccessFile(tmpFile, "rw");
}

But eclipse tells me:

Type mismatch: cannot convert from RandomAccessFile to RandomAccess

If I try this:

if (tmpFile.exists()) {
    tmpFile.delete();
    RandomAccess scratchFile = (RandomAccess) new RandomAccessFile(tmpFile, "rw");
} else {
    RandomAccess scratchFile = (RandomAccess) new RandomAccessFile(tmpFile, "rw");
}


Then I get this error at run time:

java.lang.ClassCastException: java.io.RandomAccessFile cannot be cast to org.apache.pdfbox.io.RandomAccess
    at PDFRedact.main(PDFRedact.java:34)
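
Both errors are consistent with a name clash: the class being constructed is java.io.RandomAccessFile, which does not implement org.apache.pdfbox.io.RandomAccess. The cast compiles only because java.io.RandomAccessFile is not final, so the failure is deferred to run time. PDFBox 1.x has its own org.apache.pdfbox.io.RandomAccessFile, and importing that class instead of the java.io one is probably what the quoted snippet further down relies on. The sketch below reproduces the situation without PDFBox on the classpath. The nested RandomAccess interface and the FileBackedRandomAccess class are hypothetical stand-ins written for illustration, not PDFBox classes.

```java
import java.io.File;
import java.io.IOException;

public class ScratchFileCastDemo {

    /** Hypothetical stand-in for org.apache.pdfbox.io.RandomAccess, reduced to one method. */
    interface RandomAccess {
        long length() throws IOException;
    }

    /** Hypothetical stand-in for a class that actually implements the interface. */
    static final class FileBackedRandomAccess implements RandomAccess {
        private final java.io.RandomAccessFile delegate;

        FileBackedRandomAccess(File file, String mode) throws IOException {
            this.delegate = new java.io.RandomAccessFile(file, mode);
        }

        @Override
        public long length() throws IOException {
            return delegate.length();
        }

        void close() throws IOException {
            delegate.close();
        }
    }

    /** Returns true if casting a java.io.RandomAccessFile to the interface throws. */
    static boolean jdkClassCastFails(File file) throws IOException {
        java.io.RandomAccessFile raw = new java.io.RandomAccessFile(file, "rw");
        try {
            // Compiles because java.io.RandomAccessFile is not final,
            // but the object does not implement the interface.
            RandomAccess ignored = (RandomAccess) raw;
            return false;
        } catch (ClassCastException e) {
            return true;
        } finally {
            raw.close();
        }
    }

    /** Returns the length seen through the interface, using the implementing class. */
    static long lengthViaInterface(File file) throws IOException {
        FileBackedRandomAccess impl = new FileBackedRandomAccess(file, "rw");
        try {
            RandomAccess scratch = impl; // no cast needed: the type really implements it
            return scratch.length();
        } finally {
            impl.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("scratch", ".tmp");
        tmp.deleteOnExit();
        System.out.println("cast of java.io.RandomAccessFile fails: " + jdkClassCastFails(tmp));
        System.out.println("length via implementing class: " + lengthViaInterface(tmp));
    }
}
```

In the real program the equivalent change would be to replace the java.io.RandomAccessFile import with org.apache.pdfbox.io.RandomAccessFile (or use the fully qualified name), assuming the PDFBox version in use provides that class.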


*Brent Pathakis*
801 536 0041


On Wed, Oct 30, 2013 at 10:55 AM, Gilad Denneboom <gilad.denneboom@gmail.com> wrote:

> I used this code in one occasion:
>
>         String tmpFilePath =
> System.getProperty("java.io.tmpdir")+File.separator+"scratch.tmp";
>         File tmpFile = new File(tmpFilePath);
>         if (tmpFile.exists())
>             tmpFile.delete();
>         RandomAccess scratchFile = new RandomAccessFile(tmpFile, "rw");
>
>         PDDocument doc = PDDocument.load( filePath, scratchFile );
>
>
>
> On Wed, Oct 30, 2013 at 5:31 PM, Brent Pathakis <bpathakis@utah.gov>
> wrote:
>
> > Thanks. Do you have an example of code using the scratch file?
> > On Oct 30, 2013 9:30 AM, "Gilad Denneboom" <gilad.denneboom@gmail.com>
> > wrote:
> >
> > > Try using a scratch file in the load method of PDDocument.
> > >
> > >
> > > On Wed, Oct 30, 2013 at 3:48 PM, Brent Pathakis <bpathakis@utah.gov>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > >   I'm trying to use PDFbox to load a large pdf document (>1gb):
> > > >     File inputPdf = new File("c:\\some.pdf");
> > > >     PDFTextStripper stop = new PDFTextStripper();
> > > >
> > > >     FileInputStream fis = null;
> > > >     fis = new FileInputStream(inputPdf);
> > > >     pd = PDDocument.load(fis, true);
> > > >
> > > >   This code works fine for smaller pdfs, but on larger ones I'm getting:
> > > >
> > > >   org.apache.pdfbox.exceptions.WrappedIOException
> > > > at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:245)
> > > > at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1192)
> > > > at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1159)
> > > > at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1130)
> > > > at PDFRedact.main(PDFRedact.java:19)
> > > > Caused by: java.lang.IndexOutOfBoundsException: Index: 15625, Size: 15625
> > > > at java.util.ArrayList.RangeCheck(Unknown Source)
> > > > at java.util.ArrayList.get(Unknown Source)
> > > > at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
> > > > at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
> > > > at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
> > > > at java.io.BufferedOutputStream.flush(Unknown Source)
> > > > at java.io.FilterOutputStream.close(Unknown Source)
> > > > at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:610)
> > > > at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:568)
> > > > at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:188)
> > > > ... 4 more
> > > >
> > > >
> > > >    Any ideas or help would be appreciated.
> > > >
> > > > *Brent Pathakis*
> > > > 801 536 0041
> > > >
> > >
> >
>
