jackrabbit-users mailing list archives

From "David Moss" <mos...@googlemail.com>
Subject PdfTextFilter throws IOException on certain PDF documents
Date Wed, 20 Dec 2006 14:19:38 GMT
I'm trying to add the document iBATIS-SqlMaps-2_en.pdf to my repository, but
I think indexing the document fails.  Searching for words within the
document does not return it as a result, and my logs show the following
error.

exception initializing reader
org.apache.jackrabbit.core.query.PdfTextFilter$1: java.io.IOException:
Error: Expected hex number, actual=' 0'
java.lang.Throwable: Warning: You did not close the PDF Document
at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)

I've inserted other PDFs without any problem, but this one seems to be
different.  I believe the error is generated in the background when
session.save() is called. Any ideas what's going wrong?

The file is attached.
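
For reference, the second entry in the log above ("You did not close the PDF
Document") is printed from COSDocument.finalize(), which suggests the document
object was garbage-collected without close() being called. That would be
consistent with the parse failing before any cleanup ran, though I have not
confirmed this in the PdfTextFilter source. Below is a minimal sketch of the
close-in-finally pattern that avoids that kind of leak. FakeDocument and
extractOrEmpty are hypothetical stand-ins made up for illustration; they are
not PDFBox or Jackrabbit classes.

```java
import java.io.Closeable;
import java.io.IOException;

final class CloseOnFailure {

    /** Hypothetical stand-in for a parsed PDF document; NOT a PDFBox class. */
    static final class FakeDocument implements Closeable {
        private final boolean corrupt;
        private boolean closed;

        FakeDocument(boolean corrupt) {
            this.corrupt = corrupt;
        }

        /** Simulates text extraction that fails on a malformed file. */
        String getText() throws IOException {
            if (corrupt) {
                throw new IOException("Error: Expected hex number, actual=' 0'");
            }
            return "extracted text";
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /**
     * Returns the document text, or an empty string if extraction fails.
     * The finally block closes the document on both the success path
     * and the failure path.
     */
    static String extractOrEmpty(FakeDocument doc) {
        try {
            return doc.getText();
        } catch (IOException e) {
            System.err.println("text extraction failed: " + e.getMessage());
            return "";
        } finally {
            doc.close();
        }
    }

    public static void main(String[] args) {
        FakeDocument bad = new FakeDocument(true);
        String text = extractOrEmpty(bad);
        System.out.println("text='" + text + "', closed=" + bad.isClosed());
    }
}
```

With this pattern a malformed file would still not be indexed, but the
document would be released right away instead of waiting for the finalizer.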
