jackrabbit-users mailing list archives

From Ross.Dy...@ipaustralia.gov.au
Subject Something funny in the text extraction of PDFs [SEC=UNCLASSIFIED]
Date Tue, 19 Oct 2010 05:11:16 GMT

I am in the middle of loading many, many documents into my Jackrabbit 2.1 
instance, and production support have complained to me about constant 
warnings being generated in the logs:

org.apache.pdfbox.cos.COSDocument - java.lang.ClassCastException
Oct 19 13:13:40 localhost java.lang.ClassCastException

It might be producing a number of these warnings per PDF document, because 
they are popping out at about 10/second.

I have 2 concurrent processes running, both adding documents.  There is a 
mixture of text-based documents and image-only PDFs.

Any insight?  Should I just advise production support to tune the log4j 

