jackrabbit-users mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: Something funny in the text extraction of PDFs [SEC=UNCLASSIFIED]
Date Tue, 19 Oct 2010 07:33:44 GMT
On Tue, Oct 19, 2010 at 7:11 AM,  <Ross.Dyson@ipaustralia.gov.au> wrote:
> Hello
>
> I am in the middle of loading many, many documents into my jackrabbit 2.1
> instance, and production support have complained to me about constant
> warnings being generated in the logs,
>
> org.apache.pdfbox.cos.COSDocument - java.lang.ClassCastException
> Oct 19 13:13:40 localhost java.lang.ClassCastException
>
> It might be producing a number of these warnings per PDF document, because
> they are popping out at about 10/second.
>
> I have 2 concurrent processes running adding documents.  There is a mixture
> of text-based documents and image only PDFs.
>
> Any insight?  Should I just advise production support to tune the log4j
> properties?

You would probably do better to (cross-)post this question to the PDFBox
list: Jackrabbit uses PDFBox for PDF text extraction, and the exception
itself points to PDFBox,

http://pdfbox.apache.org/userguide/cookbook.html
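
On the log4j question: if the warnings turn out to be harmless, a fragment
along these lines might quiet them in the meantime. It is only a sketch,
assuming a log4j 1.x properties file is in use; the logger name is taken
from the log line quoted above, and ERROR is an assumed threshold that
would hide WARN output from PDFBox only:

```properties
# Assumption: log4j 1.x properties syntax.
# Raise the threshold for the PDFBox loggers so that WARN messages such as
# the one from org.apache.pdfbox.cos.COSDocument are no longer written.
log4j.logger.org.apache.pdfbox=ERROR
```

Note that this only hides the messages; it does not address whatever in
the PDFs triggers the ClassCastException.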

Regards Ard

>
> Thanks
>
> Ross.
>
> --
> This message contains privileged and confidential information only
> for use by the intended recipient.  If you are not the intended
> recipient of this message, you must not disseminate, copy or use
> it in any manner.  If you have received this message in error,
> please advise the sender by reply e-mail.  Please ensure all
> e-mail attachments are scanned for viruses prior to opening or
> using.
>
>



-- 
Hippo
Europe  •  Amsterdam  Oosteinde 11  •  1017 WT Amsterdam  •  +31 (0)20 522 4466
USA  • San Francisco  185 H Street Suite B  •  Petaluma CA 94952-5100
•  +1 (707) 773 4646
Canada    •   Montréal  5369 Boulevard St-Laurent  •  Montréal QC H2T
1S5  •  +1 (514) 316 8966
www.onehippo.com  •  www.onehippo.org  •  info@onehippo.com
