jackrabbit-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (JCR-764) PdfTextFilter may leave parsed document open in case of errors
Date Sun, 25 Feb 2007 13:46:05 GMT

     [ https://issues.apache.org/jira/browse/JCR-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved JCR-764.

    Resolution: Fixed

Patch applied to svn trunk in revision 511509 with some additional comments.

The new PdfTextExtractor class already covered this case, but would have failed in case an
IOException had been thrown by the cleanup process. In revision 511510 I added a catch for
such cleanup exceptions.

Thanks for the background work on this! To me it seems like this is really a PDFBox bug in
that it fails to do proper cleanup in case an exception gets thrown. I'll see if I can formulate
a good bug report and a patch for PDFBox to avoid such workarounds in Jackrabbit.

> PdfTextFilter may leave parsed document open in case of errors
> --------------------------------------------------------------
>                 Key: JCR-764
>                 URL: https://issues.apache.org/jira/browse/JCR-764
>             Project: Jackrabbit
>          Issue Type: Bug
>          Components: indexing
>    Affects Versions: 1.0, 1.0.1, 1.1, 1.1.1, 1.2.1, 1.2.2
>            Reporter: fabrizio giustina
>         Assigned To: Jukka Zitting
>            Priority: Minor
>             Fix For: 1.2.3
>         Attachments: textfilter_close.diff
> In case of errors in a parsed PDF document, Jackrabbit may fail to properly close the
> parsed document. PDFBox will write a stack trace to system out at finalize to warn against
> this. This is the resulting log:
> WARN org.apache.jackrabbit.core.query.LazyReader LazyReader.java(read:82) 20.02.2007
> 15:42:50 exception initializing reader org.apache.jackrabbit.core.query.PdfTextFilter$1: java.io.IOException:
> Error: Expected hex number, actual=' 2'
> java.lang.Throwable: Warning: You did not close the PDF Document
>    at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
>    at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
>    at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
>    at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
>    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)
> This may happen because the parse() method, called at
> parser = new PDFParser(new BufferedInputStream(in));
> parser.parse();
> immediately creates a document, but it can throw an exception while processing the file.
> PdfTextFilter should check if parser still holds a document and close it appropriately.
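
The cleanup pattern described above (close whatever document the parser holds,
even when parse() fails, and do not let a cleanup IOException mask the original
error) might look roughly like the sketch below. This is not the code committed
in revisions 511509/511510. FakeParser and FakeDocument are hypothetical
stand-ins for the PDFBox parser and document classes, used only so that the
example runs without PDFBox on the classpath, and extractText is an
illustrative name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class PdfCleanupSketch {

    /** Hypothetical stand-in for the document object created by the parser. */
    static class FakeDocument {
        boolean closed = false;

        void close() throws IOException {
            closed = true;
        }
    }

    /** Hypothetical stand-in for a parser that creates its document before parsing can fail. */
    static class FakeParser {
        private final InputStream in;
        private FakeDocument document;

        FakeParser(InputStream in) {
            this.in = in;
        }

        void parse() throws IOException {
            document = new FakeDocument(); // the document exists from this point on
            if (in.read() != '%') {        // simulate a parse error on malformed input
                throw new IOException("Error: malformed input");
            }
        }

        FakeDocument getDocument() {
            return document;
        }
    }

    /** Parses the input and always closes the document the parser may hold. */
    static String extractText(FakeParser parser) throws IOException {
        try {
            parser.parse();
            return "extracted text"; // placeholder for real text extraction
        } finally {
            FakeDocument document = parser.getDocument();
            if (document != null) {
                try {
                    document.close();
                } catch (IOException e) {
                    // Ignore cleanup failures so they do not hide a parse exception.
                }
            }
        }
    }

    public static void main(String[] args) {
        FakeParser parser = new FakeParser(new ByteArrayInputStream("not a pdf".getBytes()));
        try {
            extractText(parser);
        } catch (IOException e) {
            System.out.println("parse failed: " + e.getMessage());
        }
        System.out.println("document closed: " + parser.getDocument().closed);
    }
}
```

Running main with the malformed input reports the parse failure and shows that
the document was still closed. With the real PDFBox classes the same structure
would apply, but the exact accessor and close calls should be checked against
the PDFBox version in use.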

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
