tika-dev mailing list archives

From "Priya Kujur (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TIKA-972) Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser
Date Thu, 09 Aug 2012 19:42:18 GMT
Priya Kujur created TIKA-972:
--------------------------------

             Summary: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser
                 Key: TIKA-972
                 URL: https://issues.apache.org/jira/browse/TIKA-972
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.9
         Environment: Core Java, Windows Server 2003
            Reporter: Priya Kujur


Tika throws a RuntimeException while extracting text from a PDF. The exception does not
occur when the Java code runs on Windows 7, but it does occur on Windows Server 2003.
This is strange: my development environment is Windows 7 and my production environment is
Windows Server 2003. Since Java is supposed to be platform independent, this issue is
driving me crazy.
Any help is much appreciated.
Please check the stack trace:
java.io.IOException:
        at org.apache.tika.parser.ParsingReader.read(ParsingReader.java:271)
        at java.io.BufferedReader.fill(Unknown Source)
        at java.io.BufferedReader.readLine(Unknown Source)
        at java.io.BufferedReader.readLine(Unknown Source)
        at com.servient.utilities.textmanipulation.ReaderUtil.readBuffer(ReaderUtil.java:39)
        at com.servient.mapi.metadata.factory.TikaMetaDataExport.processFile(TikaMetaDataExport.java:255)
        at com.servient.mapi.metadata.factory.BaseMetadataExport.process(BaseMetadataExport.java:37)
        at com.servient.mapi.wrapper.AttachmentWrapper.saveTextMetadataExtract(AttachmentWrapper.java:116)
        at com.servient.mapi.wrapper.AttachmentWrapper.process(AttachmentWrapper.java:40)
        at com.servient.mapi.wrapper.AttachmentWrapper.<init>(AttachmentWrapper.java:36)
        at com.servient.mapi.wrapper.MessageWrapper.writeCatalog(MessageWrapper.java:761)
        at com.servient.mapi.wrapper.MessageWrapper.writeCatalog(MessageWrapper.java:754)
        at com.servient.mapi.wrapper.MessageWrapper.process(MessageWrapper.java:804)
        at com.servient.mapi.MAPI.main(MAPI.java:190)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@ea0a39
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
        at org.apache.tika.parser.ParsingReader$ParsingTask.run(ParsingReader.java:232)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
        at java.util.TimSort.mergeHi(Unknown Source)
        at java.util.TimSort.mergeAt(Unknown Source)
        at java.util.TimSort.mergeCollapse(Unknown Source)
        at java.util.TimSort.sort(Unknown Source)
        at java.util.TimSort.sort(Unknown Source)
        at java.util.Arrays.sort(Unknown Source)
        at java.util.Collections.sort(Unknown Source)
        at org.apache.pdfbox.util.PDFTextStripper.writePage(PDFTextStripper.java:551)
        at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:443)
        at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
        at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:89)
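
The innermost cause in the trace above comes from java.util.TimSort, which Java 7 uses by default when sorting object arrays and which rejects comparators that break the Comparator contract. The following is only a sketch of how such a violation can arise and be detected. The FUZZY comparator, the obeysContract helper, and the sample values are hypothetical illustrations, not the PDFBox comparator. It is also only an assumption, not something this report states, that the two machines run different JRE versions; if they do, that rather than the operating system could explain the difference in behavior.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ComparatorContractSketch {

    // Hypothetical comparator: treats values closer than 1.0 as equal.
    // Tolerance-based "equality" like this is not transitive.
    static final Comparator<Double> FUZZY = (a, b) -> {
        if (Math.abs(a - b) < 1.0) {
            return 0;
        }
        return a < b ? -1 : 1;
    };

    // A comparator with a consistent total order, for contrast.
    static final Comparator<Double> STRICT = Double::compare;

    // Brute-force check of the Comparator contract over a small sample set.
    static <T> boolean obeysContract(Comparator<T> cmp, List<T> samples) {
        for (T a : samples) {
            for (T b : samples) {
                int ab = Integer.signum(cmp.compare(a, b));
                // Antisymmetry: sgn(compare(a, b)) == -sgn(compare(b, a))
                if (ab != -Integer.signum(cmp.compare(b, a))) {
                    return false;
                }
                for (T c : samples) {
                    int bc = Integer.signum(cmp.compare(b, c));
                    int ac = Integer.signum(cmp.compare(a, c));
                    // Transitivity: a > b and b > c implies a > c
                    if (ab > 0 && bc > 0 && ac <= 0) {
                        return false;
                    }
                    // Elements that compare equal must order identically against c
                    if (ab == 0 && ac != bc) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 0.0 ~ 0.6 and 0.6 ~ 1.2 under FUZZY, yet 0.0 < 1.2
        List<Double> samples = Arrays.asList(0.0, 0.6, 1.2);
        System.out.println("fuzzy obeys contract:  " + obeysContract(FUZZY, samples));
        System.out.println("strict obeys contract: " + obeysContract(STRICT, samples));
    }
}
```

Whether TimSort actually throws for a faulty comparator depends on the input data, so the sketch checks the contract directly instead of relying on the exception. As a possible stopgap on Java 7, the JVM system property -Djava.util.Arrays.useLegacyMergeSort=true restores the older merge sort, which does not perform this check; it does not fix the underlying comparator.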
