manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1125) All .7z files cause class not found exception in TikaExtractor
Date Thu, 18 Dec 2014 08:26:14 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251370#comment-14251370 ]

Karl Wright commented on CONNECTORS-1125:
-----------------------------------------

This looks like a Tika bug.  I'm waiting for an example .7z file to be attached so that I can
confirm it and, if so, open a Tika ticket.  Alternatively, we may be missing a jar that Tika needs.
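
One way to test the missing-jar theory is a small classpath probe, sketched below.  "Could not
initialize class" normally means an earlier static-initializer failure for that class, so the first
occurrence in the log should name the real cause.  The class name ClasspathCheck and the method
isLoadable are made up for this illustration, and the org.tukaani.xz entry is only an assumption
about which optional dependency the 7z support might need; it is not confirmed by the trace.

```java
// Hypothetical helper, not part of ManifoldCF or Tika.
final class ClasspathCheck {

    /**
     * Returns true if the named class can be located by this class's loader.
     * The class is loaded but NOT initialized, so a failing static
     * initializer in the target class is not triggered by the check.
     */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            // Taken from the stack trace in this issue.
            "org.apache.commons.compress.archivers.sevenz.SevenZFile",
            // Assumption: XZ for Java may be the jar that is missing.
            "org.tukaani.xz.LZMAInputStream"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

To be meaningful, this has to run with the same classpath as the agents process; a class reported
as MISSING would point at a packaging problem rather than a Tika bug.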


> All .7z files cause class not found exception in TikaExtractor
> --------------------------------------------------------------
>
>                 Key: CONNECTORS-1125
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1125
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Tika extractor
>    Affects Versions: ManifoldCF 1.7.2
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.9, ManifoldCF 2.1
>
>
> The exception is:
> {code}
> FATAL 2014-12-16 11:12:58,496 (Worker thread '47') - Error tossed: Could not initialize class org.apache.commons.compress.archivers.sevenz.Coders
> java.lang.NoClassDefFoundError: Could not initialize class org.apache.commons.compress.archivers.sevenz.Coders
>         at org.apache.commons.compress.archivers.sevenz.SevenZFile.readEncodedHeader(SevenZFile.java:279)
>         at org.apache.commons.compress.archivers.sevenz.SevenZFile.readHeaders(SevenZFile.java:191)
>         at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:95)
>         at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:117)
>         at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:130)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
>         at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:230)
>         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3257)
>         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3108)
>         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2739)
>         at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:792)
>         at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1610)
>         at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1558)
>         at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:911)
>         at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:383)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
