manifoldcf-user mailing list archives

From msaunier <msaun...@citya.com>
Subject Out of memory, one file bug i think
Date Tue, 24 Jul 2018 09:58:37 GMT
Hi again Karl,

 

I got an OutOfMemoryError today. I think it is caused by a single
document. This WARNING is logged just before the crash:

 

------------------------------------------------------------------------

 

WARN 2018-07-24T11:46:22,098 (Worker thread '1') - Tika: Tika exception extracting: TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@62980adb
org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@62980adb
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286) ~[tika-core-1.17.jar:1.17]
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[tika-core-1.17.jar:1.17]
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) ~[tika-core-1.17.jar:1.17]
        at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74) ~[mcf-tika-connector.jar:?]
        at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235) [mcf-tika-connector.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) [mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) [mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) [mcf-agents.jar:?]
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) [mcf-agents.jar:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) [mcf-pull-agent.jar:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) [mcf-pull-agent.jar:?]
        at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) [mcf-jcifs-connector.jar:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) [mcf-pull-agent.jar:?]

Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
        at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:150) ~[?:?]
        at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102) ~[?:?]
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203) ~[?:?]
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132) ~[?:?]
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
        ... 12 more

Caused by: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_171]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_171]
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_171]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_171]
        at org.apache.poi.poifs.crypt.EncryptionInfo.getBuilder(EncryptionInfo.java:222) ~[?:?]
        at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:148) ~[?:?]
        at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102) ~[?:?]
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203) ~[?:?]
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132) ~[?:?]
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
        ... 12 more
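
The innermost cause in the trace is a ClassNotFoundException for
org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder. A small
standalone probe, sketched below, could check whether that class is visible
on a given classpath. This is only a sketch: the class name ClassProbe and
the helper isLoadable are hypothetical names made up for illustration; only
the POI class name is taken from the log.

```java
public class ClassProbe {
    // Hypothetical helper: returns true if the named class can be located
    // by the class loader that loaded ClassProbe itself.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is not resolvable on this classpath
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; another name can be passed as an argument
        String name = args.length > 0
                ? args[0]
                : "org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

The result is only meaningful if the probe is run with the same classpath
as the ManifoldCF agents process; with any other classpath it says nothing
about the crawler itself.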

 

I think a single file is responsible, because the memory allocation behaves
strangely: within one second, ManifoldCF (or Tika) allocates an additional
6 GB of RAM.

How can I find the file?

 

Thanks,

Maxence

