manifoldcf-dev mailing list archives

From "Steph van Schalkwyk (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-1518) MCF shutting down when Tika is used
Date Thu, 26 Jul 2018 18:29:00 GMT
Steph van Schalkwyk created CONNECTORS-1518:
-----------------------------------------------

             Summary: MCF shutting down when Tika is used
                 Key: CONNECTORS-1518
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1518
             Project: ManifoldCF
          Issue Type: Bug
          Components: Tika extractor
    Affects Versions: ManifoldCF 2.10
         Environment: CentOS 7

Prior to crash:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        1.8G         12G         98M        1.1G         13G
Swap:          2.0G          0B        2.0G

After crash:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:            15G         10G        4.0G         98M        1.1G        4.4G
Swap:          2.0G          0B        2.0G

 

{{start-options.env.unix :}}
{{-Xss500m}}
{{-Xms1g}}
{{-Xmx8g}}
{{-Dorg.apache.manifoldcf.configfile=./properties.xml}}
{{-Dorg.apache.manifoldcf.jettyshutdowntoken=secret_token}}
{{-cp}}
{{.:./lib/mcf-core.jar:./lib/mcf-agents.jar:./lib/mcf-pull-agent.jar:./lib/mcf-ui-core.jar:./lib/mcf-jetty-runner.jar:./lib/jetty-continuation-9.2.3.v20140905.jar:./lib/jetty-http-9.2.3.v20140905.jar:./lib/jetty-io-9.2.3.v20140905.jar:./lib/jetty-jndi-9.2.3.v20140905.jar:./lib/jetty-jsp-jdt-2.3.3.jar:./lib/jetty-plus-9.2.3.v20140905.jar:./lib/jetty-schemas-3.1.M0.jar:./lib/jetty-security-9.2.3.v20140905.jar:./lib/jetty-server-9.2.3.v20140905.jar:./lib/jetty-servlet-9.2.3.v20140905.jar:./lib/jetty-util-9.2.3.v20140905.jar:./lib/jetty-webapp-9.2.3.v20140905.jar:./lib/jetty-xml-9.2.3.v20140905.jar:./lib/hsqldb-2.3.2.jar:./lib/postgresql-42.1.3.jar:./lib/commons-codec-1.10.jar:./lib/commons-collections-3.2.1.jar:./lib/commons-collections4-4.1.jar:./lib/commons-discovery-0.5.jar:./lib/commons-el-1.0.jar:./lib/commons-exec-1.3.jar:./lib/commons-fileupload-1.2.2.jar:./lib/commons-io-2.5.jar:./lib/commons-lang-2.6.jar:./lib/commons-lang3-3.6.jar:./lib/commons-logging-1.2.jar:./lib/ecj-4.3.1.jar:./lib/gson-2.8.0.jar:./lib/guava-21.0.jar:./lib/httpclient-4.5.3.jar:./lib/httpcore-4.4.6.jar:./lib/jasper-6.0.35.jar:./lib/jasper-el-6.0.35.jar:./lib/javax.servlet-api-3.1.0.jar:./lib/jna-4.1.0.jar:./lib/jna-platform-4.1.0.jar:./lib/json-simple-1.1.1.jar:./lib/jsp-api-2.1-glassfish-2.1.v20091210.jar:./lib/juli-6.0.35.jar:./lib/log4j-1.2-api-2.4.1.jar:./lib/log4j-api-2.4.1.jar:./lib/log4j-core-2.4.1.jar:./lib/mail-1.4.5.jar:./lib/serializer-2.7.1.jar:./lib/slf4j-api-1.7.24.jar:./lib/slf4j-simple-1.7.24.jar:./lib/velocity-1.7.jar:./lib/xalan-2.7.1.jar:./lib/xercesImpl-2.10.0.jar:./lib/xml-apis-1.4.01.jar:./lib/zookeeper-3.4.10.jar:}}
            Reporter: Steph van Schalkwyk


 

 

{{```}}
{{Jul 26, 2018 1:21:51 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem}}
{{WARNING: org.xerial's sqlite-jdbc is not loaded.}}
{{Please provide the jar on your classpath to parse sqlite files.}}
{{See tika-parsers/pom.xml for the correct version.}}
{{agents process ran out of memory - shutting down}}
{{java.lang.OutOfMemoryError: Java heap space}}
{{ at java.base/java.util.Arrays.copyOf(Arrays.java:3816)}}
{{ at java.base/java.util.BitSet.ensureCapacity(BitSet.java:338)}}
{{ at java.base/java.util.BitSet.expandTo(BitSet.java:353)}}
{{ at java.base/java.util.BitSet.set(BitSet.java:448)}}
{{ at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:267)}}
{{ at org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)}}
{{ at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)}}
{{ at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)}}
{{ at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)}}
{{ at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)}}
{{ at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)}}
{{ at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)}}
{{ at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)}}
{{ at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)}}
{{ at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)}}
{{ at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)}}
{{ at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)}}
{{ at org.apache.tika.parser.microsoft.TextCell.render(TextCell.java:34)}}
{{ at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processSheet(ExcelExtractor.java:609)}}
{{ at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord(ExcelExtractor.java:392)}}
{{ at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord(ExcelExtractor.java:343)}}
{{ at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord(FormatTrackingHSSFListener.java:92)}}
{{ at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord(HSSFRequest.java:109)}}
{{ at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:179)}}
{{ at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:136)}}
{{ at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile(ExcelExtractor.java:319)}}
{{ at org.apache.tika.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:170)}}
{{ at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:184)}}
{{ at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)}}
{{ at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)}}
{{ at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)}}
{{ at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)}}
{{[Thread-475] INFO org.eclipse.jetty.server.ServerConnector - Stopped ServerConnector@37095ded\{HTTP/1.1}{0.0.0.0:8345}}}
{{[Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@5a6d5a8f\{/mcf-api-service,file:/tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-14189461872304124764.dir/webapp/,UNAVAILABLE}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-api-service.war}}}
{{[Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@6979efad\{/mcf-authority-service,file:/tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-11619445383548662284.dir/webapp/,UNAVAILABLE}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-authority-service.war}}}
{{2018-07-26 13:22:47,170 qtp2061226112-492 FATAL Unable to register shutdown hook because
JVM is shutting down. java.lang.IllegalStateException: Cannot add new shutdown hook as this
is not started. Current state: STOPPED}}
{{ at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.addShutdownCallback(DefaultShutdownCallbackRegistry.java:113)}}
{{ at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:271)}}
{{ at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)}}
{{ at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)}}
{{ at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:146)}}
{{ at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)}}
{{ at org.apache.logging.log4j.LogManager.getContext(LogManager.java:270)}}
{{ at org.apache.log4j.Logger$PrivateManager.getContext(Logger.java:59)}}
{{ at org.apache.log4j.Logger.getLogger(Logger.java:37)}}
{{ at org.apache.velocity.runtime.log.Log4JLogChute.init(Log4JLogChute.java:72)}}
{{ at org.apache.velocity.runtime.log.LogManager.createLogChute(LogManager.java:157)}}
{{ at org.apache.velocity.runtime.log.LogManager.updateLog(LogManager.java:269)}}
{{ at org.apache.velocity.runtime.RuntimeInstance.initializeLog(RuntimeInstance.java:871)}}
{{ at org.apache.velocity.runtime.RuntimeInstance.init(RuntimeInstance.java:262)}}
{{ at org.apache.velocity.runtime.RuntimeInstance.requireInitialization(RuntimeInstance.java:302)}}
{{ at org.apache.velocity.runtime.RuntimeInstance.getTemplate(RuntimeInstance.java:1531)}}
{{ at org.apache.velocity.app.VelocityEngine.mergeTemplate(VelocityEngine.java:343)}}
{{ at org.apache.manifoldcf.ui.i18n.Messages.outputResourceWithVelocity(Messages.java:159)}}
{{ at org.apache.manifoldcf.agents.transformation.tika.Messages.outputResourceWithVelocity(Messages.java:136)}}
{{ at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.outputSpecificationBody(TikaExtractor.java:544)}}
{{ at org.apache.jsp.editjob_jsp._jspService(editjob_jsp.java:3002)}}
{{ at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)}}
{{ at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:388)}}
{{ at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313)}}
{{ at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260)}}
{{ at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)}}
{{ at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769)}}
{{ at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)}}
{{ at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)}}
{{ at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)}}
{{ at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)}}
{{ at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)}}
{{ at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)}}
{{ at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)}}
{{ at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)}}
{{ at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)}}
{{ at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)}}
{{ at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)}}
{{ at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)}}
{{ at org.eclipse.jetty.server.Server.handle(Server.java:497)}}
{{ at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)}}
{{ at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)}}
{{ at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)}}
{{ at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610)}}
{{ at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539)}}
{{ at java.base/java.lang.Thread.run(Thread.java:844)}}
{{[Worker thread '35'] WARN org.apache.tika.parser.microsoft.AbstractPOIFSExtractor
- Ignoring unexpected exception while parsing summary entry SummaryInformation}}
{{java.lang.RuntimeException: java.nio.channels.ClosedByInterruptException}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:151)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream.getBlockIterator(NPOIFSStream.java:95)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSDocument.getBlockIterator(NPOIFSDocument.java:179)}}
{{ at org.apache.poi.poifs.filesystem.NDocumentInputStream.<init>(NDocumentInputStream.java:82)}}
{{ at org.apache.poi.poifs.filesystem.DocumentInputStream.<init>(DocumentInputStream.java:65)}}
{{ at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)}}
{{ at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:73)}}
{{ at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:156)}}
{{ at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)}}
{{ at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)}}
{{ at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)}}
{{ at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)}}
{{ at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)}}
{{ at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)}}
{{ at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)}}
{{ at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)}}
{{ at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:448)}}
{{ at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)}}
{{Caused by: java.nio.channels.ClosedByInterruptException}}
{{ at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199)}}
{{ at java.base/sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:388)}}
{{ at org.apache.poi.poifs.nio.FileBackedDataSource.size(FileBackedDataSource.java:137)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getChainLoopDetector(NPOIFSFileSystem.java:627)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:149)}}
{{ ... 21 more}}
{{[Worker thread '35'] WARN org.apache.tika.parser.microsoft.AbstractPOIFSExtractor - Ignoring
unexpected exception while parsing summary entry DocumentSummaryInformation}}
{{java.lang.RuntimeException: java.nio.channels.ClosedChannelException}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:151)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream.getBlockIterator(NPOIFSStream.java:95)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSMiniStore.getBlockAt(NPOIFSMiniStore.java:67)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:169)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:142)}}
{{ at org.apache.poi.poifs.filesystem.NDocumentInputStream.readFully(NDocumentInputStream.java:264)}}
{{ at org.apache.poi.poifs.filesystem.NDocumentInputStream.read(NDocumentInputStream.java:162)}}
{{ at org.apache.poi.poifs.filesystem.DocumentInputStream.read(DocumentInputStream.java:127)}}
{{ at org.apache.poi.util.BoundedInputStream.read(BoundedInputStream.java:121)}}
{{ at org.apache.poi.util.BoundedInputStream.read(BoundedInputStream.java:103)}}
{{ at org.apache.poi.util.IOUtils.copy(IOUtils.java:312)}}
{{ at org.apache.poi.util.IOUtils.peekFirstNBytes(IOUtils.java:70)}}
{{ at org.apache.poi.hpsf.PropertySet.isPropertySetStream(PropertySet.java:393)}}
{{ at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:191)}}
{{ at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)}}
{{ at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:74)}}
{{ at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:156)}}
{{ at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)}}
{{ at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)}}
{{ at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)}}
{{ at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)}}
{{ at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)}}
{{ at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)}}
{{ at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)}}
{{ at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)}}
{{ at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)}}
{{ at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:448)}}
{{ at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)}}
{{Caused by: java.nio.channels.ClosedChannelException}}
{{ at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:158)}}
{{ at java.base/sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:373)}}
{{ at org.apache.poi.poifs.nio.FileBackedDataSource.size(FileBackedDataSource.java:137)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getChainLoopDetector(NPOIFSFileSystem.java:627)}}
{{ at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:149)}}
{{ ... 30 more}}
{{```}}
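
The top frames of the OutOfMemoryError trace show the heap running out while java.util.BitSet.set was growing its backing array, called from BoilerpipeHTMLContentHandler.characters during an Excel parse. Whether that is the root cause here is an assumption, not something the log confirms. The sketch below only illustrates the general JDK behaviour that a BitSet's memory scales with the highest index set, not with the number of bits set. The class name BitSetGrowthSketch and the method backingBytesAfterSet are hypothetical and are not part of Tika, Boilerpipe or ManifoldCF.

```java
import java.util.BitSet;

public class BitSetGrowthSketch {

    // Returns the approximate size in bytes of the BitSet's backing array
    // after a single bit has been set at the given index.
    static long backingBytesAfterSet(int index) {
        BitSet bits = new BitSet();
        bits.set(index);
        // size() reports the number of bits of space currently allocated.
        return bits.size() / 8L;
    }

    public static void main(String[] args) {
        // Hypothetical indices, chosen only to show how allocation grows.
        int[] indices = {1_000, 1_000_000, 100_000_000};
        for (int index : indices) {
            System.out.println("highest index " + index
                + " -> about " + backingBytesAfterSet(index) + " bytes allocated");
        }
    }
}
```

If the index passed to BitSet.set tracks the amount of text seen so far, a very large spreadsheet could in principle drive this allocation up, alongside whatever else the extraction keeps on the heap. That would need to be checked against a heap dump rather than assumed.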



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
