manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1518) MCF shutting down when Tika is used
Date Fri, 27 Jul 2018 05:15:01 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559269#comment-16559269
] 

Karl Wright commented on CONNECTORS-1518:
-----------------------------------------

[~svanschalkwyk], we don't control how much memory Tika takes to do its content extraction.
All we can guarantee is that we feed the content to Tika in streamed form.  In some cases
Tika will use more memory and may even need to load the entire document into memory.

The amount of memory you should give MCF when Tika is involved is therefore a function of
your largest document (hopefully controlled by Allowed Documents filtering) times the number
of worker threads you have allocated, plus some constant amount for overhead.
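
As a rough illustration of that sizing rule, here is a minimal Java sketch. The class name,
method name, and all the numbers (100 MiB document cap, 30 worker threads, 1 GiB overhead) are
hypothetical and not taken from this issue. Treat the result as a starting point only, since
the in-memory form of a parsed document can be larger than the file on disk.

```java
// Hypothetical helper, not part of ManifoldCF: estimates a heap size from the rule of thumb
// "largest document x worker threads + constant overhead".
class McfHeapEstimate {

    /** Returns the estimated heap requirement in bytes; throws ArithmeticException on overflow. */
    static long estimateHeapBytes(long largestDocBytes, int workerThreads, long overheadBytes) {
        long perThreadTotal = Math.multiplyExact(largestDocBytes, (long) workerThreads);
        return Math.addExact(perThreadTotal, overheadBytes);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long largestDoc = 100 * mib;  // assumed cap enforced by document filtering
        int workers = 30;             // assumed worker thread count
        long overhead = 1024 * mib;   // assumed fixed overhead

        long estimate = estimateHeapBytes(largestDoc, workers, overhead);
        // 100 MiB * 30 + 1024 MiB = 4024 MiB
        System.out.println("Suggested -Xmx of at least " + (estimate / mib) + "m");
    }
}
```

With these assumed inputs the estimate is 4024 MiB, so lowering either the maximum document
size or the worker thread count reduces the heap requirement proportionally.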

You can perhaps prove this to yourself better by setting up a Tika service and using the Tika
external transformer instead.


> MCF shutting down when Tika is used
> -----------------------------------
>
>                 Key: CONNECTORS-1518
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1518
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Tika extractor
>    Affects Versions: ManifoldCF 2.10
>         Environment: Centos 7
> Prior to crash:
> $ free -h
>               total        used        free      shared  buff/cache   available
> Mem:            15G        1.8G         12G         98M        1.1G         13G
> Swap:          2.0G          0B        2.0G
> After crash:
> $ free -h
>               total        used        free      shared  buff/cache   available
> Mem:            15G         10G        4.0G         98M        1.1G        4.4G
> Swap:          2.0G          0B        2.0G
>  
> {{start-options.env.unix :}}
> {{-Xss500m}}
> {{-Xms1g}}
> {{-Xmx8g}}
> {{-Dorg.apache.manifoldcf.configfile=./properties.xml}}
> {{-Dorg.apache.manifoldcf.jettyshutdowntoken=secret_token}}
> {{-cp}}
> {{.:./lib/mcf-core.jar:./lib/mcf-agents.jar:./lib/mcf-pull-agent.jar:./lib/mcf-ui-core.jar:./lib/mcf-jetty-runner.jar:./lib/jetty-continuation-9.2.3.v20140905.jar:./lib/jetty-http-9.2.3.v20140905.jar:./lib/jetty-io-9.2.3.v20140905.jar:./lib/jetty-jndi-9.2.3.v20140905.jar:./lib/jetty-jsp-jdt-2.3.3.jar:./lib/jetty-plus-9.2.3.v20140905.jar:./lib/jetty-schemas-3.1.M0.jar:./lib/jetty-security-9.2.3.v20140905.jar:./lib/jetty-server-9.2.3.v20140905.jar:./lib/jetty-servlet-9.2.3.v20140905.jar:./lib/jetty-util-9.2.3.v20140905.jar:./lib/jetty-webapp-9.2.3.v20140905.jar:./lib/jetty-xml-9.2.3.v20140905.jar:./lib/hsqldb-2.3.2.jar:./lib/postgresql-42.1.3.jar:./lib/commons-codec-1.10.jar:./lib/commons-collections-3.2.1.jar:./lib/commons-collections4-4.1.jar:./lib/commons-discovery-0.5.jar:./lib/commons-el-1.0.jar:./lib/commons-exec-1.3.jar:./lib/commons-fileupload-1.2.2.jar:./lib/commons-io-2.5.jar:./lib/commons-lang-2.6.jar:./lib/commons-lang3-3.6.jar:./lib/commons-logging-1.2.jar:./lib/ecj-4.3.1.jar:./lib/gson-2.8.0.jar:./lib/guava-21.0.jar:./lib/httpclient-4.5.3.jar:./lib/httpcore-4.4.6.jar:./lib/jasper-6.0.35.jar:./lib/jasper-el-6.0.35.jar:./lib/javax.servlet-api-3.1.0.jar:./lib/jna-4.1.0.jar:./lib/jna-platform-4.1.0.jar:./lib/json-simple-1.1.1.jar:./lib/jsp-api-2.1-glassfish-2.1.v20091210.jar:./lib/juli-6.0.35.jar:./lib/log4j-1.2-api-2.4.1.jar:./lib/log4j-api-2.4.1.jar:./lib/log4j-core-2.4.1.jar:./lib/mail-1.4.5.jar:./lib/serializer-2.7.1.jar:./lib/slf4j-api-1.7.24.jar:./lib/slf4j-simple-1.7.24.jar:./lib/velocity-1.7.jar:./lib/xalan-2.7.1.jar:./lib/xercesImpl-2.10.0.jar:./lib/xml-apis-1.4.01.jar:./lib/zookeeper-3.4.10.jar:}}
>            Reporter: Steph van Schalkwyk
>            Assignee: Karl Wright
>            Priority: Major
>             Fix For: ManifoldCF 2.11
>
>         Attachments: CONNECTORS-1518.patch
>
>
>   ```Jul 26, 2018 1:21:51 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
>  WARNING: org.xerial's sqlite-jdbc is not loaded.
>  Please provide the jar on your classpath to parse sqlite files.
>  See tika-parsers/pom.xml for the correct version.
>  agents process ran out of memory - shutting down
>  java.lang.OutOfMemoryError: Java heap space
>      at java.base/java.util.Arrays.copyOf(Arrays.java:3816)
>      at java.base/java.util.BitSet.ensureCapacity(BitSet.java:338)
>      at java.base/java.util.BitSet.expandTo(BitSet.java:353)
>      at java.base/java.util.BitSet.set(BitSet.java:448)
>      at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:267)
>      at org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
>      at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
>      at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
>      at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
>      at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
>      at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)
>      at org.apache.tika.parser.microsoft.TextCell.render(TextCell.java:34)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processSheet(ExcelExtractor.java:609)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord(ExcelExtractor.java:392)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord(ExcelExtractor.java:343)
>      at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord(FormatTrackingHSSFListener.java:92)
>      at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord(HSSFRequest.java:109)
>      at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:179)
>      at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:136)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile(ExcelExtractor.java:319)
>      at org.apache.tika.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:170)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:184)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>  [Thread-475] INFO org.eclipse.jetty.server.ServerConnector - Stopped ServerConnector@37095ded{HTTP/1.1}{0.0.0.0:8345}
>  [Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@5a6d5a8f{/mcf-api-service,[file:/tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-14189461872304124764.dir/webapp/,UNAVAILABLE|file:///tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-14189461872304124764.dir/webapp/,UNAVAILABLE]}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-api-service.war}
>  [Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@6979efad{/mcf-authority-service,[file:/tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-11619445383548662284.dir/webapp/,UNAVAILABLE|file:///tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-11619445383548662284.dir/webapp/,UNAVAILABLE]}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-authority-service.war}
>  2018-07-26 13:22:47,170 qtp2061226112-492 FATAL Unable to register shutdown hook because
JVM is shutting down. java.lang.IllegalStateException: Cannot add new shutdown hook as this
is not started. Current state: STOPPED
>      at org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.addShutdownCallback(DefaultShutdownCallbackRegistry.java:113)
>      at org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:271)
>      at org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256)
>      at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216)
>      at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:146)
>      at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41)
>      at org.apache.logging.log4j.LogManager.getContext(LogManager.java:270)
>      at org.apache.log4j.Logger$PrivateManager.getContext(Logger.java:59)
>      at org.apache.log4j.Logger.getLogger(Logger.java:37)
>      at org.apache.velocity.runtime.log.Log4JLogChute.init(Log4JLogChute.java:72)
>      at org.apache.velocity.runtime.log.LogManager.createLogChute(LogManager.java:157)
>      at org.apache.velocity.runtime.log.LogManager.updateLog(LogManager.java:269)
>      at org.apache.velocity.runtime.RuntimeInstance.initializeLog(RuntimeInstance.java:871)
>      at org.apache.velocity.runtime.RuntimeInstance.init(RuntimeInstance.java:262)
>      at org.apache.velocity.runtime.RuntimeInstance.requireInitialization(RuntimeInstance.java:302)
>      at org.apache.velocity.runtime.RuntimeInstance.getTemplate(RuntimeInstance.java:1531)
>      at org.apache.velocity.app.VelocityEngine.mergeTemplate(VelocityEngine.java:343)
>      at org.apache.manifoldcf.ui.i18n.Messages.outputResourceWithVelocity(Messages.java:159)
>      at org.apache.manifoldcf.agents.transformation.tika.Messages.outputResourceWithVelocity(Messages.java:136)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.outputSpecificationBody(TikaExtractor.java:544)
>      at org.apache.jsp.editjob_jsp._jspService(editjob_jsp.java:3002)
>      at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>      at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>      at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:388)
>      at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313)
>      at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260)
>      at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>      at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769)
>      at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>      at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>      at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>      at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
>      at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>      at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>      at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
>      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>      at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>      at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:52)
>      at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>      at org.eclipse.jetty.server.Server.handle(Server.java:497)
>      at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
>      at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
>      at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>      at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610)
>      at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539)
>      at java.base/java.lang.Thread.run(Thread.java:844)
>  [Worker thread '35'] WARN org.apache.tika.parser.microsoft.AbstractPOIFSExtractor -
Ignoring unexpected exception while parsing summary entry SummaryInformation
>  java.lang.RuntimeException: java.nio.channels.ClosedByInterruptException
>      at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:151)
>      at org.apache.poi.poifs.filesystem.NPOIFSStream.getBlockIterator(NPOIFSStream.java:95)
>      at org.apache.poi.poifs.filesystem.NPOIFSDocument.getBlockIterator(NPOIFSDocument.java:179)
>      at org.apache.poi.poifs.filesystem.NDocumentInputStream.<init>(NDocumentInputStream.java:82)
>      at org.apache.poi.poifs.filesystem.DocumentInputStream.<init>(DocumentInputStream.java:65)
>      at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)
>      at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:73)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:156)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
>      at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
>      at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
>      at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:448)
>      at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
>  Caused by: java.nio.channels.ClosedByInterruptException
>      at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199)
>      at java.base/sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:388)
>      at org.apache.poi.poifs.nio.FileBackedDataSource.size(FileBackedDataSource.java:137)
>      at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getChainLoopDetector(NPOIFSFileSystem.java:627)
>      at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:149)
>      ... 21 more
>  [Worker thread '35'] WARN org.apache.tika.parser.microsoft.AbstractPOIFSExtractor -
Ignoring unexpected exception while parsing summary entry DocumentSummaryInformation
>  java.lang.RuntimeException: java.nio.channels.ClosedChannelException
>      at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:151)
>      at org.apache.poi.poifs.filesystem.NPOIFSStream.getBlockIterator(NPOIFSStream.java:95)
>      at org.apache.poi.poifs.filesystem.NPOIFSMiniStore.getBlockAt(NPOIFSMiniStore.java:67)
>      at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:169)
>      at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.next(NPOIFSStream.java:142)
>      at org.apache.poi.poifs.filesystem.NDocumentInputStream.readFully(NDocumentInputStream.java:264)
>      at org.apache.poi.poifs.filesystem.NDocumentInputStream.read(NDocumentInputStream.java:162)
>      at org.apache.poi.poifs.filesystem.DocumentInputStream.read(DocumentInputStream.java:127)
>      at org.apache.poi.util.BoundedInputStream.read(BoundedInputStream.java:121)
>      at org.apache.poi.util.BoundedInputStream.read(BoundedInputStream.java:103)
>      at org.apache.poi.util.IOUtils.copy(IOUtils.java:312)
>      at org.apache.poi.util.IOUtils.peekFirstNBytes(IOUtils.java:70)
>      at org.apache.poi.hpsf.PropertySet.isPropertySetStream(PropertySet.java:393)
>      at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:191)
>      at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaryEntryIfExists(SummaryExtractor.java:83)
>      at org.apache.tika.parser.microsoft.SummaryExtractor.parseSummaries(SummaryExtractor.java:74)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:156)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
>      at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
>      at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
>      at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:448)
>      at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
>  Caused by: java.nio.channels.ClosedChannelException
>      at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:158)
>      at java.base/sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:373)
>      at org.apache.poi.poifs.nio.FileBackedDataSource.size(FileBackedDataSource.java:137)
>      at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.getChainLoopDetector(NPOIFSFileSystem.java:627)
>      at org.apache.poi.poifs.filesystem.NPOIFSStream$StreamBlockByteBufferIterator.<init>(NPOIFSStream.java:149)
>      ... 30 more
>  ```
>  Following up: When these exceptions occur, the heap runs out:
>  13:39:39.856 [Worker thread '49'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:39.970 [Worker thread '43'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:40.415 [Worker thread '34'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:40.469 [Worker thread '1'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:43.739 [Worker thread '32'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:44.697 [Worker thread '43'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:45.756 [Worker thread '33'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:45.775 [Worker thread '36'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:46.751 [Worker thread '35'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:46.753 [Worker thread '40'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:47.536 [Worker thread '45'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:48.734 [Worker thread '44'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:50.922 [Worker thread '30'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:54.930 [Worker thread '28'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  13:40:33.660 [Worker thread '29'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  agents process ran out of memory - shutting down
>  java.lang.OutOfMemoryError: Java heap space
>      at java.base/java.lang.StringLatin1.newString(StringLatin1.java:549)
>      at java.base/java.lang.StringBuilder.toString(StringBuilder.java:415)
>      at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.flushBlock(BoilerpipeHTMLContentHandler.java:341)
>      at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:198)
>      at org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>      at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
>      at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
>      at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
>      at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
>      at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
>      at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)
>      at org.apache.tika.parser.microsoft.TextCell.render(TextCell.java:34)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processSheet(ExcelExtractor.java:609)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord(ExcelExtractor.java:392)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord(ExcelExtractor.java:343)
>      at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord(FormatTrackingHSSFListener.java:92)
>      at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord(HSSFRequest.java:109)
>      at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:179)
>      at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:136)
>      at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile(ExcelExtractor.java:319)
>      at org.apache.tika.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:170)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:184)
>      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>  agents process ran out of memory - shutting down
>  java.lang.OutOfMemoryError: Java heap space
>      at java.base/java.util.Arrays.copyOf(Arrays.java:3744)
>      at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:146)
>      at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:531)
>      at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:550)
>      at java.base/java.lang.StringBuilder.append(StringBuilder.java:171)
>      at java.base/java.util.regex.Matcher.appendReplacement(Matcher.java:1002)
>      at java.base/java.util.regex.Matcher.replaceAll(Matcher.java:1181)
>      at de.l3s.boilerpipe.util.UnicodeTokenizer.tokenize(UnicodeTokenizer.java:40)
>      at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.flushBlock(BoilerpipeHTMLContentHandler.java:296)
>      at de.l3s.boilerpipe.sax.CommonTagActions$3.end(CommonTagActions.java:143)
>      at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.endElement(BoilerpipeHTMLContentHandler.java:183)
>      at org.apache.tika.parser.html.BoilerpipeContentHandler.endElement(BoilerpipeContentHandler.java:175)
>      at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>      at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:256)
>      at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>      at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>      at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>      at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:273)
>      at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:224)
>      at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:109)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>      at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
>      at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
>      at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
>      at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:448)
>  13:40:33.995 [Worker thread '42'] WARN org.apache.manifoldcf.jobs - Service interruption
reported for job 1532551209410 connection 'file': IO exception: null
>  [Thread-475] INFO org.eclipse.jetty.server.ServerConnector - Stopped ServerConnector@5d235104{HTTP/1.1}{0.0.0.0:8345}
>  [Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@6105f8a3{/mcf-api-service,[file:/tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-9896962439762567079.dir/webapp/,UNAVAILABLE|file:///tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-9896962439762567079.dir/webapp/,UNAVAILABLE]}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-api-service.war}
>  [Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@12365c88{/mcf-authority-service,[file:/tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-3954308360064638561.dir/webapp/,UNAVAILABLE|file:///tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-3954308360064638561.dir/webapp/,UNAVAILABLE]}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-authority-service.war}
>  
>   
>  
>  Follow-up: When these issues occur, the jvm runs out of space:
>  
>  13:39:39.856 [Worker thread '49'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:39.970 [Worker thread '43'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:40.415 [Worker thread '34'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:40.469 [Worker thread '1'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:43.739 [Worker thread '32'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:44.697 [Worker thread '43'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:45.756 [Worker thread '33'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:45.775 [Worker thread '36'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:46.751 [Worker thread '35'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:46.753 [Worker thread '40'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:47.536 [Worker thread '45'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:48.734 [Worker thread '44'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:50.922 [Worker thread '30'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:39:54.930 [Worker thread '28'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  13:40:33.660 [Worker thread '29'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  agents process ran out of memory - shutting down
>  java.lang.OutOfMemoryError: Java heap space
>  at java.base/java.lang.StringLatin1.newString(StringLatin1.java:549)
>  at java.base/java.lang.StringBuilder.toString(StringBuilder.java:415)
>  at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.flushBlock(BoilerpipeHTMLContentHandler.java:341)
>  at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:198)
>  at org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)
>  at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>  at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
>  at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>  at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>  at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>  at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
>  at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
>  at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
>  at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
>  at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
>  at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)
>  at org.apache.tika.parser.microsoft.TextCell.render(TextCell.java:34)
>  at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processSheet(ExcelExtractor.java:609)
>  at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.internalProcessRecord(ExcelExtractor.java:392)
>  at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processRecord(ExcelExtractor.java:343)
>  at org.apache.poi.hssf.eventusermodel.FormatTrackingHSSFListener.processRecord(FormatTrackingHSSFListener.java:92)
>  at org.apache.poi.hssf.eventusermodel.HSSFRequest.processRecord(HSSFRequest.java:109)
>  at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:179)
>  at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:136)
>  at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile(ExcelExtractor.java:319)
>  at org.apache.tika.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:170)
>  at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:184)
>  at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>  at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>  at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>  agents process ran out of memory - shutting down
>  java.lang.OutOfMemoryError: Java heap space
>  at java.base/java.util.Arrays.copyOf(Arrays.java:3744)
>  at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:146)
>  at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:531)
>  at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:550)
>  at java.base/java.lang.StringBuilder.append(StringBuilder.java:171)
>  at java.base/java.util.regex.Matcher.appendReplacement(Matcher.java:1002)
>  at java.base/java.util.regex.Matcher.replaceAll(Matcher.java:1181)
>  at de.l3s.boilerpipe.util.UnicodeTokenizer.tokenize(UnicodeTokenizer.java:40)
>  at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.flushBlock(BoilerpipeHTMLContentHandler.java:296)
>  at de.l3s.boilerpipe.sax.CommonTagActions$3.end(CommonTagActions.java:143)
>  at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.endElement(BoilerpipeHTMLContentHandler.java:183)
>  at org.apache.tika.parser.html.BoilerpipeContentHandler.endElement(BoilerpipeContentHandler.java:175)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:256)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>  at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:273)
>  at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:224)
>  at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:109)
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>  at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>  at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>  at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>  at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)
>  at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)
>  at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)
>  at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)
>  at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
>  at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
>  at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
>  at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:448)
>  13:40:33.995 [Worker thread '42'] WARN org.apache.manifoldcf.jobs - Service interruption reported for job 1532551209410 connection 'file': IO exception: null
>  [Thread-475] INFO org.eclipse.jetty.server.ServerConnector - Stopped ServerConnector@5d235104{HTTP/1.1}{0.0.0.0:8345}
>  [Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@6105f8a3{/mcf-api-service,[file:/tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-9896962439762567079.dir/webapp/,UNAVAILABLE|file:///tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-9896962439762567079.dir/webapp/,UNAVAILABLE]}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-api-service.war}
>  [Thread-475] INFO org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.w.WebAppContext@12365c88{/mcf-authority-service,[file:/tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-3954308360064638561.dir/webapp/,UNAVAILABLE|file:///tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-3954308360064638561.dir/webapp/,UNAVAILABLE]}{/opt/manifoldcf/manifoldcf_single/././web/war/mcf-authority-service.war}
>  This occurs when the ES connector reports this error:
> |07-26-2018 19:34:25.356|Indexation (ES)|file:/var/manifoldcf/corpus/000640.html|CLIENTPROTOCOLEXCEPTION|46190|9|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

