manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject ***UNCHECKED*** Re: Out of memory, one file bug i think
Date Wed, 25 Jul 2018 17:14:59 GMT
It looks like you are still running out of memory.  I would love to know
which document was causing that.  I suspect it is very large,
and for some reason it cannot be streamed.

Karl

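One way to shortlist candidate documents is to list the largest files in the crawled tree. The sketch below is only an illustration: it assumes the share is mounted or otherwise readable as a local path, and the class name `LargestFiles` and the limit of 10 are hypothetical choices, not part of ManifoldCF.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists the largest regular files under a directory.
class LargestFiles {

    /** Returns up to 'limit' regular files under 'root', largest first. */
    static List<Path> largest(Path root, int limit) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .sorted(Comparator.<Path>comparingLong(LargestFiles::sizeOf).reversed())
                .limit(limit)
                .collect(Collectors.toList());
        }
    }

    /** File size in bytes, or -1 if the size cannot be read. */
    static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            return -1L;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the first argument is the locally mounted crawl root.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : largest(root, 10)) {
            System.out.println(sizeOf(p) + "\t" + p);
        }
    }
}
```

Size alone does not prove that a file is the culprit, but the largest spreadsheets and documents are reasonable first candidates to exclude and re-test.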

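The NoClassDefFoundError quoted further down names `org/apache/commons/compress/utils/InputStreamStatistics`, which by its package name belongs to Apache Commons Compress. A quick way to see whether a given set of jars provides that class, and from which jar, is a small check like the one below. This is a sketch under assumptions: `ClassLocator` is a hypothetical name, and the check only reflects the classpath it is launched with, which may differ from the classloader ManifoldCF uses for connector-common-lib.

```java
import java.security.CodeSource;

// Hypothetical helper: reports where a class is loaded from on the current classpath.
class ClassLocator {

    /** Returns the location the class was loaded from, or null if it cannot be loaded. */
    static String locate(String className) {
        try {
            // 'false' avoids running static initializers; we only want to know if it resolves.
            Class<?> c = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes typically have no code source.
            return src == null ? "(JDK runtime)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
            ? args[0]
            : "org.apache.commons.compress.utils.InputStreamStatistics";
        String where = locate(name);
        System.out.println(where == null
            ? name + " NOT FOUND on this classpath"
            : name + " -> " + where);
    }
}
```

Launched with the jars from connector-common-lib on the classpath (for example `java -cp "connector-common-lib/*:." ClassLocator`, where the relative path is an assumption about the working directory), it prints either the jar that supplies the class or a "NOT FOUND" line.
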
On Wed, Jul 25, 2018 at 1:13 PM Karl Wright <daddywri@gmail.com> wrote:

> Hi Maxence,
>
> The second exception is occurring because processing is still occurring
> while the JVM is shutting down; it can be ignored.
>
> Karl
>
>
> On Wed, Jul 25, 2018 at 1:01 PM msaunier <msaunier@citya.com> wrote:
>
>> Hi Karl,
>>
>>
>>
>> I have added the snapshot and I’m being spammed with this error:
>>
>>
>>
>> FATAL 2018-07-25T16:43:04,599 (Worker thread '0') - Error tossed:
>> org/apache/commons/compress/utils/InputStreamStatistics
>>
>> java.lang.NoClassDefFoundError:
>> org/apache/commons/compress/utils/InputStreamStatistics
>>
>>         at
>> org.apache.poi.openxml4j.util.ZipArchiveThresholdInputStream.<init>(ZipArchiveThresholdInputStream.java:62)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.openxml4j.util.ZipSecureFile.getInputStream(ZipSecureFile.java:147)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.openxml4j.util.ZipSecureFile.getInputStream(ZipSecureFile.java:34)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.openxml4j.util.ZipFileZipEntrySource.getInputStream(ZipFileZipEntrySource.java:66)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:255)
>> ~[?:?]
>>
>>         at
>> org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:725) ~[?:?]
>>
>>         at
>> org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:238) ~[?:?]
>>
>>         at
>> org.apache.tika.parser.pkg.ZipContainerDetector.detectOPCBased(ZipContainerDetector.java:197)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.pkg.ZipContainerDetector.detectZipFormat(ZipContainerDetector.java:127)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:88)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:84)
>> ~[?:?]
>>
>>         at
>> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:116)
>> ~[?:?]
>>
>>         at
>> org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>> ~[?:?]
>>
>>         at
>> org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)
>> ~[?:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)
>> ~[mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)
>> ~[mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)
>> ~[mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
>> ~[mcf-agents.jar:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
>> ~[mcf-pull-agent.jar:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
>> ~[mcf-pull-agent.jar:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939)
>> ~[?:?]
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
>> [mcf-pull-agent.jar:?]
>>
>>
>>
>> Maxence,
>>
>>
>>
>>
>>
>> *From:* Karl Wright [mailto:daddywri@gmail.com]
>> *Sent:* Wednesday, July 25, 2018 13:12
>> *To:* user@manifoldcf.apache.org
>> *Subject:* Re: Out of memory, one file bug i think
>>
>>
>>
>> Hi Maxence,
>>
>>
>>
>> Tomorrow (7/26) the POI project will be delivering a nightly build which
>> should repair the Class Not Found exceptions.  You will need to download it
>> here:
>>
>>
>> https://builds.apache.org/view/P/view/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/build/dist/
>>
>>
>>
>> ... and replace all poi jars with the corresponding ones from the binary
>> distribution.  I believe the poi jars are all in connector-common-lib.  Be
>> sure to delete the old ones (or move them somewhere else) first.
>>
>>
>>
>> I don't know whether this will fix your out of memory problem however.
>> Please let me know what's still not working and I can take it from there.
>>
>>
>>
>> Karl
>>
>>
>>
>>
>>
>> On Wed, Jul 25, 2018 at 6:03 AM Karl Wright <daddywri@gmail.com> wrote:
>>
>> Out of memory errors are fatal, I'm afraid, because they corrupt not only
>> the document in question but all others being processed at the same time.
>> So those cannot be ignored.
>>
>>
>>
>> Tika should ignore documents that it cannot process, however, and that is
>> a great enhancement request for them.
>>
>>
>>
>> Karl
>>
>>
>>
>>
>>
>> On Wed, Jul 25, 2018 at 3:39 AM msaunier <msaunier@citya.com> wrote:
>>
>> Hi Karl,
>>
>>
>>
>> Okay. So today, I'm going to force ManifoldCF to run so that only the
>> documents are left behind.
>>
>> In the future, could I ignore these errors? They make the
>> application crash, which is not great behavior in production.
>>
>>
>>
>> Thanks
>>
>> Maxence,
>>
>>
>>
>>
>>
>> *From:* Karl Wright [mailto:daddywri@gmail.com]
>> *Sent:* Tuesday, July 24, 2018 17:53
>> *To:* user@manifoldcf.apache.org
>> *Subject:* Re: Out of memory, one file bug i think
>>
>>
>>
>> The problem isn't with images in general; it's with certain kinds of
>> images.  There are optional dependencies in Tika for some kinds of images
>> that we cannot include in the MCF distribution because of licensing
>> problems.  I don't know which kinds these are but apparently you are trying
>> to index some of them.
>>
>> You will need to find and download the right jar and put it in the
>> connector-common-lib folder for this to work.
>>
>>
>>
>> Karl
>>
>>
>>
>>
>>
>> On Tue, Jul 24, 2018 at 11:36 AM msaunier <msaunier@citya.com> wrote:
>>
>> On another crawl I extract images with the same parameters and I have no
>> problems with images. They are indexed without errors. Images are necessary
>> for this job. I will try to recreate my job and test.
>>
>>
>>
>> Thanks,
>>
>> Maxence,
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Karl Wright [mailto:daddywri@gmail.com]
>> *Sent:* Tuesday, July 24, 2018 17:32
>> *To:* user@manifoldcf.apache.org
>> *Subject:* Re: Out of memory, one file bug i think
>>
>>
>>
>> " java.lang.NoSuchMethodException:
>> org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTPictureBaseImpl.<init>(org.apache.xmlbeans.SchemaType,
>> boolean)"
>>
>>
>>
>> This exception is occurring because you are trying to extract content
>> from an image.  In order for this to work you need a jar that isn't
>> supplied with Tika for licensing reasons.  Can you exclude images from your
>> crawl?
>>
>>
>>
>> Karl
>>
>>
>>
>>
>>
>> On Tue, Jul 24, 2018 at 10:32 AM msaunier <msaunier@citya.com> wrote:
>>
>> Hi Karl,
>>
>>
>>
>> With only the connectors in debug, I get this output:
>>
>>
>>
>> [Thread-269948] INFO org.apache.zookeeper.ZooKeeper - Initiating client
>> connection, connectString=kemp-formation-solr:2181 sessionTimeout=60000
>> watcher=org.apache.solr.common.cloud.SolrZkClient$3@3c351b22
>>
>> [Thread-269948-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>>
>> [Thread-269948-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>>
>> [Thread-269948-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Session establishment complete on server
>> kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid =
>> 0xff00000201970049, negotiated timeout = 40000
>>
>> [Thread-269948] INFO org.apache.solr.common.cloud.ZkStateReader - Updated
>> live nodes from ZooKeeper... (0) -> (2)
>>
>> [Thread-269948] INFO
>> org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider - Cluster at
>> kemp-formation-solr:2181 ready
>>
>> java.lang.NoSuchMethodException:
>> org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTPictureBaseImpl.<init>(org.apache.xmlbeans.SchemaType,
>> boolean)
>>
>>         at java.lang.Class.getConstructor0(Class.java:3082)
>>
>>         at java.lang.Class.getDeclaredConstructor(Class.java:2178)
>>
>>         at
>> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.getJavaImplConstructor2(SchemaTypeImpl.java:1817)
>>
>>         at
>> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createUnattachedSubclass(SchemaTypeImpl.java:1961)
>>
>>         at
>> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createUnattachedNode(SchemaTypeImpl.java:1950)
>>
>>         at
>> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:1051)
>>
>>         at
>> org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:938)
>>
>>         at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1675)
>>
>>         at org.apache.xmlbeans.impl.store.Cur.getUser(Cur.java:2659)
>>
>>         at org.apache.xmlbeans.impl.store.Cur.getObject(Cur.java:2652)
>>
>>         at
>> org.apache.xmlbeans.impl.store.Cursor._getObject(Cursor.java:995)
>>
>>         at
>> org.apache.xmlbeans.impl.store.Cursor.getObject(Cursor.java:2904)
>>
>>         at
>> org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:162)
>>
>>         at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:169)
>>
>>         at
>> org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:112)
>>
>>         at
>> org.apache.poi.xwpf.extractor.XWPFWordExtractor.<init>(XWPFWordExtractor.java:60)
>>
>>         at
>> org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:243)
>>
>>         at
>> org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:105)
>>
>>         at
>> org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106)
>>
>>         at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>>
>>         at
>> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>>
>>         at
>> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>>
>>         at
>> org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
>>
>>         at
>> org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235)
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226)
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077)
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708)
>>
>>         at
>> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
>>
>>         at
>> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939)
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
>>
>> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28024ms for sessionid 0x100000050ae004d
>>
>> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28024ms for sessionid 0x100000050ae004d, closing socket
>> connection and attempting reconnect
>>
>> [zkCallback-16-thread-2] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Watcher
>> org.apache.solr.common.cloud.ConnectionManager@5382340 name:
>> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent
>> state:Disconnected type:None path:null path: null type: None
>>
>> [zkCallback-16-thread-2] WARN
>> org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected
>>
>> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>>
>> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at
>> org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:737)
>>
>>         at
>> org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:784)
>>
>>         at
>> org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457)
>>
>>         at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
>>
>>         at
>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)
>>
>>         at
>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837)
>>
>>         at
>> org.apache.manifoldcf.crawler.jobs.JobManager.getJobsReadyForInactivity(JobManager.java:8024)
>>
>>         at
>> org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:76)
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at
>> org.postgresql.jdbc.PgConnection.prepareStatement(PgConnection.java:1200)
>>
>>         at
>> org.postgresql.jdbc.PgConnection.prepareStatement(PgConnection.java:1583)
>>
>>         at
>> org.postgresql.jdbc.PgConnection.prepareStatement(PgConnection.java:372)
>>
>>         at
>> org.apache.manifoldcf.core.database.Database.execute(Database.java:896)
>>
>>         at
>> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:696)
>>
>> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Session establishment complete on server
>> kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid =
>> 0x100000050ae004d, negotiated timeout = 40000
>>
>> [Thread-490] INFO org.eclipse.jetty.server.ServerConnector - Stopped
>> ServerConnector@2a640157{HTTP/1.1}{0.0.0.0:8345}
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at java.util.HashMap.resize(HashMap.java:704)
>>
>>         at java.util.HashMap.putVal(HashMap.java:629)
>>
>>         at java.util.HashMap.put(HashMap.java:612)
>>
>>         at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:154)
>>
>>         at
>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)
>>
>>         at
>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837)
>>
>>         at
>> org.apache.manifoldcf.crawler.jobs.JobManager.processParentHashSet(JobManager.java:5642)
>>
>>         at
>> org.apache.manifoldcf.crawler.jobs.JobManager.calculateAffectedRestoreCarrydownChildren(JobManager.java:5581)
>>
>>         at
>> org.apache.manifoldcf.crawler.jobs.JobManager.finishDocuments(JobManager.java:5453)
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:570)
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at java.util.Arrays.copyOf(Arrays.java:3308)
>>
>>         at java.util.BitSet.ensureCapacity(BitSet.java:337)
>>
>>         at java.util.BitSet.expandTo(BitSet.java:352)
>>
>>         at java.util.BitSet.set(BitSet.java:447)
>>
>>         at
>> de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:267)
>>
>>         at
>> org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
>>
>>         at
>> org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
>>
>>         at
>> org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306)
>>
>>         at
>> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator$SheetTextAsHTML.cell(XSSFExcelExtractorDecorator.java:431)
>>
>>         at
>> org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.endElement(XSSFSheetXMLHandler.java:380)
>>
>>         at
>> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator$XSSFSheetInterestingPartsCapturer.endElement(XSSFExcelExtractorDecorator.java:520)
>>
>>         at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown
>> Source)
>>
>>         at
>> org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown
>> Source)
>>
>>         at
>> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown
>> Source)
>>
>>         at
>> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
>> Source)
>>
>>         at org.apache.xerces.parsers.XML11Configuration.parse(Unknown
>> Source)
>>
>>         at org.apache.xerces.parsers.XML11Configuration.parse(Unknown
>> Source)
>>
>>         at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>>
>>         at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown
>> Source)
>>
>>         at
>> org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
>>
>>         at
>> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processSheet(XSSFExcelExtractorDecorator.java:344)
>>
>>         at
>> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:167)
>>
>>         at
>> org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135)
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0x100000050ae004e closed
>>
>> [Thread-257943-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x100000050ae004e
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0x100000050ae004d closed
>>
>> [Thread-35854-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x100000050ae004d
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0x2000000b80d004a closed
>>
>> [Thread-8765-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x2000000b80d004a
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0x2000000b80d004b closed
>>
>> [Thread-35853-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x2000000b80d004b
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0xff00000201970046 closed
>>
>> [Thread-6991-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0xff00000201970046
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0x100000050ae004c closed
>>
>> [Thread-8699-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x100000050ae004c
>>
>> [Thread-490] INFO org.eclipse.jetty.server.handler.ContextHandler -
>> Stopped
>> o.e.j.w.WebAppContext@44d52de2{/mcf-api-service,file:/tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-559052738855414857.dir/webapp/,UNAVAILABLE}{/opt/manifoldcf-trunk/bin/./../web-proprietary/war/mcf-api-service.war}
>>
>> [Thread-490] INFO org.eclipse.jetty.server.handler.ContextHandler -
>> Stopped
>> o.e.j.w.WebAppContext@60410cd{/mcf-authority-service,file:/tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-927770358411352606.dir/webapp/,UNAVAILABLE}{/opt/manifoldcf-trunk/bin/./../web-proprietary/war/mcf-authority-service.war}
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0x2000000b80d004c closed
>>
>> [Thread-262666-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x2000000b80d004c
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0xff00000201970048 closed
>>
>> [Thread-244171-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0xff00000201970048
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0xff00000201970049 closed
>>
>> [Thread-269948-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0xff00000201970049
>>
>>
>>
>> I have deactivated history to improve performance. So, can I find the last
>> file with an SQL query?
>>
>>
>>
>> Maxence,
>>
>>
>>
>> *From:* Karl Wright [mailto:daddywri@gmail.com]
>> *Sent:* Tuesday, July 24, 2018 16:04
>> *To:* user@manifoldcf.apache.org
>> *Subject:* Re: Out of memory, one file bug i think
>>
>>
>>
>> Hi Maxence,
>>
>>
>>
>> You would want to turn on connector debugging INSTEAD of the debugging
>> you've turned on, which is very noisy and not helpful.
>>
>>
>>
>> In global properties: org.apache.manifoldcf.connectors value DEBUG
>>
>>
>>
>> Karl
>>
>>
>>
>>
>>
>> On Tue, Jul 24, 2018 at 9:12 AM msaunier <msaunier@citya.com> wrote:
>>
>> With debug:
>>
>>
>>
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28034ms for sessionid 0x100000050ae0049
>>
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28034ms for sessionid 0x100000050ae0049, closing socket
>> connection and attempting reconnect
>>
>> [Thread-31532-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 27708ms for sessionid 0xff00000201970044
>>
>> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 27737ms for sessionid 0xff00000201970043
>>
>> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 27737ms for sessionid 0xff00000201970043, closing socket
>> connection and attempting reconnect
>>
>> [Thread-31551-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28316ms for sessionid 0x100000050ae004b
>>
>> [Thread-7602-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28394ms for sessionid 0x2000000b80d0047
>>
>> [Thread-7602-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28394ms for sessionid 0x2000000b80d0047, closing socket
>> connection and attempting reconnect
>>
>> [Thread-31532-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 27708ms for sessionid 0xff00000201970044, closing socket
>> connection and attempting reconnect
>>
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>>
>> agents process ran out of memory - shutting down
>>
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>>
>> [Thread-7538-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 36805ms for sessionid 0x2000000b80d0046
>>
>> [Thread-7538-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 36805ms for sessionid 0x2000000b80d0046, closing socket
>> connection and attempting reconnect
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at java.lang.StringBuilder.toString(StringBuilder.java:407)
>>
>>         at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.readSharedData(CacheManager.java:849)
>>
>>         at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.hasExpired(CacheManager.java:483)
>>
>>         at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.lookupObject(CacheManager.java:454)
>>
>>         at
>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:131)
>>
>>         at
>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)
>>
>>         at
>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:862)
>>
>>         at
>> org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:236)
>>
>>         at
>> org.apache.manifoldcf.crawler.jobs.Jobs.deletingJobsPresent(Jobs.java:3133)
>>
>>         at
>> org.apache.manifoldcf.crawler.jobs.JobManager.getNextDeletableDocuments(JobManager.java:1862)
>>
>>         at
>> org.apache.manifoldcf.crawler.system.DocumentDeleteStufferThread.run(DocumentDeleteStufferThread.java:108)
>>
>> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>>
>> agents process ran out of memory - shutting down
>>
>> [Thread-7574-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 27763ms for sessionid 0x100000050ae004a
>>
>> [Thread-7574-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 27763ms for sessionid 0x100000050ae004a, closing socket
>> connection and attempting reconnect
>>
>> [zkCallback-3-thread-7] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Watcher
>> org.apache.solr.common.cloud.ConnectionManager@7a5c701e name:
>> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent
>> state:Disconnected type:None path:null path: null type: None
>>
>> [zkCallback-3-thread-7] WARN
>> org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected
>>
>> [Thread-31551-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard
>> from server in 28316ms for sessionid 0x100000050ae004b, closing socket
>> connection and attempting reconnect
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Socket connection established to
>> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>>
>> [zkCallback-11-thread-5] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Watcher
>> org.apache.solr.common.cloud.ConnectionManager@53181a58 name:
>> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent
>> state:Disconnected type:None path:null path: null type: None
>>
>> [zkCallback-11-thread-5] WARN
>> org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected
>>
>> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service,
>> session 0xff00000201970043 has expired
>>
>> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service,
>> session 0xff00000201970043 has expired, closing socket connection
>>
>> [Thread-7573-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0xff00000201970043
>>
>> [zkCallback-11-thread-2] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Watcher
>> org.apache.solr.common.cloud.ConnectionManager@53181a58 name:
>> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent
>> state:Expired type:None path:null path: null type: None
>>
>> [zkCallback-11-thread-2] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Our previous ZooKeeper
>> session was expired. Attempting to reconnect to recover relationship with
>> ZooKeeper...
>>
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] WARN
>> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service,
>> session 0x100000050ae0049 has expired
>>
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO
>> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service,
>> session 0x100000050ae0049 has expired, closing socket connection
>>
>> [zkCallback-11-thread-2] WARN
>> org.apache.solr.common.cloud.DefaultConnectionStrategy - Connection expired
>> - starting a new one...
>>
>> [zkCallback-11-thread-2] INFO org.apache.zookeeper.ZooKeeper - Initiating
>> client connection, connectString=kemp-formation-solr:2181
>> sessionTimeout=60000
>> watcher=org.apache.solr.common.cloud.ConnectionManager@53181a58
>>
>> [Thread-5234-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x100000050ae0049
>>
>> [zkCallback-3-thread-4] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Watcher
>> org.apache.solr.common.cloud.ConnectionManager@7a5c701e name:
>> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent
>> state:Expired type:None path:null path: null type: None
>>
>> [zkCallback-3-thread-4] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Our previous ZooKeeper
>> session was expired. Attempting to reconnect to recover relationship with
>> ZooKeeper...
>>
>> [zkCallback-3-thread-4] WARN
>> org.apache.solr.common.cloud.DefaultConnectionStrategy - Connection expired
>> - starting a new one...
>>
>> [zkCallback-3-thread-4] INFO org.apache.zookeeper.ZooKeeper - Initiating
>> client connection, connectString=kemp-formation-solr:2181
>> sessionTimeout=60000
>> watcher=org.apache.solr.common.cloud.ConnectionManager@7a5c701e
>>
>> [zkCallback-3-thread-4-SendThread(kemp-formation-solr.citya.local:2181)]
>> INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>>
>> [zkCallback-11-thread-2-SendThread(kemp-formation-solr.citya.local:2181)]
>> INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server
>> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>>
>> [zkCallback-3-thread-4-SendThread(kemp-formation-solr.citya.local:2181)]
>> INFO org.apache.zookeeper.ClientCnxn - Socket connection established to
>> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>>
>> [zkCallback-11-thread-2-SendThread(kemp-formation-solr.citya.local:2181)]
>> INFO org.apache.zookeeper.ClientCnxn - Socket connection established to
>> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>>
>> [Thread-490] INFO org.eclipse.jetty.server.ServerConnector - Stopped
>> ServerConnector@2a640157{HTTP/1.1}{0.0.0.0:8345}
>>
>> [zkCallback-3-thread-4-SendThread(kemp-formation-solr.citya.local:2181)]
>> INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on
>> server kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid =
>> 0x2000000b80d0049, negotiated timeout = 40000
>>
>> [zkCallback-11-thread-2-SendThread(kemp-formation-solr.citya.local:2181)]
>> INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on
>> server kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid =
>> 0xff00000201970045, negotiated timeout = 40000
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at java.util.HashMap.newNode(HashMap.java:1747)
>>
>>         at java.util.HashMap.putVal(HashMap.java:631)
>>
>>         at java.util.HashMap.put(HashMap.java:612)
>>
>>         at jcifs.util.transport.Transport.sendrecv(Transport.java:66)
>>
>>         at jcifs.smb.SmbTransport.send(SmbTransport.java:661)
>>
>>         at jcifs.smb.SmbSession.send(SmbSession.java:238)
>>
>>         at jcifs.smb.SmbTree.send(SmbTree.java:119)
>>
>>         at jcifs.smb.SmbFile.send(SmbFile.java:776)
>>
>>         at
>> jcifs.smb.SmbFileInputStream.readDirect(SmbFileInputStream.java:181)
>>
>>         at jcifs.smb.SmbFileInputStream.read(SmbFileInputStream.java:142)
>>
>>         at
>> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:903)
>>
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
>>
>> [zkCallback-11-thread-2] INFO
>> org.apache.solr.common.cloud.ConnectionManager - Connection with ZooKeeper
>> reestablished.
>>
>> [zkCallback-3-thread-4] INFO
>> org.apache.solr.common.cloud.ConnectionManager - Connection with ZooKeeper
>> reestablished.
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>> [zkCallback-11-thread-2] INFO
>> org.apache.solr.common.cloud.DefaultConnectionStrategy - Reconnected to
>> ZooKeeper
>>
>> [zkCallback-11-thread-2] INFO
>> org.apache.solr.common.cloud.ConnectionManager - Connected:true
>>
>> [zkCallback-3-thread-4] INFO
>> org.apache.solr.common.cloud.DefaultConnectionStrategy - Reconnected to
>> ZooKeeper
>>
>> [zkCallback-3-thread-4] INFO
>> org.apache.solr.common.cloud.ConnectionManager - Connected:true
>>
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session:
>> 0x2000000b80d0046 closed
>>
>> [zkCallback-21-thread-2] WARN
>> org.apache.solr.common.cloud.ConnectionManager - Watcher
>> org.apache.solr.common.cloud.ConnectionManager@381a7557 name:
>> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent
>> state:Disconnected type:None path:null path: null type: None
>>
>> [zkCallback-21-thread-2] WARN
>> org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected
>>
>> [Thread-7538-EventThread] INFO org.apache.zookeeper.ClientCnxn -
>> EventThread shut down for session: 0x2000000b80d0046
>>
>> agents process ran out of memory - shutting down
>>
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>
>>         at java.util.regex.Matcher.<init>(Matcher.java:225)
>>
>>         at java.util.regex.Pattern.matcher(Pattern.java:1093)
>>
>>         at
>> de.l3s.boilerpipe.util.UnicodeTokenizer.tokenize(UnicodeTokenizer.java:40)
>>
>>         at
>> de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.flushBlock(BoilerpipeHTMLContentHandler.java:296)
>>
>>         at
>> de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:198)
>>
>>         at
>> org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
>>
>>         at
>> org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
>>
>>         at
>> org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.xpath.MatchingContentHandler.characters(MatchingContentHandler.java:85)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
>>         at
>> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>
