manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Query about content of the file
Date Tue, 22 Jul 2014 17:37:52 GMT
Hi Ameya,

This is not a ManifoldCF question.  Please either increase your Solr
memory, or post to the Solr list.

Karl
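
For anyone following the thread, a quick way to see how much heap a JVM actually gets on a given machine is a small check like the one below. This is only a sketch: the class name HeapCheck is hypothetical, and how the heap option is passed to Solr itself depends on how your Solr instance is launched, which this thread does not show.

```java
public class HeapCheck {
    // Maximum heap the running JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Compare the output with and without an explicit -Xmx setting.
        System.out.println("Max heap (MB): " + maxHeapMb());
    }
}
```

Running it once as `java HeapCheck` and once as `java -Xmx1g HeapCheck` (the 1g value is just an illustration) shows the default limit next to an explicit one.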


On Tue, Jul 22, 2014 at 1:33 PM, Ameya Aware <ameya.aware@gmail.com> wrote:

> So what could be the fix for this?
>
>
> On Tue, Jul 22, 2014 at 12:25 PM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Thanks for the suggestion, Peter.  However, the memory error is occurring
>> on Solr, not MCF.
>>
>>
>> Karl
>>
>> Sent from my Windows Phone
>> ------------------------------
>> From: Peter Choe
>> Sent: 7/22/2014 12:23 PM
>> To: user@manifoldcf.apache.org
>> Subject: RE: Query about content of the file
>>
>>   You can modify options.env.unix or options.env.win to set the heap size.
>>
>>
>>
>> The default setting is not high enough.
>>
>>
>>
>> Peter Choe
>>
>>
>>
>> *From:* Ameya Aware [mailto:ameya.aware@gmail.com]
>> *Sent:* Tuesday, July 22, 2014 12:04 PM
>> *To:* user@manifoldcf.apache.org
>> *Subject:* Re: Query about content of the file
>>
>>
>>
>> Hi Karl,
>>
>>
>>
>> I was getting many TikaException errors at first, so I ignored them by
>> setting that field in solrconfig.xml. After that, crawling went smoothly.
>>
>>
>>
>> But now I have run into a Java heap space issue. Please see the log below.
>>
>>
>>
>>
>>
>> ERROR - 2014-07-22 11:38:59.370; org.apache.solr.common.SolrException;
>> null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:790)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:439)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
>>
>>             at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>>
>>             at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>>
>>             at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>>
>>             at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>>
>>             at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>>
>>             at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>>
>>             at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
>>
>>             at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>>
>>             at
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>>
>>             at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>>
>>             at
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>>
>>             at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>>
>>             at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>>
>>             at org.eclipse.jetty.server.Server.handle(Server.java:368)
>>
>>             at
>> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
>>
>>             at
>> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
>>
>>             at
>> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
>>
>>             at
>> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
>>
>>             at
>> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:636)
>>
>>             at
>> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
>>
>>             at
>> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
>>
>>             at
>> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
>>
>>             at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>>
>>             at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>>
>>             at java.lang.Thread.run(Unknown Source)
>>
>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:567)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:646)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:240)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:153)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:409)
>>
>>             at
>> org.apache.solr.update.TransactionLog.write(TransactionLog.java:353)
>>
>>             at org.apache.solr.update.UpdateLog.add(UpdateLog.java:397)
>>
>>             at org.apache.solr.update.UpdateLog.add(UpdateLog.java:382)
>>
>>             at
>> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:255)
>>
>>             at
>> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
>>
>>             at
>> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>>
>>             at
>> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>>
>>             at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:704)
>>
>>             at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:858)
>>
>>             at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:557)
>>
>>             at
>> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
>>
>>             at
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:121)
>>
>>             at
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:126)
>>
>>             at
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
>>
>>             at
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>
>>             at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>
>>             at
>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
>>
>>             at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
>>
>>             at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>>
>>             at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>>
>>             at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>>
>>             at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>>
>>             at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>>
>>             at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>>
>>
>>
>> WARN  - 2014-07-22 11:38:59.479;
>> org.eclipse.jetty.servlet.ServletHandler; Error for
>> /solr/collection1/update/extract
>>
>> java.lang.OutOfMemoryError: Java heap space
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:567)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:646)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:240)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:153)
>>
>>             at
>> org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:409)
>>
>>             at
>> org.apache.solr.update.TransactionLog.write(TransactionLog.java:353)
>>
>>             at org.apache.solr.update.UpdateLog.add(UpdateLog.java:397)
>>
>>             at org.apache.solr.update.UpdateLog.add(UpdateLog.java:382)
>>
>>             at
>> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:255)
>>
>>             at
>> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
>>
>>             at
>> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>>
>>             at
>> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>>
>>             at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:704)
>>
>>             at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:858)
>>
>>             at
>> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:557)
>>
>>             at
>> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
>>
>>             at
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:121)
>>
>>             at
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:126)
>>
>>             at
>> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
>>
>>             at
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>
>>             at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>
>>             at
>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
>>
>>             at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
>>
>>             at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
>>
>>             at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>>
>>             at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>>
>>             at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>>
>>             at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>>
>>             at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>>
>>             at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>>
>>
>>
>>
>>
>> Can you advise me on how I can fix this?
>>
>>
>>
>>
>>
>> Thanks,
>> Ameya
>>
>>
>>
>> On Mon, Jul 21, 2014 at 7:11 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>> Hi Ameya,
>>
>> We have never, even under the wildest circumstances, considered the need to
>> prevent the actual content of a file from being indexed.
>>
>> If you are indexing into Solr, and the thing that is failing is content
>> extraction (and it is aborting your job), then please be aware there is a
>> way in Solr to ignore this error.  Please search this list and you will see
>> it posted numerous times.
>>
>> Karl
>>
>>
>>
>> On Mon, Jul 21, 2014 at 10:51 AM, Ameya Aware <ameya.aware@gmail.com>
>> wrote:
>>
>> Hi
>>
>>
>>
>> How can I avoid sending the content of the file to Solr?
>>
>>
>>
>> I do not want the content of the file to be sent to Solr and indexed,
>> because indexing the content is causing lots of errors.
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Ameya
>>
>>
>>
>>
>>
>
>
