lucene-solr-user mailing list archives

From Jan Høydahl <jan....@cominvent.com>
Subject Re: ExtractingRequestHandler causes Out of Memory Error
Date Thu, 27 Sep 2012 21:46:18 GMT
Please try to increase -Xmx and see how much RAM you need for it to succeed.

I believe this is simply a case where this particular file needs about double its size
in memory (480 MB) to parse, and you have only allocated 1 GB of heap (which is not
particularly much). Perhaps the code could be optimized to avoid the Arrays.copyOf() call.
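
To make the arithmetic concrete, here is a rough sketch of the estimate. It assumes the file has one byte per character, that the JVM stores each char as 2 bytes in a char[], and that the StringBuilder roughly doubles its backing array when it is full, so the old and the new array are both live during Arrays.copyOf(). The class and method names are made up for illustration and are not part of Solr or Tika.

```java
public class HeapEstimate {
    // Assumption: each Java char occupies 2 bytes in the StringBuilder's char[].
    static final long BYTES_PER_CHAR = 2;

    /** Bytes needed to hold the given number of characters in a char[] buffer. */
    static long bufferBytes(long chars) {
        return chars * BYTES_PER_CHAR;
    }

    /**
     * Worst-case bytes live while a full buffer of the given size is grown:
     * the old array plus a new array of roughly twice the capacity.
     */
    static long peakBytesDuringGrowth(long chars) {
        long oldArray = bufferBytes(chars);
        long newArray = bufferBytes(chars * 2);
        return oldArray + newArray;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileChars = 240L * mb; // 240MB file, assumed 1 byte per character

        System.out.println("Text buffer alone (MB): " + bufferBytes(fileChars) / mb);
        System.out.println("Worst-case peak during growth (MB): "
                + peakBytesDuringGrowth(fileChars) / mb);
        System.out.println("Max heap of this JVM (MB): "
                + Runtime.getRuntime().maxMemory() / mb);
    }
}
```

Under these assumptions the text buffer alone comes to 480 MB, and a single growth step can briefly need considerably more than that, which is why a 1 GB heap shared with the rest of Solr can run out.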

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 27 Sep 2012, at 11:22, Shigeki Kobayashi <shigeki.kobayashi3@g.softbank.co.jp> wrote:

> Hi guys,
> 
> 
> I use Manifold CF to crawl files on a Windows file server and index them into
> Solr using the ExtractingRequestHandler.
> Most of the documents are successfully indexed, but some fail with an Out
> of Memory Error in Solr, so I need some advice.
> 
> The files that failed are not very big: a CSV file of 240MB and a
> text file of 170MB.
> 
> Here is the environment and machine spec:
> Solr 3.6 (also Solr4.0Beta)
> Tomcat 6.0
> CentOS 5.6
> java version 1.6.0_23
> HDD 60GB
> MEM 2GB
> JVM Heap: -Xmx1024m -Xms1024m
> 
> I feel there is enough memory for Solr to extract and index the
> file content.
> 
> Here is the Solr log:
> ------
> [solr.servlet.SolrDispatchFilter]-[http-8080-8]-:java.lang.OutOfMemoryError: Java heap space
>        at java.util.Arrays.copyOf(Arrays.java:2882)
>        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
>        at java.lang.StringBuilder.append(StringBuilder.java:189)
>        at org.apache.solr.handler.extraction.SolrContentHandler.characters(SolrContentHandler.java:293)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>        at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>        at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>        at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
>        at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
>        at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
>        at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
>        at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:268)
>        at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:134)
>        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:227)
>        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)
>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
>        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>        at filters.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:122)
>        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> 
> -----
> 
> Does anyone have any ideas?
> 
> Regards,
> 
> Shigeki

