lucene-solr-user mailing list archives

From Giovanni Fernandez-Kincade <gfernandez-kinc...@capitaliq.com>
Subject RE: Solr Timeouts
Date Mon, 05 Oct 2009 18:11:05 GMT
I just grabbed another stack trace for a thread that has been similarly blocked for over an
hour. Notice that there is no commit in this one:

http-8080-Processor67 [RUNNABLE] CPU time: 1:02:05
org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
org.apache.lucene.index.SegmentTermEnum.next()
org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
org.apache.lucene.index.TermInfosReader.get(Term, boolean)
org.apache.lucene.index.TermInfosReader.get(Term)
org.apache.lucene.index.SegmentTermDocs.seek(Term)
org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
org.apache.lucene.index.IndexWriter.applyDeletes()
org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
org.apache.lucene.index.IndexWriter.updateDocument(Term, Document, Analyzer)
org.apache.lucene.index.IndexWriter.updateDocument(Term, Document)
org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(AddUpdateCommand)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(SolrContentHandler, AddUpdateCommand)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(SolrContentHandler)
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(SolrQueryRequest, SolrQueryResponse, ContentStream)
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
java.lang.Thread.run()
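
For anyone comparing traces like these, here is a minimal Python sketch that scans a pasted stack trace for commit-related frames. The marker list is an assumption taken only from the frames that appear in the traces in this thread; it is not a complete list of Solr's commit paths, and the function name is hypothetical.

```python
# Frames that indicate an explicit commit path in the traces quoted in this
# thread. This marker list is an assumption drawn from those traces only.
COMMIT_MARKERS = (
    "RequestHandlerUtils.handleCommit",
    "RunUpdateProcessor.processCommit",
    "DirectUpdateHandler2.commit(",
)


def commit_frames(trace_text):
    """Return the frames in a pasted stack trace that match a commit marker."""
    matches = []
    for line in trace_text.splitlines():
        # Drop email quote prefixes such as ">> " before matching.
        frame = line.lstrip("> ").strip()
        if any(marker in frame for marker in COMMIT_MARKERS):
            matches.append(frame)
    return matches


# Abbreviated from the trace above (add path, no commit frames).
ADD_TRACE = """\
org.apache.lucene.index.IndexWriter.applyDeletes()
org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
org.apache.lucene.index.IndexWriter.updateDocument(Term, Document)
org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand)
"""

# Abbreviated from the first trace quoted in this thread (commit path).
COMMIT_TRACE = """\
>> org.apache.lucene.index.IndexWriter.applyDeletes()
>> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
>> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
>> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
"""

if __name__ == "__main__":
    print(len(commit_frames(ADD_TRACE)), "commit frames in the add trace")
    print(len(commit_frames(COMMIT_TRACE)), "commit frames in the commit trace")
```

Both traces pass through IndexWriter.applyDeletes(), so the check only separates the commit path from the add path; it says nothing about why applyDeletes itself is slow.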


-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: Monday, October 05, 2009 1:18 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Timeouts

OK... next step is to verify that SolrCell doesn't have a bug that
causes it to commit.
I'll try and verify today unless someone else beats me to it.

-Yonik
http://www.lucidimagination.com

On Mon, Oct 5, 2009 at 1:04 PM, Giovanni Fernandez-Kincade
<gfernandez-kincade@capitaliq.com> wrote:
> I'm fairly certain that all of the indexing jobs are calling SOLR with commit=false. They all construct the indexing URLs using a CLR function I wrote, which takes in a Commit parameter, which is always set to false.
>
> Also, I don't see any calls to commit in the Tomcat logs (whereas normally when I make a commit call I do).
>
> This suggests that Solr is doing it automatically, but the extract handler doesn't seem to be the problem:
>  <requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler" startup="lazy">
>    <lst name="defaults">
>      <str name="uprefix">ignored_</str>
>      <str name="map.content">fileData</str>
>    </lst>
>  </requestHandler>
>
>
> There is no external config file specified, and I don't see anything about commits here.
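
The same check of the handler defaults can be sketched in Python. This is only an illustration: the parameter names `commit` and `commitWithin` are the ones mentioned in this thread, the function name is hypothetical, and the XML is the handler definition quoted above.

```python
import xml.etree.ElementTree as ET

# The handler definition quoted in this message.
HANDLER_XML = """\
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
                startup="lazy">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
    <str name="map.content">fileData</str>
  </lst>
</requestHandler>
"""

# Parameter names to look for. Assumption: only the names raised in this
# thread are checked; other settings could matter too.
COMMIT_PARAMS = {"commit", "commitWithin"}


def commit_related_params(handler_xml):
    """Map each commit-related parameter to (list name, configured value)."""
    root = ET.fromstring(handler_xml)
    found = {}
    # Scan every <lst> child (defaults and any others that are present).
    for lst in root.findall("lst"):
        for param in lst:
            if param.get("name") in COMMIT_PARAMS:
                found[param.get("name")] = (lst.get("name"), param.text)
    return found


if __name__ == "__main__":
    print(commit_related_params(HANDLER_XML))
```

An empty result only confirms that this one handler sets none of the listed parameters; it does not rule out other sources of commits.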
>
> I've tried setting up more detailed indexer logging but haven't been able to get it to work:
> <infoStream file="c:\solr\indexer.log">true</infoStream>
>
> I tried relative and absolute paths, but no dice so far.
>
> Any other ideas?
>
> -Gio.
>
> -----Original Message-----
> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
> Sent: Monday, October 05, 2009 12:52 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Timeouts
>
>> This is what one of my SOLR requests looks like:
>>
>> http://titans:8080/solr/update/extract/?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0&resource.name=684936.txt&commit=false
>
> Have you verified that all of your indexing jobs (you said you had 4
> or 5) have commit=false?
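
One way to run this check over a list of request URLs (for example, pulled from an access log) is sketched below in Python. The function name is hypothetical, and treating any value other than "false" as suspect is a deliberately conservative assumption, not a description of how Solr parses the parameter.

```python
from urllib.parse import parse_qs, urlsplit


def may_request_commit(url):
    """Return True if the URL carries a commit parameter that is not 'false'."""
    # keep_blank_values=True so that a bare "commit=" is flagged, not dropped.
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    values = params.get("commit", [])
    return any(value.strip().lower() != "false" for value in values)


# The request URL quoted in this thread.
SAMPLE_URL = (
    "http://titans:8080/solr/update/extract/"
    "?literal.versionId=684936&literal.filingDate=1997-12-04T00:00:00Z"
    "&literal.formTypeId=95&literal.companyId=3567904&literal.sourceId=0"
    "&resource.name=684936.txt&commit=false"
)

if __name__ == "__main__":
    suspects = [url for url in [SAMPLE_URL] if may_request_commit(url)]
    print(len(suspects), "URL(s) may request a commit")
```

A URL with no commit parameter at all is not flagged here, since the check only looks at explicit values.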
>
> Also make sure that your extract handler doesn't have a default of
> something that could cause a commit - like commitWithin or something.
>
> -Yonik
> http://www.lucidimagination.com
>
>
>
> On Mon, Oct 5, 2009 at 12:44 PM, Giovanni Fernandez-Kincade
> <gfernandez-kincade@capitaliq.com> wrote:
>> Is there somewhere other than solrconfig.xml where the autoCommit feature could be enabled? I've looked through that file and found autoCommit to be commented out:
>>
>> <!--
>>  Perform a <commit/> automatically under certain conditions:
>>         maxDocs - number of updates since last commit is greater than this
>>         maxTime - oldest uncommitted update (in ms) is this long ago
>>    <autoCommit>
>>      <maxDocs>10000</maxDocs>
>>      <maxTime>1000</maxTime>
>>    </autoCommit>
>>  -->
>>
>> -----Original Message-----
>> From: Feak, Todd [mailto:Todd.Feak@smss.sony.com]
>> Sent: Monday, October 05, 2009 12:40 PM
>> To: solr-user@lucene.apache.org
>> Subject: RE: Solr Timeouts
>>
>>
>>
>> Actually, ignore my other response.
>>
>>
>>
>> I believe you are committing, whether you know it or not.
>>
>>
>>
>> This is in your provided stack trace:
>>
>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>>
>>
>>
>> I think Yonik gave you additional information for how to make it faster.
>>
>>
>>
>> -Todd
>>
>>
>>
>> -----Original Message-----
>>
>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kincade@capitaliq.com]
>>
>> Sent: Monday, October 05, 2009 9:30 AM
>>
>> To: solr-user@lucene.apache.org
>>
>> Subject: RE: Solr Timeouts
>>
>>
>>
>> I'm not committing at all actually - I'm waiting for all 6 million to be done.
>>
>>
>>
>> -----Original Message-----
>>
>> From: Feak, Todd [mailto:Todd.Feak@smss.sony.com]
>>
>> Sent: Monday, October 05, 2009 12:10 PM
>>
>> To: solr-user@lucene.apache.org
>>
>> Subject: RE: Solr Timeouts
>>
>>
>>
>> How often are you committing?
>>
>>
>>
>> Every time you commit, Solr will close the old index and open the new one. If you are doing this in parallel from multiple jobs (4-5 you mention) then eventually the server gets behind and you start to pile up commit requests. Once this starts to happen, it will cascade out of control if the rate of commits isn't slowed.
>>
>>
>>
>> -Todd
>>
>>
>>
>> ________________________________
>>
>> From: Giovanni Fernandez-Kincade [mailto:gfernandez-kincade@capitaliq.com]
>>
>> Sent: Monday, October 05, 2009 9:04 AM
>>
>> To: solr-user@lucene.apache.org
>>
>> Subject: Solr Timeouts
>>
>>
>>
>> Hi,
>>
>> I'm attempting to index approximately 6 million HTML/text files using SOLR 1.4/Tomcat6 on Windows Server 2003 x64. I'm running 64-bit Tomcat and JVM. I've fired up 4-5 different jobs that are making indexing requests using the ExtractingRequestHandler, and everything works well for about 30-40 minutes, after which all indexing requests start timing out. I profiled the server and found that all of the threads are getting blocked by this call to flush the Lucene index to disk (see below).
>>
>>
>>
>> This leads me to a few questions:
>>
>>
>>
>> 1. Is this normal?
>>
>> 2. Can I reduce the frequency with which this happens somehow? I've greatly increased the indexing options in solrconfig.xml (attached here) to no avail.
>>
>> 3. During these flushes, resource utilization (CPU, I/O, memory consumption) is significantly down compared to when requests are being handled. Is there any way to make indexing go faster? I have plenty of bandwidth on the machine.
>>
>>
>>
>> I appreciate any insight you can provide. We're currently using MS SQL 2005 as our full-text solution and are pretty much miserable. So far SOLR has been a great experience.
>>
>>
>>
>> Thanks,
>>
>> Gio.
>>
>>
>>
>> http-8080-Processor21 [RUNNABLE] CPU time: 9:51
>> java.io.RandomAccessFile.seek(long)
>> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
>> org.apache.lucene.store.BufferedIndexInput.refill()
>> org.apache.lucene.store.BufferedIndexInput.readByte()
>> org.apache.lucene.store.IndexInput.readVInt()
>> org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
>> org.apache.lucene.index.SegmentTermEnum.next()
>> org.apache.lucene.index.SegmentTermEnum.scanTo(Term)
>> org.apache.lucene.index.TermInfosReader.get(Term, boolean)
>> org.apache.lucene.index.TermInfosReader.get(Term)
>> org.apache.lucene.index.SegmentTermDocs.seek(Term)
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(IndexReader, int)
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(SegmentInfos)
>> org.apache.lucene.index.IndexWriter.applyDeletes()
>> org.apache.lucene.index.IndexWriter.doFlushInternal(boolean, boolean)
>> org.apache.lucene.index.IndexWriter.doFlush(boolean, boolean)
>> org.apache.lucene.index.IndexWriter.flush(boolean, boolean, boolean)
>> org.apache.lucene.index.IndexWriter.closeInternal(boolean)
>> org.apache.lucene.index.IndexWriter.close(boolean)
>> org.apache.lucene.index.IndexWriter.close()
>> org.apache.solr.update.SolrIndexWriter.close()
>> org.apache.solr.update.DirectUpdateHandler2.closeWriter()
>> org.apache.solr.update.DirectUpdateHandler2.commit(CommitUpdateCommand)
>> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(CommitUpdateCommand)
>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(UpdateRequestProcessor, SolrParams, boolean)
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.core.SolrCore.execute(SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.servlet.SolrDispatchFilter.execute(HttpServletRequest, SolrRequestHandler, SolrQueryRequest, SolrQueryResponse)
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(ServletRequest, ServletResponse, FilterChain)
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse)
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest, ServletResponse)
>> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>> org.apache.coyote.http11.Http11Processor.process(InputStream, OutputStream)
>> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(TcpConnection, Object[])
>> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(Socket, TcpConnection, Object[])
>> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(Object[])
>> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run()
>> java.lang.Thread.run()
>>
>>
>>
>>
>>
>>
>>
>
