Date: Fri, 23 Apr 2010 13:38:27 -0700 (PDT)
From: Otis Gospodnetic
Subject: Re: Best way to prevent this search lockup (apparently caused during big segment merges)?
To: solr-user@lucene.apache.org

Chris,

It looks like Mike already offered several solutions... though I don't know what Solr does without looking at the code.

But I'm curious:
* how big is your index? and do you know how large the segments being merged are?
* do you batch docs or do you make use of Streaming SolrServer?

I'm curious, because I've never encountered this problem before...

Thanks,
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Chris Harris
> To: solr-user@lucene.apache.org
> Sent: Thu, April 22, 2010 6:28:29 PM
> Subject: Best way to prevent this search lockup (apparently caused during big segment merges)?
>
> I'm running Solr 1.4+ under Tomcat 6, with indexing and searching requests
> simultaneously hitting the same Solr machine.
> Sometimes Solr, Tomcat, and my (C#) indexing process conspire to render
> search inoperable. So far I've only noticed this while big segment merges
> (i.e. merges that take multiple minutes) are taking place. Let me explain
> the situation as best as I understand it.
>
> My indexer has a main loop that looks roughly like this:
>
>   while true:
>     try:
>       submit a new add or delete request to Solr via HTTP
>     catch timeoutException:
>       sleep a few seconds
>
> When things are going wrong (i.e., when a large segment merge is
> happening), this loop is problematic:
>
> * When the indexer's request hits Solr, then the corresponding thread in
>   Tomcat blocks. (It looks to me like the thread is destined to block until
>   the entire merge is complete. I'll paste in what the Java stack traces
>   look like at the end of the message if they can help diagnose things.)
> * Because the Solr thread stays blocked for so long, eventually the indexer
>   hits a timeoutException. (That is, it gives up on Solr.)
> * Hitting the timeout exception doesn't cause the corresponding Tomcat
>   thread to die or unblock. Therefore, each time through the loop, another
>   Solr-handling thread inside Tomcat enters a blocked state.
> * Eventually so many threads (maxThreads, whose Tomcat default is 200) are
>   blocked that Tomcat starts rejecting all new Solr HTTP requests --
>   including those coming in from the web tier.
> * Users are unable to search. The problem might self-correct once the merge
>   is complete, but that could be quite a while.
>
> What are my options for changing Solr settings or changing my indexing
> process to avoid this lockup scenario? Do you agree that the segment merge
> is helping cause the lockup? Do adds and deletes really need to block on
> segment merges?
>
> Partial thread dumps follow, showing example add and delete threads that
> are blocked. Also the active Lucene Merge Thread, and the thread that
> kicked off the merge.
>
> [doc deletion thread, waiting for DirectUpdateHandler2.iwCommit.lock() to return]
> "http-1800-200" daemon prio=6 tid=0x000000000a58cc00 nid=0x1028 waiting on condition [0x000000000f9ae000..0x000000000f9afa90]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x000000016d801ae0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
>     at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(Unknown Source)
>     at org.apache.solr.update.DirectUpdateHandler2.deleteByQuery(DirectUpdateHandler2.java:320)
>     at org.apache.solr.update.processor.RunUpdateProcessor.processDelete(RunUpdateProcessorFactory.java:71)
>     at org.apache.solr.handler.XMLLoader.processDelete(XMLLoader.java:234)
>     at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:180)
>     at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>     at java.lang.Thread.run(Unknown Source)
>
> [doc adding thread, waiting for DirectUpdateHandler2.iwAccess.lock() to return]
> "http-1800-70" daemon prio=6 tid=0x0000000007946400 nid=0x590 waiting on condition [0x000000000d76e000..0x000000000d76f910]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x000000016d801ae0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(Unknown Source)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(Unknown Source)
>     at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(Unknown Source)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:211)
>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
>     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:118)
>     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:123)
>     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:192)
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>     at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>     at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>     at java.lang.Thread.run(Unknown Source)
>
> All 200 front-line Solr Tomcat threads, from http-1800-1 through
> http-1800-200, are blocked in one of these two ways.
>
> [The thread that kicked off the commit/merge. Note that
> DirectUpdateHandler2.commit() calls iwCommit.lock(), and seems not to
> release the iwCommit lock until DIH2.commit() returns.]
> "pool-15-thread-1" prio=6 tid=0x000000000abc3c00 nid=0x1094 in Object.wait() [0x000000000b6ff000..0x000000000b6ff910]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>     at java.lang.Object.wait(Native Method)
>     - waiting on <0x00000001b1ba84a0> (a org.apache.solr.update.SolrIndexWriter)
>     at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:5351)
>     - locked <0x00000001b1ba84a0> (a org.apache.solr.update.SolrIndexWriter)
>     at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:3479)
>     - locked <0x00000001b1ba84a0> (a org.apache.solr.update.SolrIndexWriter)
>     at org.apache.lucene.index.IndexWriter.finishMerges(IndexWriter.java:3463)
>     - locked <0x00000001b1ba84a0> (a org.apache.solr.update.SolrIndexWriter)
>     at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:2200)
>     at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2153)
>     at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2117)
>     at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:230)
>     at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:181)
>     at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:409)
>     at org.apache.solr.update.DirectUpdateHandler2$CommitTracker.run(DirectUpdateHandler2.java:602)
>     - locked <0x000000016d801760> (a org.apache.solr.update.DirectUpdateHandler2$CommitTracker)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>     at java.util.concurrent.FutureTask.run(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
>
> [The only active Lucene merge thread]
> "Lucene Merge Thread #0" daemon prio=6 tid=0x000000000930ac00 nid=0xb38 runnable [0x000000000a41e000..0x000000000a41f790]
>    java.lang.Thread.State: RUNNABLE
>     at java.io.RandomAccessFile.readBytes(Native Method)
>     at java.io.RandomAccessFile.read(Unknown Source)
>     at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(SimpleFSDirectory.java:132)
>     - locked <0x00000001b443e588> (a org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor)
>     at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
>     at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>     at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
>     at org.apache.lucene.index.SegmentTermPositions.readDeltaPosition(SegmentTermPositions.java:73)
>     at org.apache.lucene.index.SegmentTermPositions.nextPosition(SegmentTermPositions.java:69)
>     at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:707)
>     at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:648)
>     at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:586)
>     at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:154)
>     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5029)
>     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4614)
>     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
>     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
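
On the indexer side, one possible mitigation for the loop described in the quoted message is to back off exponentially after each timeout instead of sleeping a fixed few seconds, so that far fewer abandoned requests pile up in Tomcat while a long merge runs. What follows is only a minimal sketch under stated assumptions, not something proposed in this thread: `SolrTimeout`, `run_indexer`, and the `send` callable are hypothetical names standing in for the real (C#) indexer's HTTP client, and it assumes that resending a timed-out add or delete is safe.

```python
import time
from typing import Callable, Iterable


class SolrTimeout(Exception):
    """Hypothetical stand-in for the indexer's HTTP timeout exception."""


def run_indexer(
    requests: Iterable[str],
    send: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    base: float = 2.0,
    cap: float = 300.0,
) -> int:
    """Submit each request, retrying with exponential backoff on timeout.

    `send` is a hypothetical callable that posts one add or delete request
    to Solr and raises SolrTimeout when the client gives up waiting.
    Returns the total number of timeouts seen.
    """
    timeouts = 0
    for request in requests:
        delay = base  # the delay resets for every new request
        while True:
            try:
                send(request)
                break  # request accepted, move on to the next one
            except SolrTimeout:
                # The timed-out request may still occupy a server thread,
                # so wait longer each time before sending another one.
                timeouts += 1
                sleep(delay)
                delay = min(delay * 2, cap)
    return timeouts
```

With a fixed sleep, the number of blocked server threads grows linearly with the length of the merge; with doubling delays it grows roughly logarithmically until the cap is reached. This does not address the underlying lock contention inside Solr, it only limits how quickly the indexer can exhaust maxThreads.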