Date: Mon, 1 Aug 2011 14:04:21 +0200
Subject: Re: Closing IndexWriter can be very slow on large indexes
From: Simon Willnauer
To: java-user@lucene.apache.org, kiwi clive

On Mon, Aug 1, 2011 at 12:57 AM, kiwi clive wrote:
> Hi Mike,
>
> The problem was due to close(). A shutdown was calling close(), which seems to cause Lucene to perform a merge. For a busy, very large index (with lots of deletes and updates), the merge process could take a very long time to complete (hours). Calling close(false) solved the problem, as this appears to close the index without performing the merge. At least that is my understanding of things!
>

Passing false to IW#close(boolean) prevents the close call from blocking on merges. Background merges that are already in flight will still run. This will not corrupt your index, but if you shut down your app and the background threads are killed, dead files will be left lurking in your index directory.

Basically, when you call close, IW flushes its internal RAM buffer to disk, creating one new segment (Lucene 3.x) or possibly multiple new segments (Lucene 4.0). This flush can take some time, and it can also trigger a new merge. Passing false to IW#close(boolean) also prevents the IW from kicking off a new merge due to the flushed segment(s).

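To make the blocking vs. non-blocking behaviour concrete, here is a minimal toy model in plain Java. It is not Lucene code: ToyWriter, its constructor argument and mergeFinished() are hypothetical names invented for illustration, and the "merge" is just a sleeping daemon thread. It only mimics the one point above, that close(false) returns without waiting for an in-flight background merge while close(true) blocks until the merge is done. It does not model the flush or the suppression of newly triggered merges.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model only: this is NOT Lucene code. ToyWriter and its methods are
// hypothetical names that mimic the close(boolean) semantics described above.
class ToyWriter {
    private final AtomicBoolean mergeFinished = new AtomicBoolean(false);
    private final Thread mergeThread;

    // Starts one simulated background merge that takes mergeMillis to complete.
    ToyWriter(final long mergeMillis) {
        mergeThread = new Thread(() -> {
            try {
                Thread.sleep(mergeMillis);   // stands in for long merge I/O
                mergeFinished.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Daemon thread: it is killed when the JVM exits, like background
        // merge threads during an application shutdown.
        mergeThread.setDaemon(true);
        mergeThread.start();
    }

    // Same shape as close(boolean waitForMerges): only the wait is optional.
    void close(boolean waitForMerges) {
        if (waitForMerges) {
            try {
                mergeThread.join();          // block until the merge is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        // With waitForMerges == false we return at once; the merge thread
        // keeps running in the background until it finishes or is killed.
    }

    boolean mergeFinished() {
        return mergeFinished.get();
    }
}

class ToyWriterDemo {
    public static void main(String[] args) {
        ToyWriter slow = new ToyWriter(5000);
        slow.close(false);                   // does not wait for the 5 s merge
        System.out.println("close(false): merge finished = " + slow.mergeFinished());

        ToyWriter quick = new ToyWriter(200);
        quick.close(true);                   // waits for the 200 ms merge
        System.out.println("close(true): merge finished = " + quick.mergeFinished());
    }
}
```

In this toy the first writer's merge is still unfinished when the JVM exits, which is the situation that would leave partially written files behind in a real index directory.
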
simon

> Clive
>
>
>
> ----- Original Message -----
> From: Michael McCandless
> To: java-user@lucene.apache.org
> Cc:
> Sent: Tuesday, July 26, 2011 5:30 PM
> Subject: Re: Closing IndexWriter can be very slow on large indexes
>
> Which method (abort or close) do you see taking so much time?
>
> It's odd, because IW.abort should quickly stop any running BG merges.
>
> Can you get a dump of the thread stacks during this long abort/close
> and post that back?
>
> Can't answer if Lucene 3.x will improve this situation until we find
> the source of the slowness...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Tue, Jul 26, 2011 at 11:33 AM, Chris Bamford wrote:
>> Hi
>>
>> I think I must be doing something wrong, but not sure what.
>>
>> I have some long running indexing code which sometimes needs to be shutdown in a hurry. To achieve this, I set a shutdown flag which causes it to break from the loop and call first abort() and then close(). The problem is that with a large index (say, 15Gb) in Lucene 2.3.2, it can take over an hour. (Yes, I know I should be on a later version of Lucene, but that's another issue - we are stuck with this for now!).
>>
>> The IW is opened in autoCommit mode and mergeFactor=10.
>>
>> During this closedown stage, the indexes are being constantly updated by Lucene itself, making me suspect it could be merging.
>>
>> Firstly, can someone explain what it is doing under the covers that takes so long? (And any action I can take to get around it)
>>
>> Second, if I were to rebuild the code with say, Lucene 3 and run it in compatibility mode with the 2.3.2 indexes, would I have a richer set of tools I could use to overcome the issue?
>>
>> Thanks,
>>
>> - Chris
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org