lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Closing IndexWriter can be very slow on large indexes
Date Mon, 01 Aug 2011 12:04:21 GMT
On Mon, Aug 1, 2011 at 12:57 AM, kiwi clive <kiwi_clive@yahoo.com> wrote:
> Hi Mike,
>
> The problem was due to close(). A shutdown was calling close(), which seems to cause
> Lucene to perform a merge. For a busy, very large index (with lots of deletes and updates),
> the merge process could take a very long time to complete (hours). Calling close(false) solved
> the problem, as this appears to close the index without performing the merge. At least that
> is my understanding of things!
>

Passing false to IW#close(boolean) prevents the close call from
blocking on merges. If background merges are in flight, those merges
will be performed nevertheless. This will not corrupt your index, but
you will have dead files lurking in your index directory if you
shut down your app and the background threads are killed.

Basically, if you call close, IW flushes its internal RAM buffer to
disk, creating one new segment (Lucene 3.x) or possibly multiple new
segments (Lucene 4.0). This flush can take some time, and it can
also trigger a new merge. Passing false to IW#close(boolean) also
prevents the IW from kicking off a new merge due to the flushed
segment(s).
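
To make the blocking vs. non-blocking difference concrete, here is a
minimal toy model in plain Java. It is not Lucene code and does not
reflect Lucene's internals: ToyWriter, startMerge and mergeFinished are
hypothetical names made up for this sketch, and a thread pool stands in
for the background merge threads. It only mirrors the behaviour
described above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model only (assumption: NOT Lucene's implementation).
// A single worker thread plays the role of a background merge thread.
class ToyWriter {
    private final ExecutorService mergeThreads = Executors.newSingleThreadExecutor();
    private final AtomicBoolean mergeFinished = new AtomicBoolean(false);

    // Start a pretend background merge that needs mergeMillis to complete.
    void startMerge(long mergeMillis) {
        mergeThreads.submit(() -> {
            try {
                Thread.sleep(mergeMillis);
                mergeFinished.set(true);
            } catch (InterruptedException e) {
                // The merge thread was killed before finishing.
                Thread.currentThread().interrupt();
            }
        });
    }

    // waitForMerges == true:  block until the running merge has finished.
    // waitForMerges == false: return right away; the merge already in
    //                         flight keeps running in the background.
    void close(boolean waitForMerges) {
        mergeThreads.shutdown(); // no new merges are accepted from here on
        if (waitForMerges) {
            try {
                mergeThreads.awaitTermination(1, TimeUnit.HOURS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    boolean mergeFinished() {
        return mergeFinished.get();
    }
}
```

With this model, close(true) returns only once the pretend merge is
done, while close(false) returns at once and leaves the merge running.
Killing that thread before it completes corresponds to the "dead files"
case mentioned above.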

simon
>
> Clive
>
>
>
> ----- Original Message -----
> From: Michael McCandless <lucene@mikemccandless.com>
> To: java-user@lucene.apache.org
> Cc:
> Sent: Tuesday, July 26, 2011 5:30 PM
> Subject: Re: Closing IndexWriter can be very slow on large indexes
>
> Which method (abort or close) do you see taking so much time?
>
> It's odd, because IW.abort should quickly stop any running BG merges.
>
> Can you get a dump of the thread stacks during this long abort/close
> and post that back?
>
> Can't answer if Lucene 3.x will improve this situation until we find
> the source of the slowness...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Tue, Jul 26, 2011 at 11:33 AM, Chris Bamford
> <chris.bamford@talktalk.net> wrote:
>> Hi
>>
>> I think I must be doing something wrong, but not sure what.
>>
>> I have some long-running indexing code which sometimes needs to be shut down in a
>> hurry. To achieve this, I set a shutdown flag which causes it to break from the loop and
>> call first abort() and then close(). The problem is that with a large index (say, 15Gb)
>> in Lucene 2.3.2, it can take over an hour. (Yes, I know I should be on a later version of
>> Lucene, but that's another issue - we are stuck with this for now!).
>>
>> The IW is opened in autoCommit mode and mergeFactor=10.
>>
>> During this closedown stage, the indexes are being constantly updated by Lucene itself,
>> making me suspect it could be merging.
>>
>> Firstly, can someone explain what it is doing under the covers that takes so long?
>> (And any action I can take to get around it)
>>
>> Second, if I were to rebuild the code with say, Lucene 3 and run it in compatibility
>> mode with the 2.3.2 indexes, would I have a richer set of tools I could use to overcome the
>> issue?
>>
>> Thanks,
>>
>> - Chris
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
