lucene-dev mailing list archives

From "Tim Smith (JIRA)" <>
Subject [jira] Commented: (LUCENE-1703) Add a waitForMerges() method to IndexWriter
Date Fri, 19 Jun 2009 21:46:07 GMT


Tim Smith commented on LUCENE-1703:

NOTE: I'm always using autoCommit=false (autoCommit=true is deprecated anyway).

However, I could potentially have two threads feeding the index (in my custom code).
One thread may call addDocument() (or maybeMerge(), to be more to the point).
That call could cause the SerialMergeScheduler to start merging (addDocument() won't
return until this merge completes).
I then want thread 2 to call waitForMerges(), which would block until the first thread
has finished its merges (at which point addDocument() will have returned).

Obviously this is a contrived example, as I personally will be locking the updates such that
no addDocument() call can be in progress when I want to call waitForMerges(). However, this
situation points out that even the SerialMergeScheduler should have an actual implementation
of a sync() method, which would block until the thread actually doing the merge has completed
(as I may be calling sync() from a thread other than the one on which the IndexWriter called
merge()). SerialMergeScheduler should therefore have a lock that is held while merging,
and a sync() method should be added that just acquires and releases that lock. Making both
the sync() and merge() methods on the SerialMergeScheduler synchronized would achieve this
(and sync() would just be a synchronized no-op).
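
To make the locking idea concrete, here is a minimal sketch. It is not Lucene's SerialMergeScheduler: SerialSchedulerSketch and SerialSchedulerDemo are hypothetical stand-ins, and the merge is represented by a plain Runnable. It only illustrates how a synchronized no-op sync() would block a second thread until a merge running on another thread has returned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a serial merge scheduler (not the Lucene class).
class SerialSchedulerSketch {
    // Runs the merge on the calling thread while holding this object's monitor.
    public synchronized void merge(Runnable mergeWork) {
        mergeWork.run();
    }

    // Synchronized no-op: returns only once no merge() call holds the monitor.
    public synchronized void sync() {
    }
}

class SerialSchedulerDemo {
    // Returns the observed events in the order they happened.
    static List<String> run() {
        final SerialSchedulerSketch scheduler = new SerialSchedulerSketch();
        final List<String> events = Collections.synchronizedList(new ArrayList<String>());
        final CountDownLatch mergeStarted = new CountDownLatch(1);

        // Thread 1: plays the role of addDocument() triggering a serial merge.
        Thread feeder = new Thread(() -> scheduler.merge(() -> {
            events.add("merge started");
            mergeStarted.countDown();
            try {
                Thread.sleep(200); // simulated merge work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            events.add("merge finished");
        }));
        feeder.start();

        try {
            mergeStarted.await(); // the merge is now in progress on thread 1
            scheduler.sync();     // thread 2 blocks here until merge() returns
            events.add("sync returned");
            feeder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<String>(events);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch "sync returned" can only be recorded after "merge finished", because sync() cannot enter the monitor while merge() still holds it.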

It seems more natural to me to put this "sync" on the IndexWriter itself, especially as this
will be completely agnostic to the merge scheduler used.

As for the "periodic" waiting for merge threads to complete, this would be driven by messages
from client code requesting, say, a "soft optimize", which would just wait for background merges
to complete. The application could then turn over a new IndexReader for more efficient searches
than with the old IndexReader (which may be more segmented). This "soft optimize" message
could be sent on some schedule in order to achieve better search performance
(without the cost of an explicit optimize).
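
A rough sketch of such a "soft optimize" handler follows. Everything in it is an assumption made for illustration: SoftOptimizer is a hypothetical helper, the waitForMerges Runnable stands in for the proposed IndexWriter.waitForMerges() call, and the Supplier stands in for however the application opens a fresh IndexReader.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper, not part of Lucene. R is the application's reader type.
class SoftOptimizer<R> {
    private final Runnable waitForMerges;  // stand-in for the proposed waitForMerges()
    private final Supplier<R> openReader;  // stand-in for opening a new reader
    private final AtomicReference<R> current = new AtomicReference<R>();

    SoftOptimizer(Runnable waitForMerges, Supplier<R> openReader) {
        this.waitForMerges = waitForMerges;
        this.openReader = openReader;
        this.current.set(openReader.get());
    }

    // The reader that searches should currently use.
    R reader() {
        return current.get();
    }

    // Waits for background merges, then swaps in a fresh reader.
    // Returns the previous reader so the caller can close it when safe.
    R softOptimize() {
        waitForMerges.run();
        R fresh = openReader.get();
        return current.getAndSet(fresh);
    }
}
```

The client message handler would call softOptimize() and close the returned reader once in-flight searches have finished. For the scheduled variant, the same call could be submitted to a java.util.concurrent.ScheduledExecutorService.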

Discussion is all well and good, and I definitely appreciate all the comments.
Even if this doesn't end up going in, you've pointed out another solution (using expungeDeletes())
that will achieve the same result, for me at least.

> Add a waitForMerges() method to IndexWriter
> -------------------------------------------
>                 Key: LUCENE-1703
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4
>            Reporter: Tim Smith
>         Attachments:,
> It would be very useful to have a waitForMerges() method on the IndexWriter.
> Right now, the only way I can see to achieve this is to call IndexWriter.close().
> Ideally, there would be a method on the IndexWriter to wait for merges without actually
> closing the index.
> This would make it possible to wait for background merges (or optimize) without
> closing the IndexWriter and then opening a new IndexWriter.
> The close()-and-reopen approach can be problematic if the close() fails, as the
> write lock won't be released.
> This could then result in the following sequence:
> * close() - fails
> * force unlock the write lock (per close() documentation)
> * new IndexWriter() (acquires write lock)
> * finalize() on old IndexWriter releases the write lock
> * Index is now not locked, and another IndexWriter pointing to the same directory could
>   be opened
> If you don't force unlock the write lock, opening a new IndexWriter will fail until garbage
> collection calls finalize() on the old IndexWriter.
> If the waitForMerges() method is available, I would likely never need to close() the
> IndexWriter until right before the process is shut down, so this issue would not occur (worst
> case scenario, the waitForMerges() fails).
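
The sequence quoted above can be replayed with a toy model. This is only a sketch under stated assumptions: ToyWriteLock and LockSequenceDemo are hypothetical, they do not use Lucene's lock classes, and release() is assumed to be unconditional (no ownership check), which is the behaviour the issue description implies.

```java
// Toy model of an index write lock; hypothetical, not Lucene's Lock class.
class ToyWriteLock {
    private boolean held;

    // Returns true if the lock was free and is now held.
    synchronized boolean obtain() {
        if (held) {
            return false;
        }
        held = true;
        return true;
    }

    // Unconditional release: no check of who currently owns the lock.
    synchronized void release() {
        held = false;
    }

    synchronized boolean isLocked() {
        return held;
    }
}

class LockSequenceDemo {
    // Replays the steps from the issue; returns true if two writers end up
    // believing they hold the write lock for the same directory.
    static boolean twoWritersCanCoexist() {
        ToyWriteLock lock = new ToyWriteLock();
        lock.obtain();                               // old writer holds the lock
        // close() fails here, so the old writer never releases the lock
        lock.release();                              // force unlock
        boolean replacementLocked = lock.obtain();   // new IndexWriter acquires it
        lock.release();                              // finalize() on the old writer
        boolean intruderLocked = lock.obtain();      // another writer also succeeds
        return replacementLocked && intruderLocked;
    }

    public static void main(String[] args) {
        System.out.println(twoWritersCanCoexist());
    }
}
```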

