lucene-dev mailing list archives

From Ning Li
Subject Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)
Date Thu, 06 Jul 2006 21:12:26 GMT
> Even with your code changes, to see the modification made using the
> IndexWriter, it must be closed, and a new IndexReader opened.

That behaviour remains the same.

> So a far simpler way is to get the collection of updates first, then
> using opened indexreader,
> for each doc in collection
>        delete document using "key"
> endfor
> open indexwriter
> for each doc in collection
>        add document
> endfor
> open indexreader

So, you are buffering the updates into large batches. This patch
improves performance for small batches.

There are several advantages in supporting deletes in IndexWriter:

1 Applications don't have to worry about how each of them should buffer
  inserts/deletes into large batches. IndexWriter takes care of that.
2 deleteDocuments(Term)/batchDeleteDocuments(Terms[]) supported by
  IndexWriter will be as general as deleteDocuments(Term) supported by
  IndexReader. No concept of a "key" is necessary.
3 If an application reopens the index after your batched deletes but
  before your batched inserts, some previously available documents will
  "disappear" (see
  Supporting deletes in IndexWriter will eliminate this problem.
4 When IndexWriter supports deletes, a concurrent merge thread is
  possible and makes sense. A concurrent merge thread means having a
  separate thread dedicated to merging segments. Today, when a merge
  of large segments (or a cascade of merges) is started, no documents
  can be inserted before the merge(s) finish(es). A concurrent merge
  thread will eliminate this problem. In addition, on a machine with
  sufficient CPU resources, this will improve the insert/delete
  performance not only for small insert/delete batches, but also for
  large batches. I have coded this and experiments have verified the
  claims. I will make it available if people are interested.
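
To make point 3 concrete, here is a toy sketch in plain Java. It is not
the patch and it uses no Lucene classes: ToyIndex, ToyBufferedWriter and
their methods are hypothetical names invented for this illustration, and
a "document" is just a string stored under a key term. It only assumes
that a reader sees a snapshot of whatever was committed when it was
opened.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Lucene classes.
class BufferedUpdateSketch {

    /** A toy "index": term -> document text. Readers get a snapshot copy. */
    static class ToyIndex {
        private Map<String, String> committed = new HashMap<>();

        /** What a freshly opened reader would see at this moment. */
        synchronized Map<String, String> openReader() {
            return new HashMap<>(committed);
        }

        /** Immediate delete, as in the delete-first phase of a two-phase update. */
        synchronized void deleteNow(String term) {
            committed.remove(term);
        }

        /** Immediate insert, as in the add-later phase of a two-phase update. */
        synchronized void addNow(String term, String doc) {
            committed.put(term, doc);
        }

        /** Publish a complete new state in a single step. */
        synchronized void commit(Map<String, String> next) {
            committed = next;
        }
    }

    /** Hypothetical writer that buffers deletes and inserts and applies them together. */
    static class ToyBufferedWriter {
        private final ToyIndex index;
        // Each op is {term, doc}; a null doc marks a delete. Order is preserved.
        private final List<String[]> ops = new ArrayList<>();

        ToyBufferedWriter(ToyIndex index) {
            this.index = index;
        }

        void deleteDocuments(String term) {
            ops.add(new String[] {term, null});
        }

        void addDocument(String term, String doc) {
            ops.add(new String[] {term, doc});
        }

        /** Apply all buffered operations in order, then publish the result at once. */
        void flush() {
            Map<String, String> next = index.openReader();
            for (String[] op : ops) {
                if (op[1] == null) {
                    next.remove(op[0]);
                } else {
                    next.put(op[0], op[1]);
                }
            }
            index.commit(next);
            ops.clear();
        }
    }

    public static void main(String[] args) {
        // Two-phase update: a reader opened between the phases misses the document.
        ToyIndex twoPhase = new ToyIndex();
        twoPhase.addNow("id:1", "old text");
        twoPhase.deleteNow("id:1");
        Map<String, String> between = twoPhase.openReader();
        twoPhase.addNow("id:1", "new text");
        System.out.println("two-phase, reader sees id:1: " + between.containsKey("id:1"));

        // Buffered update: the same reader still sees the old document until flush.
        ToyIndex buffered = new ToyIndex();
        buffered.addNow("id:1", "old text");
        ToyBufferedWriter writer = new ToyBufferedWriter(buffered);
        writer.deleteDocuments("id:1");
        Map<String, String> mid = buffered.openReader();
        writer.addDocument("id:1", "new text");
        writer.flush();
        System.out.println("buffered, reader sees id:1: " + mid.containsKey("id:1"));
        System.out.println("after flush: " + buffered.openReader().get("id:1"));
    }
}
```

In this model, the reader opened between the delete and the add finds no
document for id:1 in the two-phase case. With the buffering writer it
keeps seeing the old document until the flush publishes the delete and
the insert together.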

