lucene-java-user mailing list archives

From "Michael McCandless" <luc...@mikemccandless.com>
Subject Re: Clarification on deletion process...
Date Tue, 12 Aug 2008 10:07:33 GMT
Some more details below...

<Aravind.Yarram@equifax.com> wrote:
> The documentation for delete operation seems to be confusing (i am going
> thru the book and also posted in the books forums...), so appreciate if
> someone can let me know if my below understanding is correct.
>
> When i delete a document from the index
>
> 1) It is marked for deletion in the BUFFER until I commit/close the
> writer. Does that mean the document is still visible for the Searcher?

Right.  IndexWriter simply records, in RAM, the fact that you want to
delete all docs matching query X or term Y.  Until those buffered
deletes are applied, a searcher still sees the documents.

> 2) Once i commit/close the writer then IT IS JUST MARKED for delete in the
> Index. At this time the document is NOT visible for the Searcher, but the
> document is still taking up the space in the index.

Yes, every so often (or when you explicitly commit or close)
IndexWriter will translate the buffered delete requests into _X_N.del
files, which record exactly which docIDs are now deleted.  If you
reopen a searcher after this point, the documents won't be seen, though
they still take up space in the index.

> 3) Once the index is merged (optimized), it is removed from the index

As Hoss said, ordinary merges also reclaim the space consumed by
deleted docs.  You can also call expungeDeletes, which forces any
segments containing deletions to be merged.

Note that with ConcurrentMergeScheduler, ordinary merges are kicked
off and complete in background threads.
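
To make the three stages concrete, here is a toy in-memory model in
plain Java.  It is only a sketch and is not Lucene code: the class
ToyIndex and its methods are hypothetical names chosen to echo the
steps above, each "document" is just a single string, and
visibleCount() stands in for what a freshly reopened searcher would
see.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Toy model only: NOT Lucene code. All names here are hypothetical.
class ToyIndex {
    // One segment: its documents plus a bitset of deleted docIDs
    // (standing in for the per-segment _X_N.del file).
    static final class Segment {
        final List<String> docs;
        final BitSet deleted = new BitSet();
        Segment(List<String> docs) { this.docs = docs; }
    }

    private final List<Segment> segments = new ArrayList<>();
    // Stage 1: delete requests are only remembered in RAM.
    private final List<String> bufferedDeletes = new ArrayList<>();

    void addSegment(String... docs) {
        segments.add(new Segment(new ArrayList<>(Arrays.asList(docs))));
    }

    // Stage 1: record the request; nothing in any segment changes yet.
    void deleteDocuments(String term) {
        bufferedDeletes.add(term);
    }

    // Stage 2: resolve buffered requests to docIDs and mark them deleted.
    // The documents stay in their segments and still take up space.
    void commit() {
        for (String term : bufferedDeletes) {
            for (Segment s : segments) {
                for (int id = 0; id < s.docs.size(); id++) {
                    if (s.docs.get(id).equals(term)) {
                        s.deleted.set(id);
                    }
                }
            }
        }
        bufferedDeletes.clear();
    }

    // Stage 3: merge every segment that has deletions into one new
    // segment holding only the live documents, reclaiming the space.
    void expungeDeletes() {
        List<Segment> kept = new ArrayList<>();
        List<String> live = new ArrayList<>();
        for (Segment s : segments) {
            if (s.deleted.isEmpty()) {
                kept.add(s); // untouched segments are left alone
                continue;
            }
            for (int id = 0; id < s.docs.size(); id++) {
                if (!s.deleted.get(id)) {
                    live.add(s.docs.get(id));
                }
            }
        }
        segments.clear();
        segments.addAll(kept);
        if (!live.isEmpty()) {
            segments.add(new Segment(live));
        }
    }

    // Documents a freshly reopened searcher would see.
    int visibleCount() {
        int n = 0;
        for (Segment s : segments) {
            n += s.docs.size() - s.deleted.cardinality();
        }
        return n;
    }

    // Documents still stored, including ones marked deleted.
    int storedCount() {
        int n = 0;
        for (Segment s : segments) {
            n += s.docs.size();
        }
        return n;
    }
}
```

In this model, with segments {"a", "b"} and {"c"}, calling
deleteDocuments("b") leaves visibleCount() at 3.  After commit() it
drops to 2 while storedCount() stays at 3, and only expungeDeletes()
brings storedCount() down to 2.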

Mike
