lucene-java-user mailing list archives

From "Venkat Rangan"
Subject MergePolicy option to retain deleted docID positions
Date Tue, 14 Jul 2009 16:08:36 GMT


We modified Lucene 2.4.1 sources to add a MergePolicy option to insert
empty documents in places where a document was deleted during the merge
phase. The motivation was to retain the same document locations (docID)
when there are deletions, so an external tracking database can persist
the docIDs and not be concerned about the IDs becoming invalid upon a
merge. This option, in combination with a "no-optimize" setting, allows
an index to tolerate a small number of deletions without requiring
another parallel immutable ID space. While it is possible to create an
immutable field called ID within each document, large retrievals then
require a relatively expensive search of the IDs, which an immutable
docID space avoids. This also helps in constructing a filter bit map for
a search, where the bit map was created using other business logic.
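
To make the idea concrete, here is a minimal sketch in plain Java. It is a toy model rather than our patch and uses no Lucene APIs: a "segment" is just a list of documents, null stands for a deleted document, and the names DocIdMergeSketch, mergeCompacting, mergeRetaining and toFilterBits are hypothetical, chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Toy model only: null marks a deleted document. None of these names are Lucene APIs.
final class DocIdMergeSketch {

    // Conventional merge: deleted documents are dropped, so later docIDs shift down.
    static List<String> mergeCompacting(List<List<String>> segments) {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            for (String doc : segment) {
                if (doc != null) {
                    merged.add(doc);
                }
            }
        }
        return merged;
    }

    // Retaining merge: each deleted document becomes an empty placeholder,
    // so every surviving document keeps its position (docID).
    static List<String> mergeRetaining(List<List<String>> segments) {
        List<String> merged = new ArrayList<>();
        for (List<String> segment : segments) {
            for (String doc : segment) {
                merged.add(doc != null ? doc : "");
            }
        }
        return merged;
    }

    // Builds a filter bit map directly from docIDs persisted in an external store.
    // This is only valid if the docIDs did not move during the merge.
    static BitSet toFilterBits(int[] persistedDocIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int id : persistedDocIds) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        // Two segments; the document at global position 1 has been deleted.
        List<List<String>> segments = Arrays.asList(
                Arrays.asList("a", null, "c"),
                Arrays.asList("d", "e"));
        System.out.println("compacting: " + mergeCompacting(segments));
        System.out.println("retaining:  " + mergeRetaining(segments));
        System.out.println("filter:     " + toFilterBits(new int[] {2, 3}, 5));
    }
}
```

In this model, an external database that stored position 3 for document "d" before the merge still finds "d" at position 3 after the retaining merge, whereas the compacting merge moves it to position 2. The cost is that the placeholders keep occupying docID slots.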


Our implementation includes a few unit tests to confirm that it works
correctly.


Is there an interest in pursuing this as a useful capability?




Venkat Rangan

Clearwell Systems Inc.

(650) 526 0639

- Delivering  Intelligent eDiscovery

