lucene-dev mailing list archives

From "Jason Rutherglen" <>
Subject SegmentReader with custom setting of deletedDocs, single reusable FieldsReader
Date Tue, 24 Jun 2008 12:01:53 GMT
One of the bottlenecks I have noticed while testing Ocean realtime search is
the delete process, which writes several files for each delete in
SegmentReader, even when only a single document is deleted.  The best way to
handle deletes is to simply keep them in memory without flushing them to
disk, which saves writing out an entire BitVector per delete.  The deletes
are saved in the transaction log, which is replayed on recovery.
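
To make the idea concrete, here is a rough sketch of in-memory deletes
backed by a replayable log.  It is only an illustration: InMemoryDeletes and
its methods are hypothetical names, not Lucene or Ocean APIs,
java.util.BitSet stands in for BitVector, and an in-memory list stands in
for the durable transaction log.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch; none of these names exist in Lucene or Ocean.
class InMemoryDeletes {
    // Stand-in for Lucene's BitVector of deleted documents.
    private final BitSet deleted;
    // Stand-in for a durable transaction log of deleted doc ids.
    private final List<Integer> transactionLog = new ArrayList<>();

    InMemoryDeletes(int maxDoc) {
        this.deleted = new BitSet(maxDoc);
    }

    // Record the delete in the log first, then flip the bit in memory.
    // Nothing is written per delete beyond the log entry.
    synchronized void delete(int docId) {
        transactionLog.add(docId);
        deleted.set(docId);
    }

    synchronized boolean isDeleted(int docId) {
        return deleted.get(docId);
    }

    // Copy of the log, as it would be read back during recovery.
    synchronized List<Integer> logSnapshot() {
        return new ArrayList<>(transactionLog);
    }

    // Recovery: rebuild the in-memory deletes by replaying the log.
    static InMemoryDeletes replay(int maxDoc, List<Integer> log) {
        InMemoryDeletes recovered = new InMemoryDeletes(maxDoc);
        for (int docId : log) {
            recovered.delete(docId);
        }
        return recovered;
    }
}
```

After a crash, replay(maxDoc, log) would rebuild the same set of deletes
without any per-delete BitVector file ever having been written.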

I am not sure of the best way to approach this.  Perhaps it is to create a
custom class that inherits from SegmentReader.  It could reuse the existing
reopen and also provide a way to set the deletedDocs BitVector.  It would
also be able to reuse a single FieldsReader, with locking around it, shared
by all SegmentReaders of the segment.  Otherwise, in the current
architecture, each new SegmentReader opens a new FieldsReader, which is
non-optimal.  The deletes would still be saved to disk, but periodically,
like a checkpoint, instead of per delete.
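
A minimal sketch of how the pieces might fit together follows.  All of the
class names here (SharedFieldsReader, SegmentView, DeleteCheckpointer) are
hypothetical and do not extend the real SegmentReader; java.util.BitSet
again stands in for BitVector, a String array stands in for stored fields,
and checkpointing every N deletes is just one assumed policy.

```java
import java.util.BitSet;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for one FieldsReader shared by every reader of a segment.
class SharedFieldsReader {
    private final ReentrantLock lock = new ReentrantLock();
    private final String[] storedDocs; // stand-in for the stored-fields file

    SharedFieldsReader(String[] storedDocs) {
        this.storedDocs = storedDocs;
    }

    // Access is serialized because the underlying reader is shared.
    String doc(int docId) {
        lock.lock();
        try {
            return storedDocs[docId];
        } finally {
            lock.unlock();
        }
    }
}

// Hypothetical reader over one segment whose deletes can be set directly.
class SegmentView {
    private final SharedFieldsReader fields;
    private volatile BitSet deletedDocs;

    SegmentView(SharedFieldsReader fields, BitSet deletedDocs) {
        this.fields = fields;
        this.deletedDocs = (BitSet) deletedDocs.clone();
    }

    // Custom setting of deletedDocs; the copy keeps this view isolated.
    void setDeletedDocs(BitSet newDeletes) {
        this.deletedDocs = (BitSet) newDeletes.clone();
    }

    // "Reopen" with new deletes while reusing the same fields reader.
    SegmentView reopen(BitSet newDeletes) {
        return new SegmentView(fields, newDeletes);
    }

    String document(int docId) {
        if (deletedDocs.get(docId)) {
            throw new IllegalArgumentException("document " + docId + " is deleted");
        }
        return fields.doc(docId);
    }

    boolean sharesFieldsWith(SegmentView other) {
        return fields == other.fields;
    }
}

// Saves the deletes once every `interval` deletes instead of on each one.
class DeleteCheckpointer {
    private final int interval;
    private int pending;
    private int checkpointsWritten;
    private BitSet lastCheckpoint = new BitSet();

    DeleteCheckpointer(int interval) {
        this.interval = interval;
    }

    synchronized void onDelete(BitSet currentDeletes) {
        pending++;
        if (pending >= interval) {
            // Stand-in for writing the deletes file to disk.
            lastCheckpoint = (BitSet) currentDeletes.clone();
            checkpointsWritten++;
            pending = 0;
        }
    }

    synchronized int checkpointsWritten() {
        return checkpointsWritten;
    }
}
```

In this sketch a reopened view sees the new deletes while the older view
keeps its own snapshot, and both go through the same fields reader.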
