lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel Noll <dan...@nuix.com>
Subject "Deleting" documents without deleting them
Date Tue, 16 Mar 2010 04:20:05 GMT
Hi all.

I'm trying to implement a form of document deletion where the previous
versions are kept around forever ( a primitive form of versioning) but
excluded from the search results.

I notice that after calling IndexWriter.deleteDocuments, even if you
close and reopen the index, the documents are still accessible using
document(int) but are returned from queries, which is exactly the
behaviour I want.  However, if I call optimize() they will obviously
be obliterated.

My question is: as long as I never call optimize() -- will the deleted
documents hang around forever, or will a merge due to adding the new
documents eventually cause them to be removed?

If they will be removed then I need some other way to avoid them being
returned.  I was thinking of actually *not* deleting them, but
maintaining a giant filter - I could store this filter on disk but
it's going to be pretty large even if I use a BitSet. :-(   Is there
any other way to go about it?

Daniel




-- 
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message