lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Antony Bowesman <...@teamware.com>
Subject deleteDocuments by Term[] for ALL terms
Date Mon, 26 Nov 2007 05:52:54 GMT
Hi,

I'm using IndexReader.deleteDocuments(Term) to delete documents in batches.  I 
need the deleted count, so I cannot use IndexWriter.deleteDocuments().

What I want to do is delete documents based on more than one term, but not like 
IndexWriter.deleteDocuments(Term[]) which deletes all documents with ANY term. 
I want it to delete documents which have ALL terms, e.g.

Term("owner", "ownerUID") AND Term("subject", "something");

I have a new reader being used, so I could make a new IndexSearcher and query 
the documents with all terms in a BooleanQuery and then iterate the results, but 
that would either mean using the Hits mechanism or TopDocs and seems like a 
heavyweight way to do things.

What I want to be able to do is to delete sequentially without storing up a 
result set as I may want to delete all and ALL may be rather big.

I see the implementation uses reader.termDocs() to do the deletion for a single 
Term, which of course is easy, but is there a simple way to make a deletion for 
multiple terms with AND via the reader using, say termDocs, that will not 
potentially use large amounts of memory, or should I just go with the searcher 
TopDocs mechanism and do that also in batches to avoid the risk of a large 
memory hit.

I know there's lots of clever 'expert-mode' stuff under the Lucene API hood, but 
does anyone know any good way to do this or have I missed anything obvious in 
the API docs?

Thanks
Antony




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message