lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless" <luc...@mikemccandless.com>
Subject Re: deleteDocuments by Term[] for ALL terms
Date Mon, 26 Nov 2007 10:49:16 GMT

You can just create a query with your and'd terms, and then do this:

  Weight weight = query.weight(indexSearcher);
  IndexReader reader = indexSearcher.getIndexReader();
  Scorer scorer = weight.scorer(reader);
  int delCount = 0;
  while(scorer.next()) {
    reader.deleteDocument(scorer.doc());
    delCount++;
  }

that iterates over all the docIDs without scoring them and without
building up a Hit for each, etc.

Mike

"Antony Bowesman" <adb@teamware.com> wrote:
> Hi,
> 
> I'm using IndexReader.deleteDocuments(Term) to delete documents in
> batches.  I 
> need the deleted count, so I cannot use IndexWriter.deleteDocuments().
> 
> What I want to do is delete documents based on more than one term, but
> not like 
> IndexWriter.deleteDocuments(Term[]) which deletes all documents with ANY
> term. 
> I want it to delete documents which have ALL terms, e.g.
> 
> Term("owner", "ownerUID") AND Term("subject", "something");
> 
> I have a new reader being used, so I could make a new IndexSearcher and
> query 
> the documents with all terms in a BooleanQuery and then iterate the
> results, but 
> that would either mean using the Hits mechanism or TopDocs and seems like
> a 
> heavyweight way to do things.
> 
> What I want to be able to do is to delete sequentially without storing up
> a 
> result set as I may want to delete all and ALL may be rather big.
> 
> I see the implementation uses reader.termDocs() to do the deletion for a
> single 
> Term, which of course is easy, but is there a simple way to make a
> deletion for 
> multiple terms with AND via the reader using, say termDocs, that will not 
> potentially use large amounts of memory, or should I just go with the
> searcher 
> TopDocs mechanism and do that also in batches to avoid the risk of a
> large 
> memory hit.
> 
> I know there's lots of clever 'expert-mode' stuff under the Lucene API
> hood, but 
> does anyone know any good way to do this or have I missed anything
> obvious in 
> the API docs?
> 
> Thanks
> Antony
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message