lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Getting the document number (with IndexReader)
Date Thu, 26 Jan 2006 18:44:37 GMT

: > The document number is the variable i in this case.
: If the document number is the variable i (enumerated from numDocs()),
: what's the difference between numDocs() and maxDoc() in this case? I
: was previously under the impression that the internal docNum might be
: different to the counter.

Iterating from 0 to maxDoc()-1 will give you the range of all possible
doc ids, but some of those docs may have already been deleted.  I believe
that is what you want to do ... you can check whether a doc is deleted
using IndexReader.isDeleted(i).

numDocs() is implemented as maxDoc() - deletedDocs.count(), so I don't
think it ever makes sense to iterate up to numDocs().
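
A minimal sketch of that loop, for illustration only.  DocSource and
toySource are hypothetical stand-ins for the three IndexReader methods
mentioned above (maxDoc(), numDocs(), isDeleted(int)) so the example runs
without Lucene on the classpath; it assumes doc ids start at 0.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class LiveDocIds {
    /** Hypothetical stand-in for the IndexReader methods discussed above. */
    public interface DocSource {
        int maxDoc();
        int numDocs();
        boolean isDeleted(int docId);
    }

    /** Collects every doc id in [0, maxDoc) that is not marked deleted. */
    public static List<Integer> liveDocIds(DocSource reader) {
        List<Integer> live = new ArrayList<>();
        // Loop bound is maxDoc(), not numDocs(): deleted docs leave gaps in the id space.
        for (int i = 0; i < reader.maxDoc(); i++) {
            if (!reader.isDeleted(i)) {
                live.add(i);
            }
        }
        return live;
    }

    /** Toy in-memory source: maxDoc slots, with deletions tracked in a BitSet. */
    public static DocSource toySource(final int maxDoc, final BitSet deleted) {
        return new DocSource() {
            public int maxDoc() { return maxDoc; }
            public int numDocs() { return maxDoc - deleted.cardinality(); }
            public boolean isDeleted(int docId) { return deleted.get(docId); }
        };
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1);
        deleted.set(3);
        DocSource reader = toySource(5, deleted);
        System.out.println(liveDocIds(reader)); // prints [0, 2, 4]
        System.out.println(reader.numDocs());   // prints 3
    }
}
```

With 5 slots and 2 deletions, a loop bounded by numDocs() would stop at
id 2 and never look at doc 4.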

: I'm doing something akin to a rangeQuery, where I delete documents
: within a certain range (in addition to other criteria). Is it better
: to do a query on the range, mark all the docNums getting them with
: Hits.id(), and then retrieve docs and test for deletion according to
: that?

Take a look at the way RangeFilter.bits() is implemented.  If you
cut/paste that code and replace the call to bits.set(termDocs.doc()); with
reader.delete(termDocs.doc()); I think you'll have exactly what you want.

Or, since cutting/pasting code is "A Bad Thing" from a maintenance/bug
fixing standpoint, you could just call RangeFilter.bits(reader) yourself,
and then iterate over the set bits and call delete on each one.
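
A rough sketch of the "iterate over the set bits" step, assuming the bits
come back as a java.util.BitSet.  The deleteDoc callback is a hypothetical
stand-in for the reader's delete call, so this runs without Lucene; in
real code you would pass something that forwards to the reader.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.function.IntConsumer;

public class DeleteSetBits {
    /**
     * Calls deleteDoc once for each set bit in matches, in ascending order.
     * Returns how many doc ids were passed to the callback.
     */
    public static int deleteAll(BitSet matches, IntConsumer deleteDoc) {
        int count = 0;
        // nextSetBit returns -1 once there are no more set bits.
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            deleteDoc.accept(doc);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Pretend these bits came from the range filter.
        BitSet matches = new BitSet();
        matches.set(2);
        matches.set(5);
        matches.set(7);

        // Stand-in for the reader: just record which ids would be deleted.
        List<Integer> deleted = new ArrayList<>();
        int n = deleteAll(matches, doc -> deleted.add(doc));

        System.out.println(n);       // prints 3
        System.out.println(deleted); // prints [2, 5, 7]
    }
}
```

Walking the bits with nextSetBit skips the unset ids entirely, so the cost
follows the number of matches rather than maxDoc().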


-Hoss



