lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Joshua O'Madadhain" <jmad...@ics.uci.edu>
Subject Re: delete / optimize question
Date Wed, 18 Sep 2002 00:48:34 GMT
On Tue, 17 Sep 2002, Tom Mortimer wrote:

> I'm new-ish to Lucene, and having a few problems with document deletion.  
> In particular, the point at which a deleted document is no longer visible to
> an IndexReader. Is the following scenario sane?
> 
> 1. Open an IndexReader and delete all docs with Term("name", "Bob"), then
>    close the reader.
> 
> 2. Open an IndexWriter and add various "non-Bob" documents
> 
> 3. then add a new document with a Term("name", "Bob").  
> 
> 4. Call optimize()
> 
> 5. Open an IndexReader and get docFreq for "Bob"
> 
> I'd expect the final doc freq to be 1, as all the "Bob" docs should have
> been deleted except for the one added in step 3, but instead I'm getting
> freq > 1.  
> 
> Do I in fact need to do the optimize() step immediately after the deletions,
> and before adding any more docs?  This could be expensive with a large
> index, and document additions and deletions required in random order.

I *think* that what you think should happen is what indeed should be
happening.  However, I have a few suggestions for sanity checks to make
sure that you're seeing what you think you're seeing:

(1) Call docFreq("Bob") before and after each step.  This will make sure
that (for instance) your supposed "non-Bob" documents are indeed
"non-Bob", which if false would be a subtle sort of "gotcha".

(2) Double-check that you use the same index in all cases.

(3) Are you closing the index before you call optimize()?  (According to
the docs, you shouldn't.)

(4) Are you closing the IndexWriter after you call optimize() and before
you call docFreq()?  (close() does flush changes, although I don't know
whether it should be necessary after optimize().)

Anyway, good luck.

Joshua

 jmadden@ics.uci.edu...Obscurium Per Obscurius...www.ics.uci.edu/~jmadden
  Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
 It's that moment of dawning comprehension that I live for--Bill Watterson
My opinions are too rational and insightful to be those of any organization.




--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message