lucene-java-user mailing list archives

From "Roy Klein" <kl...@sitescape.com>
Subject RE: Update performance/indexwriter.delete()?
Date Thu, 14 Apr 2005 17:25:02 GMT
Hi,

I guess I didn't ask my question very well.  I do understand that, in the
current sources, you can only do a delete via a reader.  What I don't
understand is why the delete function couldn't be incorporated into a
writer, so that updates could all be done within the context of a writer.

For instance, I need to update 3 documents that already exist in the index.
With the current mechanism I have to:
1) open a reader, delete doc1, close the reader.
2) open a writer, add doc1, close the writer.
3) open a reader, delete doc2, close the reader.
4) open a writer, add doc2, close the writer.
5) open a reader, delete doc3, close the reader.
6) open a writer, add doc3, close the writer.

(Note: any given delete might be a term delete, which means it might delete
one of the previously added docs.)


That's a lot of opens and closes.


If the update function were incorporated into a writer, the steps could be:
1) open a writer
2) delete doc1, add doc1
3) delete doc2, add doc2
4) delete doc3, add doc3
5) close the writer

The added benefit is that I can optimize (via batching) all the updates,
because all the previously added docs are recognized by the writer.
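
To put numbers on the two sequences above, here is a toy Java model. It is
not Lucene code: ToyIndex, UpdateCostDemo and all of their methods are made
up for illustration, and the "index" is just a list of id strings. The only
thing it measures is how many open/close cycles each approach needs, under
the assumption that every reader or writer session costs one cycle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy stand-in for an index. NOT the Lucene API; every name is hypothetical.
class ToyIndex {
    // Each "document" is just its id string in this model.
    final List<String> docs = new ArrayList<String>();
    // Number of reader/writer sessions opened (and closed) so far.
    int sessionsOpened = 0;

    // Models: open a reader, delete by id, close the reader.
    void deleteViaReader(String id) {
        sessionsOpened++;
        removeById(id);
    }

    // Models: open a writer, add one doc, close the writer.
    void addViaWriter(String id) {
        sessionsOpened++;
        docs.add(id);
    }

    // Models the proposed writer: one session that deletes and re-adds each doc in order.
    void updateAllViaWriter(List<String> ids) {
        sessionsOpened++;
        for (String id : ids) {
            removeById(id);
            docs.add(id);
        }
    }

    // Removes every doc with the given id, including ones added earlier in the same run.
    private void removeById(String id) {
        docs.removeAll(Collections.singleton(id));
    }
}

class UpdateCostDemo {
    // Current mechanism: one reader session and one writer session per updated doc.
    static int currentMechanism(ToyIndex index, List<String> ids) {
        for (String id : ids) {
            index.deleteViaReader(id);
            index.addViaWriter(id);
        }
        return index.sessionsOpened;
    }

    // Proposed mechanism: a single writer session handles every delete and add.
    static int proposedMechanism(ToyIndex index, List<String> ids) {
        index.updateAllViaWriter(ids);
        return index.sessionsOpened;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("doc1", "doc2", "doc3");

        ToyIndex a = new ToyIndex();
        a.docs.addAll(ids);
        ToyIndex b = new ToyIndex();
        b.docs.addAll(ids);

        System.out.println("current:  " + currentMechanism(a, ids) + " open/close cycles");
        System.out.println("proposed: " + proposedMechanism(b, ids) + " open/close cycle");
    }
}
```

For the three-document case this prints 6 cycles for the current mechanism
and 1 for the proposed one, and both runs leave the same three docs in the
toy index. It says nothing about what a real open or close costs.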

I think this is a better way of asking my original questions:
	"Why was this designed this way?"
	"Can it be changed to optimize updates?"


Thanks

Roy

-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org]
Sent: Thursday, April 14, 2005 12:08 PM
To: java-user@lucene.apache.org
Subject: Re: Update performance/indexwriter.delete()?


Roy Klein wrote:
> So one thing I've been wondering:  Why do you need to do deletes from an
> indexreader?

Is this not in the FAQ?  It should be...

IndexWriter can only append documents to an index.

An IndexReader is required to, given a term, find the document number to
mark deleted.

Also, in the current sources the cost of opening an IndexReader has been
greatly reduced.  Now the norms and the term index are read lazily, so
that only a few tiny files are actually read when an IndexReader is
opened, the rest are merely opened.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

