lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: IndexWriter.deleteDocuments(Query query)
Date Wed, 01 Apr 2009 18:27:48 GMT
On Wed, Apr 1, 2009 at 2:04 PM, John Wang <john.wang@gmail.com> wrote:

> My test essentially this. I took out the reader.deleteDocuments call from
> both scenarios. I took a index of 5m docs. a batch of 10000 randomly
> generated uids.
>
> Compared the following scenarios:
> 1)
> * open index reader
> * for each uid in the batch, find the corresponding docid and add to an
> IntList.
> *close reader

How exactly do you find the corresponding docid?  TermDocs?

> 2)
> * open index reader
> * load uid array from payload field
> * iterate uid array, and check to see if uid is in deleted set, and add to
> an IntList

In this case, each doc has a dedicated field that only has a payload
that stores the one uid for that doc?  But I'm confused how you then
map from uid -> docID.  I must be missing something.

> The datastructure holding deleted set is IntOpenHashSet from fastutil.
>
> 1) took about 3500 - 4500 ms
> 2) took about 815 ms

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message