lucene-java-user mailing list archives

From "Khawaja Shams" <kssh...@gmail.com>
Subject Re: Which is faster/better
Date Tue, 25 Nov 2008 17:59:34 GMT
On Tue, Nov 25, 2008 at 8:42 AM, Grant Ingersoll <gsingers@apache.org> wrote:

>
> On Nov 25, 2008, at 10:46 AM, Michael McCandless wrote:
>
>>> If you already have the docId, would you need to/want to do
>>> delete-by-Query or even delete-by-Term?  Isn't delete-by-id a lot lighter
>>> weight since it only marks the doc as deleted, whereas d-b-Q can
>>> potentially force a flush, etc?
>>>
>>
>> I guess the question is how you got that docID in the first place?  If
>> you got it by running a query, and deleting all docIDs that are
>> returned, then you could dBQ instead?
>>
>
> User does a search.  Gets back a set of docs.  Picks docs to delete,
> deletes them.


Grant, can we assume that the document id will remain consistent between the
time the user obtains the results and the time they click delete? I was under
the impression that document ids can change on optimize, etc.
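
To make the concern concrete, here is a minimal sketch of why a docID held
across a merge or optimize may no longer point at the same document. This is a
toy model in plain Java, not Lucene code: the class ToyIndex, its methods, and
the idea of a stable per-document key are all assumptions made for
illustration. In Lucene itself, the closest analogue to deleteByKey below would
be IndexWriter.deleteDocuments(Term) on a unique-key field, if the index has
one.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: NOT Lucene code. A document's docID is simply its
// position in the list; the String is a hypothetical stable unique key.
class ToyIndex {
    private List<String> keys = new ArrayList<>();
    private BitSet deleted = new BitSet();

    // Adds a document and returns the docID it has right now.
    int add(String key) {
        keys.add(key);
        return keys.size() - 1;
    }

    // Delete by docID: flips one bit; nothing is renumbered yet.
    void deleteByDocId(int docId) {
        deleted.set(docId);
    }

    // Delete by stable key: resolves the current docID at delete time.
    void deleteByKey(String key) {
        int docId = docIdOf(key);
        if (docId >= 0) {
            deleted.set(docId);
        }
    }

    // Current docID of the live document with this key, or -1 if none.
    int docIdOf(String key) {
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i) && keys.get(i).equals(key)) {
                return i;
            }
        }
        return -1;
    }

    // Stand-in for a merge/optimize: drops deleted documents and compacts,
    // so every survivor after a gap ends up with a smaller docID.
    void compact() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i)) {
                live.add(keys.get(i));
            }
        }
        keys = live;
        deleted = new BitSet();
    }

    // Number of documents not marked deleted.
    int liveCount() {
        return keys.size() - deleted.cardinality();
    }
}
```

In this model, if a user is shown document "c" at docID 2, and another
document is deleted and compacted away before they click delete, docID 2 is
stale, while a delete keyed on "c" still finds the right document.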

>
>
>
>>
>> Lucene's (IndexWriter's) dbQ doesn't force a flush: it's buffered just
>> like other deletes and then applied in bulk at certain times.  When
>> autoCommit is false, currently the deletes are applied when a
>> merge wants to start (ie not at each segment flush).  Or, if you call
>> commit().
>>
>
> I was just going based off the code of the two:  In the IndexReader, all
> it's doing is marking a bit in a bit vector, right?  Whereas in the
> IndexWriter, it's checking if it's time to flush, etc.
>
>
>
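
The two delete paths contrasted in the quoted text above (an immediate bit
flip versus deletes that are buffered and applied in bulk) can be sketched
roughly as follows. This is a toy model in plain Java, not Lucene's
implementation: the class BufferedDeletes and all of its method names are
hypothetical, and applyBuffered() merely stands in for the points where,
per Michael's description, buffered deletes get applied (commit() or the
start of a merge).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: NOT Lucene code.
class BufferedDeletes {
    private final List<String> docs;                  // docID -> key
    private final BitSet deleted = new BitSet();      // one bit per docID
    private final List<String> pending = new ArrayList<>();

    BufferedDeletes(List<String> docs) {
        this.docs = new ArrayList<>(docs);
    }

    // Reader-style delete: takes effect immediately by setting one bit.
    void markDeleted(int docId) {
        deleted.set(docId);
    }

    // Writer-style delete: only records the request; no document is touched.
    void bufferDelete(String key) {
        pending.add(key);
    }

    // Resolves all buffered keys in a single pass over the documents and
    // returns how many documents were newly marked deleted.
    int applyBuffered() {
        int applied = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && pending.contains(docs.get(i))) {
                deleted.set(i);
                applied++;
            }
        }
        pending.clear();
        return applied;
    }

    boolean isDeleted(int docId) {
        return deleted.get(docId);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

In this model the buffered path does no per-document work until
applyBuffered() runs, so calling bufferDelete is cheap; its cost is deferred
rather than avoided.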
