lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)
Date Thu, 06 Jul 2006 21:34:46 GMT
I guess I don't see the difference...

You need the write lock to use the IndexWriter, and you also need the
write lock to perform a deletion. So if you just get the write lock,
you can perform the deletion and the add, then close the writer.

I have asked how this submission optimizes anything, and I still
can't seem to get an answer.
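
For readers following the thread, the claim under discussion (from Ning
Li's message quoted below) can be made concrete with a toy model. This
is a minimal sketch and not Lucene code: the class FlushModel, its
methods, and the interleaved() helper are hypothetical names invented
for illustration. It assumes that, without the patch, every delete
forces pending adds to be flushed because the writer has to be closed,
and that, with the patch, adds and deletes are both buffered until
maxBufferedDocs or maxBufferedDeleteTerms is reached. It only counts
flushes and does not measure real indexing cost.

```java
// Toy model only: counts flushes, does not touch Lucene.
class FlushModel {
    enum Op { ADD, DELETE }

    // Builds a DELETE, ADD, DELETE, ADD, ... sequence of the given number of pairs.
    static Op[] interleaved(int pairs) {
        Op[] ops = new Op[pairs * 2];
        for (int i = 0; i < ops.length; i++) {
            ops[i] = (i % 2 == 0) ? Op.DELETE : Op.ADD;
        }
        return ops;
    }

    // Assumed behaviour without the patch: a delete needs the writer closed,
    // so any pending adds are flushed as a (possibly tiny) segment first.
    static int flushesWithoutBuffering(Op[] ops, int maxBufferedDocs) {
        int flushes = 0;
        int pendingAdds = 0;
        for (Op op : ops) {
            if (op == Op.ADD) {
                pendingAdds++;
                if (pendingAdds >= maxBufferedDocs) {
                    flushes++;
                    pendingAdds = 0;
                }
            } else if (pendingAdds > 0) {
                flushes++;          // writer closed to perform the delete
                pendingAdds = 0;
            }
        }
        if (pendingAdds > 0) {
            flushes++;              // final close of the writer
        }
        return flushes;
    }

    // Assumed behaviour with the patch: adds and deletes are both buffered,
    // and a flush happens only when one of the two limits is reached.
    static int flushesWithBuffering(Op[] ops, int maxBufferedDocs,
                                    int maxBufferedDeleteTerms) {
        int flushes = 0;
        int pendingAdds = 0;
        int pendingDeletes = 0;
        for (Op op : ops) {
            if (op == Op.ADD) {
                pendingAdds++;
            } else {
                pendingDeletes++;
            }
            if (pendingAdds >= maxBufferedDocs
                    || pendingDeletes >= maxBufferedDeleteTerms) {
                flushes++;
                pendingAdds = 0;
                pendingDeletes = 0;
            }
        }
        if (pendingAdds > 0 || pendingDeletes > 0) {
            flushes++;              // final close of the writer
        }
        return flushes;
    }

    public static void main(String[] args) {
        Op[] ops = interleaved(1000);
        System.out.println("without buffering: " + flushesWithoutBuffering(ops, 100));
        System.out.println("with buffering:    " + flushesWithBuffering(ops, 100, 100));
    }
}
```

With 1000 delete/add pairs and both limits set to 100, this model
counts 1000 flushes for the first method and 11 for the second. Whether
that difference shows up in practice depends on how the updates are
batched, which is the point being argued in this thread.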


On Jul 6, 2006, at 4:27 PM, Otis Gospodnetic wrote:

> I think that patch is for a different scenario, the one where you
> can't wait to batch deletes and adds, and want/need to execute them
> more frequently and in the order they actually happen, without
> grouping them.
>
> Otis
>
> ----- Original Message ----
> From: robert engels <rengels@ix.netcom.com>
> To: java-dev@lucene.apache.org
> Sent: Thursday, July 6, 2006 3:24:13 PM
> Subject: Re: [jira] Commented: (LUCENE-565) Supporting  
> deleteDocuments in IndexWriter (Code and Performance Results Provided)
>
> I guess we just chose a much simpler way to do this...
>
> Even with your code changes, to see the modification made using the
> IndexWriter, it must be closed, and a new IndexReader opened.
>
> So a far simpler way is to get the collection of updates first, then
>
> using opened indexreader,
> for each doc in collection
>        delete document using "key"
> endfor
>
> open indexwriter
> for each doc in collection
>        add document
> endfor
>
> open indexreader
>
>
> I don't see how your way is any faster. You must always flush to disk
> and open the indexreader to see the changes.
>
>
>
> On Jul 6, 2006, at 2:07 PM, Ning Li wrote:
>
>> Hi Otis and Robert,
>>
>> I added an overview of my changes in JIRA. Hope that helps.
>>
>>> Anyway, my test did exercise the small batches, in that in our
>>> incremental updates we delete the documents with the unique term,
>>> and then add the new (which is what I assumed this was improving),
>>> and I saw no appreciable difference.
>>
>> Robert, could you describe a bit more how your test is set up? Or a
>> short code snippet will help me explain.
>>
>> Without the patch, when inserts and deletes are interleaved in small
>> batches, the performance can degrade dramatically because the
>> ramDirectory is flushed to disk whenever an IndexWriter is closed,
>> causing a lot of small segments to be created on disk, which
>> eventually need to be merged.
>>
>> Is this how your test is set up? And, what are the maxBufferedDocs and
>> the maxBufferedDeleteTerms in your test? You won't see a performance
>> improvement if they are about the same as the small batch size. The
>> patch works by internally buffering inserts and deletes into larger
>> batches.
>>
>> Regards,
>> Ning
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>



