> I think the doc is correct

Wait, then one of the docs must be wrong. From what you write, I guess it's FlushPolicy, since a new segment is not flushed because of this setting?
Or perhaps the docs should clarify that the deletes are "flushed" in the sense of being applied to existing segments?

I disabled reader pooling and I still don't see .del files. But I think that's explained by the fact that there are no segments in the index yet.
All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment flushed because of delTerms?

Shai


On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
First off, it's bad that you don't see .del files when
conf.setMaxBufferedDeleteTerms is 1.

But, it could be that newIndexWriterConfig turned on readerPooling
which would mean the deletes are held in the SegmentReader and not
flushed to disk.  Can you make sure that's off?

Second off, I think the doc is correct: a segment will not be flushed;
rather, new .del files should appear against older segments.

And yes, if RAM usage of the buffered delete Terms/Queries is too high,
then a segment is flushed along with the deletes being applied
(creating the .del files).

I think buffered delete Queries are not counted towards
setMaxBufferedDeleteTerms; so they are only flushed by RAM usage
(a very rough estimate) or by other ops (merging, NRT reopen, commit,
etc.).
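
To make the behavior described above concrete, here is a toy Java model. It is
not Lucene code: the class DeleteFlushSketch and all of its methods are
hypothetical names invented for illustration, and it deliberately simplifies
(docs still in RAM are dropped in place, RAM accounting is ignored). It only
sketches two points: hitting the delete-term limit applies deletes to
already-flushed segments without flushing a new one, and delete-by-query is not
counted towards that limit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; none of these names are Lucene classes or methods. */
public class DeleteFlushSketch {
    private final int maxBufferedDeleteTerms;
    private final List<Set<String>> segments = new ArrayList<>();   // flushed segments (doc ids)
    private final Set<Integer> segmentsWithDelFile = new HashSet<>();
    private Set<String> ramBuffer = new HashSet<>();                // docs not yet flushed
    private final List<String> bufferedTerms = new ArrayList<>();
    private int bufferedQueries = 0;

    public DeleteFlushSketch(int maxBufferedDeleteTerms) {
        this.maxBufferedDeleteTerms = maxBufferedDeleteTerms;
    }

    public void addDocument(String id) {
        ramBuffer.add(id);
    }

    /** Stands in for any op that flushes a segment (commit, maxBufferedDocs, ...). */
    public void flushSegment() {
        if (!ramBuffer.isEmpty()) {
            segments.add(ramBuffer);
            ramBuffer = new HashSet<>();
        }
        applyDeletes();
    }

    public void deleteByTerm(String id) {
        ramBuffer.remove(id);          // simplification: in-RAM docs are dropped directly
        bufferedTerms.add(id);
        if (bufferedTerms.size() >= maxBufferedDeleteTerms) {
            applyDeletes();            // note: no new segment is flushed here
        }
    }

    /** Queries are buffered but never checked against the term limit. */
    public void deleteByQuery() {
        bufferedQueries++;
    }

    /** Applies buffered term deletes to existing segments only. */
    private void applyDeletes() {
        for (int i = 0; i < segments.size(); i++) {
            if (segments.get(i).removeAll(bufferedTerms)) {
                segmentsWithDelFile.add(i);   // models a .del file for segment i
            }
        }
        bufferedTerms.clear();
    }

    public int segmentCount() { return segments.size(); }
    public int delFileCount() { return segmentsWithDelFile.size(); }
    public int bufferedQueryCount() { return bufferedQueries; }

    public static void main(String[] args) {
        DeleteFlushSketch w = new DeleteFlushSketch(1);
        for (int i = 0; i < 4; i++) w.addDocument("doc-" + i);
        w.deleteByTerm("doc-0");
        // prints "0 segments, 0 .del files": nothing on disk to apply deletes to
        System.out.println(w.segmentCount() + " segments, " + w.delFileCount() + " .del files");
        w.flushSegment();
        w.deleteByTerm("doc-1");
        // prints "1 segments, 1 .del files": the delete was applied to the older segment
        System.out.println(w.segmentCount() + " segments, " + w.delFileCount() + " .del files");
    }
}
```

In this model the first delete produces nothing on disk because no segment
exists yet, which matches what the unit test in this thread observes.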

Mike McCandless

http://blog.mikemccandless.com


On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera <serera@gmail.com> wrote:
> Hi
>
> I'm a little confused about FlushPolicy and
> IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs
> say:
>
>  * Segments are traditionally flushed by:
>  * <ul>
>  * <li>RAM consumption - configured via
> ...
>  * <li>Number of buffered delete terms/queries - configured via
>  * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}</li>
>  * </ul>
>
> Yet IWC.setMaxBufDelTerm says:
>
> NOTE: This setting won't trigger a segment flush.
>
> And FlushByRamOrCountPolicy says:
>
>  * <li>{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes
>  * based on the global number of buffered delete terms iff
>  * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled</li>
>
> Confused, I wrote a short unit test:
>
>   public void testMaxBufDelTerm() throws Exception {
>     Directory dir = new RAMDirectory();
>     IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random()));
>     conf.setMaxBufferedDeleteTerms(1);
>     conf.setMaxBufferedDocs(10);
>     conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
>     conf.setInfoStream(new PrintStreamInfoStream(System.out));
>     IndexWriter writer = new IndexWriter(dir, conf);
>     int numDocs = 4;
>     for (int i = 0; i < numDocs; i++) {
>       Document doc = new Document();
>       doc.add(new StringField("id", "doc-" + i, Store.NO));
>       writer.addDocument(doc);
>     }
>
>     System.out.println("before delete");
>     for (String f : dir.listAll()) System.out.println(f);
>
>     writer.deleteDocuments(new Term("id", "doc-0"));
>     writer.deleteDocuments(new Term("id", "doc-1"));
>
>     System.out.println("\nafter delete");
>     for (String f : dir.listAll()) System.out.println(f);
>
>     writer.close();
>     dir.close();
>   }
>
> When InfoStream is turned on, I can see messages about terms being flushed
> (unlike when I comment out the .setMaxBufDelTerm line), so I know this setting
> takes effect.
> Yet both before and after the delete operations, dir.listAll() returns only
> the fdx and fdt files.
>
> So is this a bug that a segment isn't flushed? If not (and I'm ok with
> that), is it a documentation inconsistency?
> Strangely, I think, if the delTerms RAM accounting exhausts the max-RAM-buffer
> size, a new segment will be flushed?
>
> Slightly unrelated to FlushPolicy, but do I understand correctly that
> maxBufDelTerm does not apply to delete-by-query operations?
> BufferedDeletes doesn't increment any counter on addQuery(), so is it
> correct to assume that if I only delete-by-query, this setting has no
> effect?
> And the delete queries are buffered until the next segment is flushed due to
> other operations (constraints, commit, NRT-reopen)?
>
> Shai
