lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: FlushPolicy and maxBufDelTerm
Date Thu, 01 Aug 2013 13:06:04 GMT
bq. a new segment will be deleted?

I mean a new segment will be flushed :).

Shai


On Thu, Aug 1, 2013 at 4:03 PM, Shai Erera <serera@gmail.com> wrote:

> Hi
>
> I'm a little confused about FlushPolicy and
> IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy
> jdocs say:
>
>  * Segments are traditionally flushed by:
>  * <ul>
>  * <li>RAM consumption - configured via
> ...
>  * <li>*Number of buffered delete terms/queries* - configured via
>  * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}</li>
>  * </ul>
>
> Yet IWC.setMaxBufDelTerm says:
>
> NOTE: This setting won't trigger a segment flush.
>
> And FlushByRamOrCountPolicy says:
>
>  * <li>{@link #onDelete(DocumentsWriterFlushControl,
> DocumentsWriterPerThreadPool.ThreadState)} - flushes
>  * based on the global number of buffered delete terms iff
>  * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled</li>
>
> Confused, I wrote a short unit test:
>
>   public void testMaxBufDelTerm() throws Exception {
>     Directory dir = new RAMDirectory();
>     IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT,
>         new MockAnalyzer(random()));
>     conf.setMaxBufferedDeleteTerms(1);
>     conf.setMaxBufferedDocs(10);
>     conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
>     conf.setInfoStream(new PrintStreamInfoStream(System.out));
>     IndexWriter writer = new IndexWriter(dir, conf);
>     int numDocs = 4;
>     for (int i = 0; i < numDocs; i++) {
>       Document doc = new Document();
>       doc.add(new StringField("id", "doc-" + i, Store.NO));
>       writer.addDocument(doc);
>     }
>
>     System.out.println("before delete");
>     for (String f : dir.listAll()) System.out.println(f);
>
>     writer.deleteDocuments(new Term("id", "doc-0"));
>     writer.deleteDocuments(new Term("id", "doc-1"));
>
>     System.out.println("\nafter delete");
>     for (String f : dir.listAll()) System.out.println(f);
>
>     writer.close();
>     dir.close();
>   }
>
> With InfoStream turned on, I can see messages about terms being flushed
> (they disappear if I comment out the setMaxBufferedDeleteTerms line), so I
> know this setting takes effect.
> Yet both before and after the delete operations, dir.listAll() returns
> only the fdx and fdt files.
>
> So is this a bug that a segment isn't flushed? If not (and I'm ok with
> that), is it a documentation inconsistency?
> Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer
> size, a new segment will be deleted?
>
> Slightly unrelated to FlushPolicy, but do I understand correctly that
> maxBufDelTerm does not apply to delete-by-query operations?
> BufferedDeletes doesn't increment any counter on addQuery(), so is it
> correct to assume that if I only delete-by-query, this setting has no
> effect?
> And the delete queries are buffered until the next segment is flushed due
> to other operations (constraints, commit, NRT-reopen)?
>
> Shai
>
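
To make the delete-by-query question above concrete, here is a toy model in plain Java. It is not Lucene code: the class BufferedDeletesSketch and all of its methods are hypothetical, and it simply assumes the behavior the message asks about is right (only delete terms are counted, reaching the limit applies the buffered deletes, and no segment is written). The ">=" comparison against the limit is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; a hypothetical stand-in, not Lucene's BufferedDeletes. */
final class BufferedDeletesSketch {
    private final int maxBufferedDeleteTerms;            // named after the IWC setting
    private final List<String> terms = new ArrayList<>();
    private final List<String> queries = new ArrayList<>();
    private int termCount;   // incremented by addTerm only
    private int applyCount;  // number of times buffered deletes were applied

    BufferedDeletesSketch(int maxBufferedDeleteTerms) {
        this.maxBufferedDeleteTerms = maxBufferedDeleteTerms;
    }

    /** Delete-by-term: counted against the limit (assumed behavior). */
    void addTerm(String term) {
        terms.add(term);
        termCount++;
        if (termCount >= maxBufferedDeleteTerms) {
            applyDeletes();
        }
    }

    /** Delete-by-query: buffered, but no counter is touched (assumed behavior). */
    void addQuery(String query) {
        queries.add(query);
    }

    /** Applies everything buffered so far; deliberately writes no "segment". */
    void applyDeletes() {
        terms.clear();
        queries.clear();
        termCount = 0;
        applyCount++;
    }

    int pendingQueries() { return queries.size(); }
    int pendingTerms()   { return terms.size(); }
    int applyCount()     { return applyCount; }

    public static void main(String[] args) {
        BufferedDeletesSketch buf = new BufferedDeletesSketch(1);
        buf.addQuery("id:doc-0");
        buf.addQuery("id:doc-1");
        System.out.println("applies after two queries: " + buf.applyCount());
        buf.addTerm("doc-2");
        System.out.println("applies after one term: " + buf.applyCount());
    }
}
```

In this model, two delete-by-query calls leave both queries pending and never reach the limit, while a single delete-by-term with a limit of 1 applies everything buffered, including the pending queries. Whether IndexWriter actually behaves this way is exactly what the message is asking.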
