lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
Date Mon, 14 Mar 2011 15:04:30 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Simon Willnauer updated LUCENE-2573:
------------------------------------

    Attachment: LUCENE-2573.patch

next iteration on this patch. I changed some naming issues and separated a ByRAMFlushPolicy
as an abstract base class. This patch contains the original MultiTierFlushPolicy and a SingleTierFlushPolicy
that only has a low and a high watermark. This policy tries to flush the biggest DWPT once
LW is crossed and flushed all DWPT once HW is crossed. 

This patch also adds a "flush if stalled" control that hijacks indexing threads if the DW
is stalled and there are still pending flushes. If so the incoming thread tries to check out
a pending DWPT and flushes it before it adds the actual document.

I didn't benchmark the more complex MultiTierFP vs. SingleTierFP yet. I hope I get to this
soon.

> Tiered flushing of DWPTs by RAM with low/high water marks
> ---------------------------------------------------------
>
>                 Key: LUCENE-2573
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2573
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch,
LUCENE-2573.patch, LUCENE-2573.patch
>
>
> Now that we have DocumentsWriterPerThreads we need to track total consumed RAM across
all DWPTs.
> A flushing strategy idea that was discussed in LUCENE-2324 was to use a tiered approach:
 
> - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM)
> - Flush all DWPTs at a high water mark (e.g. at 110%)
> - Use linear steps in between high and low watermark:  E.g. when 5 DWPTs are used, flush
at 90%, 95%, 100%, 105% and 110%.
> Should we allow the user to configure the low and high water mark values explicitly using
total values (e.g. low water mark at 120MB, high water mark at 140MB)?  Or shall we keep for
simplicity the single setRAMBufferSizeMB() config method and use something like 90% and 110%
for the water marks?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message