db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: page replacement [jira] Commented: (DERBY-239) Need a online backup feature that does not block update operations when online backup is in progress.
Date Mon, 25 Jul 2005 15:42:25 GMT
I also agree that page cache enhancement is interesting, but it should
probably be tackled as a separate project.  Keeping this goal in mind
while making changes for backup is a good thing, though.  An interface
that allows backup to use/reuse a single buffer in the page cache seems
reasonable.  Specializing it would seem to allow some optimizations
where free page searching could be avoided for this operation, which at
a very low level is going to be pushing/pulling pages as fast as possible.
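
To illustrate only the "push/pull pages through one reused buffer" part
of this, here is a minimal sketch in plain java.nio.  It is not a Derby
interface: the class name SingleBufferCopy, the method copyPages and the
pageSize parameter are all made up for the example, and it copies a file
directly rather than going through the page cache.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; none of these names are Derby APIs.
class SingleBufferCopy {

    /**
     * Copies 'source' to 'target' one page at a time through a single
     * reused buffer. Returns the number of pages written (a short last
     * page counts as one page).
     */
    static long copyPages(Path source, Path target, int pageSize) throws IOException {
        ByteBuffer page = ByteBuffer.allocate(pageSize); // the one buffer, reused
        long pages = 0;
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            while (true) {
                page.clear();
                // Pull: fill the buffer until it is full or the file ends.
                while (page.hasRemaining() && in.read(page) != -1) {
                    // keep reading
                }
                if (page.position() == 0) {
                    break; // nothing left to copy
                }
                // Push: write the page out, then loop to reuse the buffer.
                page.flip();
                while (page.hasRemaining()) {
                    out.write(page);
                }
                pages++;
            }
        }
        return pages;
    }
}
```

Because the same buffer is cleared and refilled on every iteration, the
loop never has to look for a free page, which is the optimization the
paragraph above is hoping for.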

I have seen the following ideas work well in a weight-based page cache.
They try to limit the overhead of weights by using multiple LRU queues,
while still keeping some of the benefit of a weight-based scheme:
1) have a much smaller range than 0-100, something like 5, where each
    value is its own LRU queue.  This reduces the overhead of searching
    and sorting based on weight.
2) as Dan suggests, something like:
    no weight: free list
    0: backup page, linear scan heap pages, read ahead
    1: probe accessed heap page
    2: leaf page
    3: non-leaf page
    4: root
3) to account for re-reference, pages move up in value when
    re-referenced.  Revaluing happens only when the page is accessed,
    so the page is already latched, which limits the additional
    overhead needed to reweigh it.
    Various methods can be used for moving down in value:
     o whole queues at a time
     o individual pages in LRU order, based on some sort of clock,
       like the current clock
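
The three ideas above can be sketched roughly as follows.  This is only
an illustration under assumptions, not Derby's cache code: the class
WeightedLruCache and its methods add, touch, demoteQueue and weight are
hypothetical names, pages are represented by bare ids, and latching is
left out entirely.  Only the "whole queues at a time" aging variant is
shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are Derby APIs.
class WeightedLruCache {
    static final int LEVELS = 5; // weights 0..4, as in the list above

    private final int capacity;
    // One LRU queue per weight; iteration order is oldest first.
    private final List<LinkedHashSet<Long>> queues = new ArrayList<>();
    // Current weight of every cached page id.
    private final Map<Long, Integer> weightOf = new HashMap<>();

    WeightedLruCache(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        for (int i = 0; i < LEVELS; i++) {
            queues.add(new LinkedHashSet<>());
        }
    }

    /** Adds a page with an initial weight; returns the evicted page id, or null. */
    Long add(long pageId, int weight) {
        if (weight < 0 || weight >= LEVELS) {
            throw new IllegalArgumentException("weight out of range");
        }
        if (weightOf.containsKey(pageId)) {
            touch(pageId); // already cached: treat as a re-reference
            return null;
        }
        Long evicted = (weightOf.size() >= capacity) ? evictOne() : null;
        queues.get(weight).add(pageId);
        weightOf.put(pageId, weight);
        return evicted;
    }

    /** Re-reference: move the page up one level (capped) and to the young end. */
    boolean touch(long pageId) {
        Integer w = weightOf.get(pageId);
        if (w == null) {
            return false;
        }
        queues.get(w).remove(pageId);
        int next = Math.min(w + 1, LEVELS - 1);
        queues.get(next).add(pageId);
        weightOf.put(pageId, next);
        return true;
    }

    /** Ages a whole queue at once: every page at 'level' drops to level - 1. */
    void demoteQueue(int level) {
        if (level <= 0 || level >= LEVELS) {
            return;
        }
        LinkedHashSet<Long> from = queues.get(level);
        for (Long id : from) {
            queues.get(level - 1).add(id);
            weightOf.put(id, level - 1);
        }
        from.clear();
    }

    /** Evicts the oldest page of the lowest non-empty queue. */
    private Long evictOne() {
        for (LinkedHashSet<Long> q : queues) {
            Iterator<Long> it = q.iterator();
            if (it.hasNext()) {
                Long victim = it.next();
                it.remove();
                weightOf.remove(victim);
                return victim;
            }
        }
        return null;
    }

    /** Current weight of a cached page, or null if it is not cached. */
    Integer weight(long pageId) {
        return weightOf.get(pageId);
    }
}
```

With a capacity of 3, adding a backup page at weight 0, a leaf at 2 and
a root at 4, and then adding a fourth page, evicts the backup page
first, since victim selection never has to sort by weight: it just
takes the head of the lowest non-empty queue.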

Øystein Grøvlen wrote:
>>>>>>"DJD" == Daniel John Debrunner <djd@debrunners.com> writes:
>     DJD> I think modifications to the cache would be useful for b), so
>     DJD> that entries in the cache (through generic apis, not specific
>     DJD> to store) could mark how "useful/valuable" they are. Just a
>     DJD> simple scheme, lower numbers less valuable, higher numbers
>     DJD> more valuable, and if it makes it easier to fix a range,
>     DJD> e.g. 0-100, then that would be ok. Then the store could add
>     DJD> pages to the cache with this weighting, e.g. (to get the
>     DJD> general idea)
>     DJD>      pages for backup - weight 0
>     DJD>      overflow column pages - weight 10
>     DJD>      regular pages - weight 20
>     DJD>      leaf index pages - weight 30
>     DJD>       root index pages 80
>     DJD> This weight would then be factored into the decision to throw pages out
>     DJD> or not.
> I agree that we need some mechanism to prevent operations from filling
> the cache with pages that are not likely to be accessed again in the
> near future.  However, I am afraid that a very detailed "cost-based"
> scheme may create a significant overhead compared to a simple LRU
> scheme.
> One may operate with separate LRU queues for different weights, but I
> guess the number of possible weights will have to be restricted in
> that case.
> I am also not convinced that it is the type of page that is the most
> important criterion for caching.  What matters is access frequency.
> The page type may give a hint, but leaf pages of one index may be more
> frequently accessed than root pages of other indexes.
> The type of access is also a relevant criterion.  Pages accessed
> sequentially are often less likely to be accessed again in the near
> future than pages accessed by direct lookup.  A separate LRU queue for
> sequentially accessed pages may prevent backup and other sequential
> scans (e.g., select * from t) from throwing out directly accessed
> pages (e.g., index pages and data pages accessed through indexes).
>     DJD> This project could be independent of the online backup and could have
>     DJD> benefits elsewhere.
> I agree.
