cassandra-commits mailing list archives

From "T Jake Luciani (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction
Date Fri, 18 Feb 2011 01:47:12 GMT


T Jake Luciani commented on CASSANDRA-1902:

bq. After the file is done being written, we call getCachedPages() across all sstables used
in compaction and compute which pages are hot AFTER compaction is complete. This would allow
us to then sweep through the new SSTable written and mark pages that were hot. If we do
the process while the file is being written and we have a compaction that might take an hour,
by the time it's done, the cache could churn.

I like it in theory. The only drawback I see is that it's less efficient: you need to iterate
through all the old sstables twice, once for compaction and once to match the cached pages
to the rows. Then you'd need to iterate through the new sstable to find the new row locations.

For something like compaction, where we are trying to minimize IO, it might not be worth it?
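
To make the extra passes concrete, here is a rough sketch of the post-compaction sweep, not code from the attached patch. The class, the method names, the 4096-byte page size and the key -> {offset, length} index layout are all assumptions for illustration. It supposes the hot keys of each old sstable have already been collected (the second pass over the old files) and shows the remaining pass over the new sstable's index.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.HashSet;
import java.util.TreeSet;

// Hypothetical sketch of the post-compaction sweep; not Cassandra's actual API.
class PostCompactionSweepSketch {
    static final int PAGE_SIZE = 4096; // assumed OS page size

    // Union of the hot keys found in each sstable that took part in the compaction.
    static Set<String> hotKeys(List<List<String>> activeKeysPerOldSSTable) {
        Set<String> hot = new HashSet<>();
        for (List<String> keys : activeKeysPerOldSSTable) {
            hot.addAll(keys);
        }
        return hot;
    }

    // Walk the new sstable's index (assumed shape: key -> {offset, length})
    // and collect the page numbers that hold rows for hot keys.
    static SortedSet<Long> pagesToWarm(Set<String> hot, Map<String, long[]> newIndex) {
        SortedSet<Long> pages = new TreeSet<>();
        for (Map.Entry<String, long[]> e : newIndex.entrySet()) {
            if (!hot.contains(e.getKey())) {
                continue;
            }
            long offset = e.getValue()[0];
            long length = e.getValue()[1];
            if (length <= 0) {
                continue;
            }
            long first = offset / PAGE_SIZE;
            long last = (offset + length - 1) / PAGE_SIZE;
            for (long p = first; p <= last; p++) {
                pages.add(p);
            }
        }
        return pages;
    }
}
```

The pages returned would then have to be touched or otherwise hinted to the OS, which is the additional IO the comment above is concerned about.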

> Migrate cached pages during compaction 
> ---------------------------------------
>                 Key: CASSANDRA-1902
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>             Fix For: 0.7.3
>         Attachments: 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>   Original Estimate: 32h
>          Time Spent: 24h
>  Remaining Estimate: 8h
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a pre-compacted
CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that uses the posix
mincore() function to detect the offsets of pages for this file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the keys actually
in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS cache and
make sure the corresponding pages in the new compacted SSTable are kept in the page cache for
these keys. This will minimize the impact of compacting a "hot" SSTable.
> A simpler yet similar approach is described here:
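
The pagesInPageCache() / getActiveKeys() steps in the description above could look roughly like the following. This is a minimal sketch under stated assumptions, not the attached patch: the residency array stands in for the per-page vector that mincore() fills in (the native call itself is omitted), and the IndexEntry type, the 4096-byte page size and the "any resident page makes the key active" rule are hypothetical choices.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real MmappedSegmentFile would obtain `resident` via mincore().
class PageCacheSketch {
    static final int PAGE_SIZE = 4096; // assumed OS page size

    // Hypothetical index entry: a row key and the byte range its row occupies in the data file.
    static final class IndexEntry {
        final String key;
        final long offset;
        final long length;

        IndexEntry(String key, long offset, long length) {
            this.key = key;
            this.offset = offset;
            this.length = length;
        }
    }

    // Convert a per-page residency vector into the starting offsets of the resident pages.
    static long[] pagesInPageCache(boolean[] resident) {
        List<Long> offsets = new ArrayList<>();
        for (int i = 0; i < resident.length; i++) {
            if (resident[i]) {
                offsets.add((long) i * PAGE_SIZE);
            }
        }
        long[] out = new long[offsets.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = offsets.get(i);
        }
        return out;
    }

    // Treat a key as active if at least one page touched by its row is resident.
    static List<String> getActiveKeys(List<IndexEntry> index, boolean[] resident) {
        List<String> active = new ArrayList<>();
        for (IndexEntry e : index) {
            if (e.length <= 0) {
                continue;
            }
            long first = e.offset / PAGE_SIZE;
            long last = (e.offset + e.length - 1) / PAGE_SIZE;
            for (long p = first; p <= last && p < resident.length; p++) {
                if (resident[(int) p]) {
                    active.add(e.key);
                    break;
                }
            }
        }
        return active;
    }
}
```

With a residency vector of {true, false, true}, a row that lies entirely in the second page is not reported, while a row that spills from the second page into the third is.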
