cassandra-commits mailing list archives

From "Ryan McGuire (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6916) Preemptive opening of compaction result
Date Tue, 22 Apr 2014 05:02:15 GMT

[ https://issues.apache.org/jira/browse/CASSANDRA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13976389#comment-13976389 ]

Ryan McGuire edited comment on CASSANDRA-6916 at 4/22/14 5:00 AM:
------------------------------------------------------------------

[~benedict] here's the last test you wanted:

Test plan (the two settings are sketched just below):
 * Set preheat_kernel_page_cache: true in cassandra.yaml.
 * Start up C*, create the keyspace and CFs that stress would create, modifying populate_io_cache_on_flush
to true.
 * Regular stress write (same as the other tests above)
 * Mixed stress run (same as the other tests above)
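
For concreteness, the two knobs in that plan look roughly like this (a sketch only: both
options existed on the 2.1 line at the time, and keyspace1/standard1 is just the default
table stress writes to, so substitute whatever schema the earlier runs used):

    # cassandra.yaml
    preheat_kernel_page_cache: true

    -- CQL, for each table stress writes to
    ALTER TABLE keyspace1.standard1 WITH populate_io_cache_on_flush = true;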

You had told me offline that those settings were not applicable to your patch, but I went
ahead and tried them anyway, to keep the comparison apples to apples. Results with those
settings on for both stock 2.1 and your patched version:

http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.6916v3-preempive-open-compact.mixed.cache_tweaks.2.json&metric=op_rate&operation=mixed&smoothing=1&xmin=0&xmax=341.55&ymin=0&ymax=97262

I noticed that on your branch I got several timeouts during the mixed operation. Check out
the max latencies metric on that graph.

However, doing the original test you wanted me to do, which is setting populate_io_cache_on_flush: true
and preheat_kernel_page_cache: true ONLY on stock 2.1 and comparing to your branch without
those settings, I get:

http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.6916v3-preempive-open-compact.mixed.cache_tweaks.3.json&metric=op_rate&operation=mixed&smoothing=1&xmin=0&xmax=319&ymin=0&ymax=118288.5

This still shows your branch doing slightly better than stock. So it does appear that those
settings still affect something.

New logs:
 * stock C* 2.1 with cache settings on: 6916-stock2_1.mixed.cache_tweaks.tar.gz
 * 6916v3-premptive-open-compact with cache settings on: 6916v3-premptive-open-compact.mixed.cache_tweaks.2.tar.gz

There are some interesting errors in that second set of logs, worth checking out.
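
If you want to skim them quickly, something like this works (hypothetical paths: it assumes
the tarball unpacks to directories containing the usual system.log files):

    tar xzf 6916v3-premptive-open-compact.mixed.cache_tweaks.2.tar.gz
    grep -rn -E "ERROR|Exception|[Tt]imed? ?out" . --include=system.log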

Furthermore, when testing CASSANDRA-6746 I hadn't realized that we could actually fare this
well with the existing options; I only knew they weren't on by default. It would still be
great if the default settings made this test pass, which your branch does.



> Preemptive opening of compaction result
> ---------------------------------------
>
>                 Key: CASSANDRA-6916
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6916
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 2.1
>
>         Attachments: 6916-stock2_1.mixed.cache_tweaks.tar.gz, 6916-stock2_1.mixed.logs.tar.gz,
> 6916v3-preempive-open-compact.logs.gz, 6916v3-preempive-open-compact.mixed.2.logs.tar.gz,
> 6916v3-premptive-open-compact.mixed.cache_tweaks.2.tar.gz
>
>
> Related to CASSANDRA-6812, but a little simpler: when compacting, we mess quite badly
> with the page cache. One thing we can do to mitigate this problem is to use the sstable we're
> writing before we've finished writing it, and to drop the regions from the old sstables from
> the page cache as soon as the new sstables have them (even if they're only written to the
> page cache). This should minimise any page cache churn, as the old sstables must be larger
> than the new sstable, and since both will be in memory, dropping the old sstables is at least
> as good as dropping the new.
> The approach is quite straightforward. Every X MB written:
> # grab the flushed length of the index file;
> # grab the second-to-last index summary record, after excluding those that point to positions
> after the flushed length;
> # open the index file, and check that our last record doesn't occur outside of the flushed
> length of the data file (pretty unlikely);
> # open the sstable with the calculated upper bound.
> Some complications:
> # must keep a running copy of the compression metadata to reopen the sstable with
> # we need to be able to replace an sstable with itself, but with a different lower bound
> # we need to drop the old page cache only when readers have finished
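
To make the four steps above concrete, here is a minimal hypothetical Java sketch of that
bookkeeping. Every identifier in it is invented for illustration; none of it is taken from
the patch itself, which does this inside Cassandra's sstable writing machinery:

    import java.util.List;

    // Hypothetical sketch of the "every X MB written" loop described above.
    class PreemptiveOpenSketch
    {
        static final long INTERVAL_BYTES = 50L << 20; // "X MB" -- value arbitrary here
        private long lastOpenedAt = 0;

        // Stand-in for an index summary record: a sampled key's offsets in the
        // index and data files.
        static class SummaryEntry
        {
            final long indexPosition;
            final long dataPosition;
            SummaryEntry(long indexPosition, long dataPosition)
            {
                this.indexPosition = indexPosition;
                this.dataPosition = dataPosition;
            }
        }

        void maybeOpenEarly(long dataBytesWritten, long dataFlushedLength,
                            long indexFlushedLength, List<SummaryEntry> summary)
        {
            if (dataBytesWritten - lastOpenedAt < INTERVAL_BYTES)
                return;

            // Steps 1-2: take the second-to-last summary record that falls
            // within the flushed length of the index file.
            SummaryEntry bound = null;
            int found = 0;
            for (int i = summary.size() - 1; i >= 0 && found < 2; i--)
            {
                if (summary.get(i).indexPosition <= indexFlushedLength)
                {
                    bound = summary.get(i);
                    found++;
                }
            }

            // Step 3: the chosen record must not point past the flushed length
            // of the data file (pretty unlikely, per the description).
            if (found < 2 || bound.dataPosition > dataFlushedLength)
                return;

            // Step 4: open the partially written sstable for reads up to the
            // bound; the old sstables' pages covering that range can then be
            // dropped once readers have finished with them.
            openReaderWithUpperBound(bound.dataPosition);
            lastOpenedAt = dataBytesWritten;
        }

        // Hypothetical hook: the real work (opening a reader, replacing the
        // sstable in place, evicting old pages) happens elsewhere.
        void openReaderWithUpperBound(long dataUpperBound) {}
    }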



--
This message was sent by Atlassian JIRA
(v6.2#6252)
