cassandra-commits mailing list archives

From "Joshua McKenzie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9658) Re-enable memory-mapped index file reads on Windows
Date Tue, 30 Jun 2015 19:37:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608934#comment-14608934 ]

Joshua McKenzie commented on CASSANDRA-9658:
--------------------------------------------

Set up a 3-node ccm cluster with disk_access_mode: auto, patched to allow memory-mapping of index
files on Windows. Almost everything worked without issue - the only things that gave any trouble
were:
# Sequential repair
** Throws a file access violation error when working w/snapshots
# Clearing snapshots
** Didn't reproduce any problems w/them during regular runtime, but SystemKeyspaceTest illustrates
the problem: if we have open mmap'ed readers, snapshot deletion fails (a minimal standalone illustration follows this list).
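
For context on that second point, the underlying behavior is easy to reproduce outside of Cassandra: Windows won't delete a file while a memory mapping is still open against it, whereas Linux happily unlinks it. A minimal standalone sketch (plain JDK, not Cassandra code; the class and file names are just for illustration):

{code:java}
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapDeleteDemo
{
    public static void main(String[] args) throws IOException
    {
        Path file = Files.createTempFile("snapshot-component", ".db");
        Files.write(file, new byte[4096]);

        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ))
        {
            // Open a read-only mapping, as a memory-mapped index reader would.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            buffer.get(0); // touch the mapping

            try
            {
                // On Windows this typically fails - the live mapping still holds the file open.
                // On Linux the unlink succeeds even with the mapping in place.
                Files.delete(file);
                System.out.println("delete succeeded with mapping open");
            }
            catch (IOException e)
            {
                System.out.println("delete failed with mapping open: " + e);
            }
        }
    }
}
{code}

That failed delete while a mapping is live is the same situation the snapshot code hits when index readers are still mmap'ed.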

Ran about 250M records through over the course of the day on 2 different CCM clusters, running
constant repairs and major compactions, creating and deleting snapshots, and creating and
dropping keyspaces and tables.

In the interest of tightening up mmap support on Windows, I've put together a patch that does
the following:
# Reverts to the disk_access_mode param from the yaml to determine access modes on Windows, defaulting
to auto if none is found
# Flips parallelism to RepairParallelism.PARALLEL in RepairOption and logs a warning when on
Windows and either idx or data files are non-standard
# Adds a new FileUtils.deleteRecursiveOnExit, using File.deleteOnExit to defer deletion of
specific files
# Modifies Directories.clearSnapshot to first attempt a regular deletion and, upon failure on
Windows, schedule a deferred deletion on JVM shutdown for the snapshot in question. Logs
a warning that gives the name of the folder and also indicates that users can attempt to manually
delete that folder if they see fit. The upgrade process on Windows should include either a)
a bounce of a node after upgrade if this error appears in the log, or b) advice to manually
attempt deletion of those files later. Or both. (A rough sketch of this deferred-deletion approach
follows after this list.)
# Forces SSTableRewriterTest to standard data and idx access mode when on Windows. SSTRW is
incompatible with memory-mapped I/O on Windows in its current incarnation; we'd have to postpone
mapping until the rewrite has completed, which we can pursue on another ticket. Since
SSTRW is disabled on Windows anyway, I'm ok w/hard-coding the test on the platform to that
disk access mode at this time.
# Updates SystemKeyspaceTest to confirm that the deleteOnExit approach is working as expected
# Cleans up RepairOptionsTest w/respect to the new changes.
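
For items 3 and 4, here's a rough, self-contained sketch of the deferred-deletion idea. The class, method shapes, and log message are my own illustration of the approach described above, not the actual patch:

{code:java}
import java.io.File;

/**
 * Illustrative sketch only: a recursive delete-on-exit helper and a clearSnapshot-style
 * fallback in the spirit of the FileUtils.deleteRecursiveOnExit / Directories.clearSnapshot
 * changes described above. Names and structure are assumptions, not the Cassandra code.
 */
public final class DeferredSnapshotDeletion
{
    /** Register a directory tree for deletion when the JVM exits (File.deleteOnExit). */
    public static void deleteRecursiveOnExit(File dir)
    {
        // Register the parent before its children: deleteOnExit processes entries in
        // reverse registration order, and a directory can only be removed once it is empty.
        dir.deleteOnExit();
        File[] children = dir.listFiles();
        if (children != null)
            for (File child : children)
                if (child.isDirectory())
                    deleteRecursiveOnExit(child);
                else
                    child.deleteOnExit();
    }

    /** Try an immediate recursive delete; on Windows, fall back to deletion at JVM shutdown. */
    public static void clearSnapshot(File snapshotDir)
    {
        try
        {
            deleteRecursive(snapshotDir);
        }
        catch (RuntimeException e)
        {
            if (!isWindows())
                throw e;
            deleteRecursiveOnExit(snapshotDir);
            System.err.printf("Unable to delete snapshot folder %s now; it is scheduled for " +
                              "deletion on JVM shutdown and may also be removed manually.%n",
                              snapshotDir);
        }
    }

    private static void deleteRecursive(File f)
    {
        File[] children = f.listFiles();
        if (children != null)
            for (File child : children)
                deleteRecursive(child);
        if (!f.delete())
            throw new RuntimeException("Failed to delete " + f);
    }

    private static boolean isWindows()
    {
        return System.getProperty("os.name").toLowerCase().contains("windows");
    }
}
{code}

The ordering constraint in the comment is the important bit: the snapshot directory has to be registered before its contents so that the children are gone by the time the directory itself is deleted at shutdown.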

Branch available [here|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:9658].
Unit tests pass locally - dtests on Windows are still inconsistent enough that I can't heavily
rely on them to test this change, but I'll probably kick off a dtest run against this branch
since I've gotten it down to 57 errors at this point.

> Re-enable memory-mapped index file reads on Windows
> ---------------------------------------------------
>
>                 Key: CASSANDRA-9658
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9658
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Joshua McKenzie
>            Assignee: Joshua McKenzie
>              Labels: Windows, performance
>             Fix For: 2.2.x
>
>
> It appears that the impact of buffered vs. memory-mapped index file reads has changed dramatically since last I tested. [Here are some results on various platforms we pulled together yesterday w/2.2-HEAD|https://docs.google.com/spreadsheets/d/1JaO2x7NsK4SSg_ZBqlfH0AwspGgIgFZ9wZ12fC4VZb0/edit#gid=0].
> TL;DR: On Linux we see a 40% hit in performance from 108k ops/sec on reads to 64.8k ops/sec. While surprising in itself, the really unexpected result (to me) is on Windows - with standard access we're getting 16.8k ops/sec on our bare-metal perf boxes vs. 184.7k ops/sec with memory-mapped index files, an over 10-fold increase in throughput. While testing w/standard access, CPUs on the stress machine and C* node are both sitting < 4%, the network doesn't appear bottlenecked, resource monitor doesn't show anything interesting, and performance counters in the kernel show very little. Changes in thread count simply serve to increase median latency w/out impacting any other visible metric we're measuring, so I'm at a loss as to why the disparity is so huge on the platform.
> The combination of my changes to get the 2.1 branch to behave on Windows along with [~benedict] and [~Stefania]'s changes in lifecycle and cleanup patterns on 2.2 should hopefully have us in a state where transitioning back to using memory-mapped I/O on Windows will only cause trouble on snapshot deletion. Fairly simple runs of stress w/compaction aren't popping up any obvious errors on file access or renaming - I'm going to do some much heavier testing (ccm multi-node clusters, long stress w/repair and compaction, etc.) and see if there are any outstanding issues that need to be stamped out to call mmap'ed index files on Windows safe.
> The one thing we'll never be able to support is deletion of snapshots while a node is running and sstables are mapped, but for a > 10x throughput increase I think users would be willing to make that sacrifice.
> The combination of the powercfg profile change, the kernel timer resolution, and memory-mapped index files is giving some pretty interesting performance numbers on EC2.



