cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-6199) Improve Stress Tool
Date Mon, 28 Oct 2013 21:10:30 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-6199:
--------------------------------

    Attachment: old.write.latency.svg
                old.read.latency.svg
                new.write.latency.svg
                new.read.latency.svg

Some more graphs to demonstrate the improved latency reporting in the new stress tool. The
old stress tool reports only a running latency with an exponential fall-off for old results, but
with the decay as configured, most runs won't actually see it kick in very effectively.

The new stress tool reports the actual latency bands for any measured interval, and accurately
combines these into an overall latency for the entire run, weighting all measurements equally.
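
To illustrate the difference between the two reporting styles, here is a minimal Python sketch. It is
not the stress tool's code: the function names, the nearest-rank percentile and the `alpha` decay
parameter are all assumptions chosen for illustration. It contrasts a single exponentially decayed
running latency with per-interval latency bands that are pooled so every measurement counts equally.

```python
import math


def percentile(sorted_samples, q):
    """Nearest-rank percentile of an already sorted list, for q in (0, 1]."""
    rank = max(1, math.ceil(q * len(sorted_samples)))
    return sorted_samples[rank - 1]


def latency_bands(samples):
    """Summarise one set of latency measurements as a few bands."""
    ordered = sorted(samples)
    return {
        "count": len(ordered),
        "median": percentile(ordered, 0.5),
        "p95": percentile(ordered, 0.95),
        "max": ordered[-1],
    }


def combined_bands(intervals):
    """Pool the raw measurements of every interval, so each measurement
    carries the same weight in the whole-run summary."""
    pooled = [x for interval in intervals for x in interval]
    return latency_bands(pooled)


def decayed_latency(samples, alpha):
    """Single running latency where older results fall off exponentially.
    alpha is a hypothetical smoothing factor, not a stress tool setting."""
    running = samples[0]
    for x in samples[1:]:
        running = alpha * x + (1 - alpha) * running
    return running


# Two measured intervals: a fast one with four samples, a slow one with two.
intervals = [[1, 1, 1, 1], [10, 10]]
per_interval = [latency_bands(i) for i in intervals]
overall = combined_bands(intervals)
running = decayed_latency([x for i in intervals for x in i], alpha=0.5)
```

In this toy run the pooled summary keeps the median at the fast latency while still exposing the slow
tail in the upper bands, whereas the decayed value collapses everything into one number dominated by
the most recent samples.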

> Improve Stress Tool
> -------------------
>
>                 Key: CASSANDRA-6199
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6199
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Minor
>         Attachments: new.read.latency.svg, new.read.rate.distribution.svg, new.write.latency.svg,
> new.write.rate.distribution.svg, old.read.latency.svg, old.read.rate.distribution.svg, old.write.latency.svg,
> old.write.rate.distribution.svg, ops.read.svg, ops.write.svg
>
>
> The stress tool could do with sprucing up. The following is a list of essential improvements
> and things that would be nice to have.
> Essential:
> - Reduce variability of results, especially start/end tails. Do not trash first/last
>   10% of readings
> - Reduce contention/overhead in stress to increase overall throughput
> - Short warm-up period, which is ignored for summary (or summarised separately), though
>   prints progress as usual. Potentially automatic detection of rate levelling.
> - Better configurability and defaults for data generation - current column generation
>   populates columns with the same value for every row, which is very easily compressible. Possibly
>   introduce partial random data generator (possibly dictionary-based random data generator)
> Nice to have:
> - Calculate and print stdev and mean
> - Add batched sequential access mode (where a single thread performs batch-size sequential
>   requests before selecting another random key) to test how key proximity affects performance
> - Auto-mode which attempts to establish the maximum throughput rate, by varying the thread
>   count (or otherwise gating the number of parallel requests) for some period, then configures
>   rate limit or thread count to test performance at e.g. 30%, 50%, 70%, 90%, 120%, 150% and
>   unconstrained.
> - Auto-mode could have a target variance ratio for mean throughput and/or latency, and
>   completes a test once this target is hit for x intervals
> - Fix key representation so it is independent of the number of keys (possibly switch to 10 digit
>   hex), and don't use String.format().getBytes() to construct it (expensive).
>   Also, remove the skip-key setting, as it is currently ignored, unless somebody knows
>   the reason for it.
> - Fix latency stats
> - Read/write mode, with configurable recency-of-reads distribution
> - Add new exponential/extreme value distribution for value size, column count and recency-of-reads
> - Support more than 2^31 keys
> - Support multiple concurrent stress inserts via key-offset parameter or similar
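
As an illustration of the key representation item above, the following Python sketch builds a
fixed-width, 10 digit hex key directly as bytes, without going through string formatting. It is only
a sketch under stated assumptions: the name `key_bytes`, the lowercase hex alphabet and the `width`
parameter are hypothetical and are not taken from the stress tool.

```python
HEX_DIGITS = b"0123456789abcdef"


def key_bytes(index, width=10):
    """Encode a non-negative key index as a fixed-width hex key.

    The width does not depend on how many keys the run uses, and ten
    hex digits cover 2**40 distinct keys, well beyond 2**31.
    """
    if index < 0 or index >= 16 ** width:
        raise ValueError("key index out of range for the chosen width")
    out = bytearray(width)
    # Fill from the least significant digit backwards, four bits at a time.
    for pos in range(width - 1, -1, -1):
        out[pos] = HEX_DIGITS[index & 0xF]
        index >>= 4
    return bytes(out)


first_key = key_bytes(0)
small_key = key_bytes(255)
large_key = key_bytes(2 ** 31)
```

Because every key has the same length regardless of the key count, keys generated for different run
sizes stay comparable, and a key-offset style parameter would only need to shift `index` before
encoding.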



--
This message was sent by Atlassian JIRA
(v6.1#6144)
