cassandra-commits mailing list archives

From "Jonathan Shook (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8929) Workload sampling
Date Fri, 06 Mar 2015 21:18:39 GMT


Jonathan Shook commented on CASSANDRA-8929:

Responding to [~jbellis], as we posted in parallel.

Short of having sampling support on the server side, I do not see us getting useful samples.
In all the environments that we operate in, the most reliable tools we have are those that
are built into Cassandra directly. This feature would allow us to stop reinventing the wheel
with users every time we need to understand what their workload is with respect to POCs and
forward planning. I've personally started leaning more and more on settraceprobability for
this, but it comes with its own caveats. Having something tailored to sampling
*just* the statements would save a lot of time and energy.

This is the type of feature for which, when you need it, there is no substitute. If we could go
into a new environment and make reasonable suggestions for how to configure sampling up front,
we would be able to simply refer back to the data for historic context, changes in workload
patterns, changes in data rates, etc.

The short answer is no: I don't know of an easier way, given all the trade-offs.

> Workload sampling
> -----------------
>                 Key: CASSANDRA-8929
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
> Workload *recording* looks to be unworkable (CASSANDRA-6572).  We could build something
> almost as useful by sampling the requests sent to a node and building a synthetic workload
> with the same characteristics using the same (or anonymized) schema.

This message was sent by Atlassian JIRA
