cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6146) CQL-native stress
Date Mon, 30 Jun 2014 11:48:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047575#comment-14047575 ]

Benedict edited comment on CASSANDRA-6146 at 6/30/14 11:48 AM:
---------------------------------------------------------------

I've pushed a version of these changes [here|https://github.com/belliottsmith/cassandra/tree/6146-cqlstress]

I wanted to integrate the changes a bit more tightly with the old stress, so that we don't
simply end up with two different stresses that are only nominally related. At the same time
I wanted to address a few things I felt were important to set up so that future improvements
are easy to introduce:

# We now generate partitions predictably, so when we perform queries we can be sure we're
using data that is relevant to the partition we're operating over
# We explicitly generate multi-row partitions, with configurable distribution of clustering
components
# We can support multiple queries / inserts simultaneously in the new path
# The new path is executed with a more standard syntax (it's executed with stress user, instead
of stress write/read; it can perform e.g. inserts/queries with "stress user ops(insert=1,query=10)"
for a roughly 90/10 read/write workload)
# I've switched all configs to support the full range of distributions we could use previously
(including for size, etc.)
# All old paths use the same partition generators as the new paths to keep maintenance and
extension simpler
# I've moved a few more config parameters into the yaml
# We report partition and row statistics now
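
To illustrate the first few points, here is a minimal Python sketch; it is not code from the branch. It assumes a partition is derived entirely from a numeric seed, so any operation that knows the seed can reproduce the partition, and it shows how relative weights such as ops(insert=1,query=10) translate into a workload mix. The names op_mix, generate_partition and choose_op, the uniform row-count range, and the key layout are all hypothetical.

```python
import random

def op_mix(ratios):
    """Convert relative op weights, e.g. {"insert": 1, "query": 10}, into fractions."""
    total = sum(ratios.values())
    return {name: weight / total for name, weight in ratios.items()}

def generate_partition(seed, min_rows=1, max_rows=5):
    """Build a multi-row partition deterministically from its seed.

    Assumption: a uniform row count stands in for the configurable
    clustering distribution described above.
    """
    rng = random.Random(seed)  # private RNG: same seed always yields the same partition
    row_count = rng.randint(min_rows, max_rows)
    rows = [(clustering, rng.getrandbits(32)) for clustering in range(row_count)]
    return {"key": f"key-{seed}", "rows": rows}

def choose_op(ratios, rng):
    """Pick the next operation name with probability proportional to its weight."""
    names = list(ratios)
    return rng.choices(names, weights=[ratios[n] for n in names], k=1)[0]
```

With weights insert=1 and query=10, op_mix gives 1/11 (about 9%) inserts and 10/11 (about 91%) queries, which is the roughly 90/10 read/write split mentioned above. Calling generate_partition twice with the same seed returns identical data, which is the property that lets a query target rows an earlier insert wrote.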

Some other implications:
# To simplify matters and maintenance, I've stripped from the old paths support for super
columns, indexes and multi-gets, as we did not typically seem to exercise these paths and
these are probably best encapsulated with the new ones
# The old path now generates a lot more garbage, because the new path has to, so it will have
slightly higher overhead than it did previously. We also only generate random data on the
old path, so we may again see a decline in performance

Some things are still to do in the near future; all of them are reasonably easy, but I wanted
to limit the scope of the refactor:
# Support deletes
# Support partial inserts/deletes (currently insert only supports writing the whole partition)
# Support query result validation
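
Query result validation is not implemented yet, so the following is only a sketch of one way predictable generation could make it feasible; it is an assumption, not the planned design. The idea is to regenerate the rows a seed should have produced and compare them with what a query returned. The names expected_rows and validate and the fixed row count are hypothetical.

```python
import random

def expected_rows(seed, row_count=3):
    """Regenerate the rows a seed would have produced (hypothetical row layout)."""
    rng = random.Random(seed)
    return [(clustering, rng.getrandbits(32)) for clustering in range(row_count)]

def validate(seed, returned_rows, row_count=3):
    """Return True when the rows read back match the rows the seed implies.

    Rows are sorted first, so the order in which they were returned does not matter.
    """
    return sorted(returned_rows) == expected_rows(seed, row_count)
```

Because the expected rows are recomputed on demand, a check like this would not need to keep a copy of everything that was written.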

The diff is quite big, but I think a lot of the changes are due to package movements. The
basic functionality of your patch is left intact, so hopefully it shouldn't be too tricky
to figure out what's happening now.


> CQL-native stress
> -----------------
>
>                 Key: CASSANDRA-6146
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6146
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Tools
>            Reporter: Jonathan Ellis
>            Assignee: T Jake Luciani
>             Fix For: 2.1.1
>
>         Attachments: 6146-v2.txt, 6146.txt, 6164-v3.txt
>
>
> The existing CQL "support" in stress is not worth discussing.  We need to start over,
> and we might as well kill two birds with one stone and move to the native protocol while we're
> at it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
