cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8987) cassandra-stress should support a more complex client model
Date Wed, 18 Mar 2015 12:25:38 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367055#comment-14367055 ]

Benedict commented on CASSANDRA-8987:
-------------------------------------

This is particularly problematic for latency modelling, although we can improve that with
a simpler model, and in doing so help users model their workloads.

> cassandra-stress should support a more complex client model
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-8987
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8987
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>
> Orthogonal to CASSANDRA-8986, but still very important, is stress' simulation of clients:
currently we assume a fixed number of clients performing infinite synchronous work, whereas,
as I [argued|https://groups.google.com/forum/#!topic/mechanical-sympathy/icNZJejUHfE%5B101-125%5D]
on the mechanical sympathy mailing list, the correct model is to have a new client arrival
distribution and a distinct client model. Ideally, however, I would like to expand this to
support client models that can simulate multi-table "transactions", with both synchronous
and asynchronous steps. So, let's say we have three tables T1, T2, T3, we could say something
like:
> A client performs:
> * a registration by insert to T1 (and/or perhaps lookup in T1), multiple inserts to T2
and T2, in parallel
> * followed by a number of queries on T3
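As a rough illustration of the example client above, the steps can be written down as a small tree of sequential and parallel nodes. This is only a sketch under stated assumptions: the names `Op`, `Seq`, `Par`, `op_count` and `critical_path` are hypothetical and are not part of cassandra-stress, and "a number of queries" is arbitrarily taken to be three.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Op:
    """A single query or insert: the smallest unit of client work."""
    kind: str   # "insert" or "query"
    table: str  # e.g. "T1"


@dataclass(frozen=True)
class Seq:
    """Steps executed one after another."""
    steps: Tuple["Step", ...]


@dataclass(frozen=True)
class Par:
    """Steps issued concurrently."""
    steps: Tuple["Step", ...]


Step = Union[Op, Seq, Par]


def op_count(step: Step) -> int:
    """Total number of individual operations in the tree."""
    if isinstance(step, Op):
        return 1
    return sum(op_count(s) for s in step.steps)


def critical_path(step: Step) -> int:
    """Longest chain of dependent operations: Seq adds, Par takes the max."""
    if isinstance(step, Op):
        return 1
    lengths = [critical_path(s) for s in step.steps]
    if isinstance(step, Seq):
        return sum(lengths)
    return max(lengths, default=0)


# Hypothetical rendering of the example: registration insert to T1 and
# multiple inserts to T2 in parallel, followed by three queries on T3.
client = Seq((
    Par((Op("insert", "T1"), Op("insert", "T2"), Op("insert", "T2"))),
    Seq((Op("query", "T3"), Op("query", "T3"), Op("query", "T3"))),
))

print(op_count(client), critical_path(client))
```

The critical path is what a latency model would care about: the parallel inserts count as one step, while the queries on T3 each add to the chain.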
> Probably the best way to achieve this is with a tiered, composable "transaction" definition:
any single query or insert is a "transaction" that may itself be composed, sequentially or
in parallel, with any other to form a new macro transaction. This would then be combined
with a client arrival rate distribution to produce a total cluster workload.
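To make the arrival-rate idea concrete, here is a minimal sketch of an open workload model in which clients arrive over time instead of a fixed pool looping forever. It assumes Poisson arrivals (exponential gaps between clients) and a fixed session length per client; the proposal does not prescribe either, and the function names are hypothetical.

```python
import random
from typing import List


def arrival_times(rate_per_sec: float, duration_sec: float,
                  rng: random.Random) -> List[float]:
    """Poisson arrivals: exponential gaps with mean 1 / rate_per_sec."""
    t = 0.0
    times: List[float] = []
    while True:
        t += rng.expovariate(rate_per_sec)
        if t >= duration_sec:
            return times
        times.append(t)


def peak_concurrency(arrivals: List[float], session_sec: float) -> int:
    """Maximum number of clients active at once, given a fixed session length."""
    events = sorted(
        [(t, 1) for t in arrivals] + [(t + session_sec, -1) for t in arrivals]
    )
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def total_operations(arrivals: List[float], ops_per_client: int) -> int:
    """Cluster workload as arrivals multiplied by operations per client."""
    return len(arrivals) * ops_per_client


rng = random.Random(42)  # seeded so repeated runs produce the same schedule
arrivals = arrival_times(rate_per_sec=50.0, duration_sec=10.0, rng=rng)
print(len(arrivals), peak_concurrency(arrivals, session_sec=0.2),
      total_operations(arrivals, ops_per_client=6))
```

Unlike the fixed-client model, the number of concurrently active clients here varies with the arrival process, which is what lets queueing effects show up in measured latency.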
> At least one remaining question is whether we want the operations to be data dependent, in
which case this may well interact with CASSANDRA-8986, and probably requires a little thought.
[~jshook] [~jeromatron] [~mstump] [~tupshin] thoughts on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
