cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-8987) cassandra-stress should support a more complex client model
Date Wed, 18 Mar 2015 12:28:38 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-8987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-8987:
--------------------------------
    Description: 
Orthogonal to CASSANDRA-8986, but still very important, is stress' simulation of clients:
currently we assume a fixed number of clients performing infinite synchronous work, whereas,
as I [argued|https://groups.google.com/forum/#!topic/mechanical-sympathy/icNZJejUHfE%5B101-125%5D]
on the mechanical sympathy mailing list, the correct model is to have a new client arrival
distribution and a distinct client model. Ideally, however, I would like to expand this to
support client models that can simulate multi-table "transactions", with both synchronous
and asynchronous steps. So, let's say we have three tables T1, T2, T3, we could say something
like:

A client performs:
* a registration by insert to T1 (and/or perhaps lookup in T1), multiple inserts to T2 and
T2, in parallel
* followed by a number of queries on T3

Probably the best way to achieve this is with a tiered "transaction" definition that can be
composed, so that any single query or insert is a "transaction" that itself may be sequentially
or in parallel composed with any other to compose a new macro transaction. This would then
be combined with a client arrival rate distribution to produce a total cluster workload.
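
To make the composition idea concrete, here is a minimal Python sketch. It is not cassandra-stress code and not a proposed API: every name in it (op, sequential, parallel, arrival_times, run_workload, example_client) is hypothetical, the leaf operations only append to a log instead of talking to a cluster, and a Poisson arrival process is assumed purely as one possible arrival distribution. The example client follows one reading of the steps listed above (the T1 insert and two T2 inserts in parallel, then two T3 queries).

```python
import asyncio
import random

# Hypothetical sketch: none of these names exist in cassandra-stress.

def op(name, log):
    """Leaf transaction: stands in for a single query or insert."""
    async def run():
        await asyncio.sleep(0)  # placeholder for an asynchronous driver call
        log.append(name)
    return run

def sequential(*steps):
    """Compose transactions so each starts only after the previous one completes."""
    async def run():
        for step in steps:
            await step()
    return run

def parallel(*steps):
    """Compose transactions so all are started together and awaited jointly."""
    async def run():
        await asyncio.gather(*(step() for step in steps))
    return run

def arrival_times(rate_per_sec, count, rng):
    """Assumed Poisson arrivals: cumulative sums of exponential inter-arrival gaps."""
    t, times = 0.0, []
    for _ in range(count):
        t += rng.expovariate(rate_per_sec)
        times.append(t)
    return times

async def run_workload(make_client, rate_per_sec, count, seed=0):
    """Start one client per arrival, whether or not earlier clients have finished."""
    rng = random.Random(seed)
    loop = asyncio.get_running_loop()
    start = loop.time()
    tasks = []
    for at in arrival_times(rate_per_sec, count, rng):
        # Wait until this client's arrival time, then launch it without blocking on it.
        await asyncio.sleep(max(0.0, start + at - loop.time()))
        tasks.append(asyncio.create_task(make_client()()))
    await asyncio.gather(*tasks)

def example_client(log):
    """One macro transaction built from smaller ones."""
    return sequential(
        parallel(op("insert T1", log), op("insert T2", log), op("insert T2", log)),
        sequential(op("query T3", log), op("query T3", log)),
    )

log = []
asyncio.run(run_workload(lambda: example_client(log), rate_per_sec=1000.0, count=3))
```

Because sequential and parallel return the same kind of object they accept, any composed transaction can itself be used as a step in a larger one, and run_workload launches clients on the arrival schedule rather than keeping a fixed pool of clients busy.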

At least one remaining question is if we want the operations to be data dependent, in which
case this may well interact with CASSANDRA-8986, and probably requires a little thought. [~jshook]
[~jeromatron] [~mstump] [~tupshin] [~jlacefie] thoughts on this?

  was:
Orthogonal to CASSANDRA-8986, but still very important, is stress' simulation of clients:
currently we assume a fixed number of clients performing infinite synchronous work, whereas,
as I [argued|https://groups.google.com/forum/#!topic/mechanical-sympathy/icNZJejUHfE%5B101-125%5D]
on the mechanical sympathy mailing list, the correct model is to have a new client arrival
distribution and a distinct client model. Ideally, however, I would like to expand this to
support client models that can simulate multi-table "transactions", with both synchronous
and asynchronous steps. So, let's say we have three tables T1, T2, T3, we could say something
like:

A client performs:
* a registration by insert to T1 (and/or perhaps lookup in T1), multiple inserts to T2 and
T2, in parallel
* followed by a number of queries on T3

Probably the best way to achieve this is with a tiered "transaction" definition that can be
composed, so that any single query or insert is a "transaction" that itself may be sequentially
or in parallel composed with any other to compose a new macro transaction. This would then
be combined with a client arrival rate distribution to produce a total cluster workload.

At least one remaining question is if we want the operations to be data dependent, in which
case this may well interact with CASSANDRA-8986, and probably requires a little thought. [~jshook]
[~jeromatron] [~mstump] [~tupshin] thoughts on this?


> cassandra-stress should support a more complex client model
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-8987
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8987
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tools
>            Reporter: Benedict
>            Assignee: Benedict
>
> Orthogonal to CASSANDRA-8986, but still very important, is stress' simulation of clients:
> currently we assume a fixed number of clients performing infinite synchronous work, whereas,
> as I [argued|https://groups.google.com/forum/#!topic/mechanical-sympathy/icNZJejUHfE%5B101-125%5D]
> on the mechanical sympathy mailing list, the correct model is to have a new client arrival
> distribution and a distinct client model. Ideally, however, I would like to expand this to
> support client models that can simulate multi-table "transactions", with both synchronous
> and asynchronous steps. So, let's say we have three tables T1, T2, T3, we could say something
> like:
>
> A client performs:
> * a registration by insert to T1 (and/or perhaps lookup in T1), multiple inserts to T2 and
> T2, in parallel
> * followed by a number of queries on T3
>
> Probably the best way to achieve this is with a tiered "transaction" definition that can be
> composed, so that any single query or insert is a "transaction" that itself may be sequentially
> or in parallel composed with any other to compose a new macro transaction. This would then
> be combined with a client arrival rate distribution to produce a total cluster workload.
>
> At least one remaining question is if we want the operations to be data dependent, in which
> case this may well interact with CASSANDRA-8986, and probably requires a little thought. [~jshook]
> [~jeromatron] [~mstump] [~tupshin] [~jlacefie] thoughts on this?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
