cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-6572) Workload recording / playback
Date Wed, 23 Jul 2014 14:14:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071759#comment-14071759 ]

Benedict edited comment on CASSANDRA-6572 at 7/23/14 2:13 PM:
--------------------------------------------------------------

It looks to me like you need some way to share prepared statements across threads, since a
statement can be used by any thread (and across log segments) once prepared. Probably easiest to
do it during parsing of the log file.
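
A minimal sketch of what sharing the preparation could look like, assuming Java 8+ and a hypothetical SharedStatementCache class (neither the class nor the preparer callback is from the attached patch). The parser would call getOrPrepare when it meets a statement, and every replay thread would then see the same prepared object:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical cache: one prepared statement per query string, shared by all replay threads. */
final class SharedStatementCache<S> {
    private final ConcurrentMap<String, S> prepared = new ConcurrentHashMap<>();
    // Stand-in for whatever call actually prepares a statement; an assumption of this sketch.
    private final Function<String, S> preparer;

    SharedStatementCache(Function<String, S> preparer) {
        this.preparer = preparer;
    }

    /** Returns the shared prepared form of the query, preparing it on first sight. */
    S getOrPrepare(String query) {
        // computeIfAbsent applies the preparer at most once per key, even under contention.
        return prepared.computeIfAbsent(query, preparer);
    }

    int size() {
        return prepared.size();
    }
}
```

Because the cache is keyed by query string rather than by thread or segment, a statement prepared while parsing one segment is available to any thread replaying a later one.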

We also have an issue with replay potentially over-parallelizing, and also potentially OOMing,
as you're submitting straight to a thread pool after parsing each file. There's nothing
stopping us racing ahead and reading all of the log files (you have an unbounded queue), and
since you submit each file separately you will spawn a thread/executor for each thread/segment
combination, rather than one for each thread.

Probably we want to create some separate state to represent a thread, which we create once,
the first time we see a thread id, and insert into a map. The state holds a work queue, and we
place work directly onto this queue during parsing of all segments. We can submit a runnable
immediately for processing this queue to represent a thread. We have a potential problem here,
though, which is that we do not know if a thread died, so we can fill up the executor pool; for
now let's use an unbounded executor pool and leave tackling this properly until we have
everything else in place. We should then limit how many queries ahead we can read to prevent OOM.
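
Roughly, the per-thread state could be sketched as below. This is only an illustration under assumptions: ThreadedReplayer, maxReadAhead and the runQuery callback are hypothetical names, queries are plain strings, and the read-ahead limit is applied per thread via a bounded queue (a single global limit would be an equally valid reading).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical replayer: one queue and one runnable per logged thread id. */
final class ThreadedReplayer {
    // Sentinel compared by identity; new String(...) ensures it is never a real query object.
    private static final String END_OF_LOG = new String("<end-of-log>");

    private final ConcurrentMap<Long, BlockingQueue<String>> queues = new ConcurrentHashMap<>();
    // Unbounded pool for now; dead threads are not detected in this sketch.
    private final ExecutorService executor = Executors.newCachedThreadPool();
    private final int maxReadAhead;
    private final Consumer<String> runQuery;

    ThreadedReplayer(int maxReadAhead, Consumer<String> runQuery) {
        this.maxReadAhead = maxReadAhead;
        this.runQuery = runQuery;
    }

    /** Called by the parser for every query of every segment; blocks while the queue is full. */
    void enqueue(long threadId, String query) {
        BlockingQueue<String> queue = queues.computeIfAbsent(threadId, id -> startThread());
        put(queue, query);
    }

    // Creates the state for a newly seen thread id and submits its runnable immediately.
    private BlockingQueue<String> startThread() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(maxReadAhead);
        executor.execute(() -> drain(queue));
        return queue;
    }

    private void drain(BlockingQueue<String> queue) {
        try {
            for (String q = queue.take(); q != END_OF_LOG; q = queue.take())
                runQuery.accept(q);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Called once after all segments are parsed; returns true if every thread finished in time. */
    boolean finish(long timeout, TimeUnit unit) {
        for (BlockingQueue<String> queue : queues.values())
            put(queue, END_OF_LOG);
        executor.shutdown();
        try {
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    private static void put(BlockingQueue<String> queue, String item) {
        try {
            queue.put(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while enqueueing", e);
        }
    }
}
```

With this shape the number of runnables equals the number of distinct thread ids rather than thread/segment combinations, and the parser cannot get more than maxReadAhead queries ahead of any one thread.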

Also, we're still replaying based on _offset_ from the last query, which means we will skew very
quickly. We should fix an epoch (in nanos) such that you have a log epoch of L, and
queries are run at T=L+X; when re-run we have a replay epoch of R, and we run queries at R+X.
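
The epoch arithmetic could be sketched as follows, assuming logged timestamps and the replay epoch are both nanosecond values, with the replay epoch taken from System.nanoTime(). EpochReplayClock and its method names are hypothetical. Each deadline is computed from the fixed epochs rather than from the previous query, so a late query does not push every later one back:

```java
import java.util.concurrent.locks.LockSupport;

/** Hypothetical clock mapping logged timestamps T = L + X onto replay deadlines R + X. */
final class EpochReplayClock {
    private final long logEpochNanos;    // L: epoch fixed when the log was written
    private final long replayEpochNanos; // R: epoch fixed when replay starts

    EpochReplayClock(long logEpochNanos, long replayEpochNanos) {
        this.logEpochNanos = logEpochNanos;
        this.replayEpochNanos = replayEpochNanos;
    }

    /** Returns R + X for a query that was logged at T = L + X. */
    long replayDeadlineNanos(long loggedNanos) {
        long offset = loggedNanos - logEpochNanos; // X
        return replayEpochNanos + offset;
    }

    /** Waits until System.nanoTime() reaches the query's deadline, or the thread is interrupted. */
    void awaitTurn(long loggedNanos) {
        long deadline = replayDeadlineNanos(loggedNanos);
        long remaining;
        // Compare via subtraction so the check stays correct if nanoTime values wrap.
        while ((remaining = deadline - System.nanoTime()) > 0
               && !Thread.currentThread().isInterrupted())
            LockSupport.parkNanos(remaining);
    }
}
```

For example, with L = 1000 and R = 50000, a query logged at T = 1250 has X = 250 and is due at 50250 on replay, regardless of when the preceding query actually ran.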


was (Author: benedict):
It looks to me like you need some way to share the statement preparation across threads, as
it can be used by any thread (and across log segments) once prepared. Probably easiest to
do it during parsing of the log file. 

We also have an issue with replay potentially over-parallelizing, and also potentially OOMing,
as you're submitting straight to a thread pool after parsing each file. So there's nothing
stopping us racing ahead and reading all of the log files (you have an unbounded queue), but
since you submit each file separately you will spawn a thread/executor for each thread/segment
combination, rather than each thread.

Probably we want to create some separate state to represent a thread, which we create once
the first time we see a thread id, insert it into a map, and then place work directly onto
this queue during parsing of all segments. We can submit a runnable immediately for processing
this queue to represent a thread. We have a potential problem here, though, which is that
we do not know if a thread died, so we can fill up the executor pool; for now let's
use an unbounded executor pool and leave tackling this properly until we have everything else
in place. We should then limit how many queries ahead we can read to prevent OOM.

Also, we're still replaying based on _offset_ from last query, which means we will skew very
quickly. We should be fixing an epoch (in nanos) when writing the log, and on replay, and
we ensure that each query is replayed at a time as close to its offset during replay as it
was offset from the log epoch when first run (i.e. you have a log epoch of L, and queries
are run at T=L+X; when re-run we have a replay epoch of R, and we run queries at R+X)

> Workload recording / playback
> -----------------------------
>
>                 Key: CASSANDRA-6572
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6572
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core, Tools
>            Reporter: Jonathan Ellis
>            Assignee: Lyuben Todorov
>             Fix For: 2.1.1
>
>         Attachments: 6572-trunk.diff
>
>
> "Write sample mode" gets us part way to testing new versions against a real world workload,
> but we need an easy way to test the query side as well.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
