cassandra-commits mailing list archives

From "Alwyn Davis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12649) Add BATCH metrics
Date Mon, 07 Nov 2016 09:11:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643580#comment-15643580 ]

Alwyn Davis commented on CASSANDRA-12649:
-----------------------------------------

Sure.  I ran cassandra-stress from a separate EC2 instance against a 3-node cluster, first
on trunk and then with the batch metrics patch applied.  A summary of the results follows
(the full stress output is attached):
TRUNK
--------
LOGGED
Row rate: 13,965 row/s, 95th latency: 1.4ms
Row rate: 14,351 row/s, 95th latency: 1.3ms
Row rate: 14,359 row/s, 95th latency: 1.7ms

UNLOGGED
Row rate: 14,674 row/s, 95th latency: 1.3ms
Row rate: 14,168 row/s, 95th latency: 1.7ms
Row rate: 14,128 row/s, 95th latency: 1.7ms


BATCH METRICS
--------------------
LOGGED
Row rate: 13,531 row/s, 95th latency: 1.9ms
Row rate: 13,943 row/s, 95th latency: 1.7ms
Row rate: 14,210 row/s, 95th latency: 1.7ms

UNLOGGED
Row rate: 14,552 row/s, 95th latency: 1.3ms
Row rate: 14,568 row/s, 95th latency: 1.3ms
Row rate: 14,627 row/s, 95th latency: 1.2ms
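
To make the comparison easier to read, the row rates above can be averaged per build. This is only a rough sketch: the numbers are copied from the runs listed above, while the names (row_rates, percent_change, summary) are made up for illustration and are not part of cassandra-stress or the patch.

```python
from statistics import mean

# Row rates (row/s) copied from the three cassandra-stress runs listed above.
row_rates = {
    "LOGGED": {
        "trunk": [13965, 14351, 14359],
        "patch": [13531, 13943, 14210],
    },
    "UNLOGGED": {
        "trunk": [14674, 14168, 14128],
        "patch": [14552, 14568, 14627],
    },
}


def percent_change(baseline, candidate):
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# batch type -> (trunk mean, patch mean, percent change of patch vs trunk)
summary = {}
for batch_type, runs in row_rates.items():
    trunk_mean = mean(runs["trunk"])
    patch_mean = mean(runs["patch"])
    summary[batch_type] = (
        trunk_mean,
        patch_mean,
        percent_change(trunk_mean, patch_mean),
    )

for batch_type, (trunk_mean, patch_mean, change) in summary.items():
    print(
        f"{batch_type}: trunk {trunk_mean:.0f} row/s, "
        f"patch {patch_mean:.0f} row/s, change {change:+.1f}%"
    )
```

With these figures the mean row rate moves by about -2.3% for logged batches and about +1.8% for unlogged batches. Both shifts are similar in size to the spread between individual runs of the same build, so three runs per configuration probably cannot separate a real effect from noise.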


I also ran the BatchMetrics test class, once with the metrics patch and once on plain trunk
for comparison (10,000 iterations of single queries, logged, unlogged and CAS batches):
WITH BATCH METRICS
query: 378410 ms, loggedBatch: 31638 ms, unloggedBatch: 21466 ms, cas: 33880 ms
query: 392960 ms, loggedBatch: 26284 ms, unloggedBatch: 21114 ms, cas: 30566 ms
query: 386964 ms, loggedBatch: 28724 ms, unloggedBatch: 22257 ms, cas: 33323 ms

TRUNK
query: 395683 ms, loggedBatch: 29994 ms, unloggedBatch: 21638 ms, cas: 33096 ms
query: 379503 ms, loggedBatch: 29911 ms, unloggedBatch: 21346 ms, cas: 32364 ms
query: 396935 ms, loggedBatch: 42124 ms, unloggedBatch: 21679 ms, cas: 34353 ms
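
One way to compare these timings is to take the median per operation, which is less sensitive than the mean to a single slow run (such as the 42124 ms loggedBatch run on trunk). The sketch below parses the lines exactly as printed above; the regex and the helper names (parse, medians) are assumptions for illustration, not part of the BatchMetrics test class.

```python
import re
from statistics import median

# Timing lines copied verbatim from the BatchMetrics runs listed above.
raw = {
    "patch": [
        "query: 378410 ms, loggedBatch: 31638 ms, unloggedBatch: 21466 ms, cas: 33880 ms",
        "query: 392960 ms, loggedBatch: 26284 ms, unloggedBatch: 21114 ms, cas: 30566 ms",
        "query: 386964 ms, loggedBatch: 28724 ms, unloggedBatch: 22257 ms, cas: 33323 ms",
    ],
    "trunk": [
        "query: 395683 ms, loggedBatch: 29994 ms, unloggedBatch: 21638 ms, cas: 33096 ms",
        "query: 379503 ms, loggedBatch: 29911 ms, unloggedBatch: 21346 ms, cas: 32364 ms",
        "query: 396935 ms, loggedBatch: 42124 ms, unloggedBatch: 21679 ms, cas: 34353 ms",
    ],
}

# Matches "<name>: <integer> ms" pairs within one result line.
PAIR = re.compile(r"(\w+): (\d+) ms")


def parse(line):
    """Turn one result line into {operation name: milliseconds}."""
    return {name: int(value) for name, value in PAIR.findall(line)}


def medians(lines):
    """Median time per operation across all runs of one build."""
    rows = [parse(line) for line in lines]
    return {name: median(row[name] for row in rows) for name in rows[0]}


medians_by_build = {build: medians(lines) for build, lines in raw.items()}

for name in medians_by_build["trunk"]:
    trunk_ms = medians_by_build["trunk"][name]
    patch_ms = medians_by_build["patch"][name]
    print(f"{name}: trunk {trunk_ms} ms, patch {patch_ms} ms")
```

On the figures above, the per-operation medians of the two builds are within a few percent of each other.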


Regarding the usefulness of Logged / Unlogged Partitions per batch, I believe that it would
allow correlation of batch size to performance.  Standard advice is to process a single partition
per unlogged batch, so comparing the number of partitions per batch against throughput should
highlight poor usage of batches.
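
To illustrate what a "partitions per batch" distribution would capture, here is a minimal sketch. It is not how the attached patch is implemented: the batch representation (a list of (table, partition_key) pairs) and the names partitions_per_batch, batches and histogram are all hypothetical. It follows the rule from the issue description that the same key in two different tables counts as two partitions.

```python
from collections import Counter


def partitions_per_batch(batch):
    """Count the distinct (table, partition key) pairs touched by one batch."""
    return len({(table, key) for table, key in batch})


# Hypothetical batches: one (table, partition_key) pair per mutation.
batches = [
    # Two mutations to the same partition: 1 partition.
    [("users", "alice"), ("users", "alice")],
    # Two partitions in the same table: 2 partitions.
    [("users", "alice"), ("users", "bob")],
    # Same key in two different tables: counted as 2 partitions.
    [("users", "alice"), ("events", "alice")],
    # Three keys in one table plus one in another: 4 partitions.
    [("users", "a"), ("users", "b"), ("users", "c"), ("events", "a")],
]

# Distribution: partitions per batch -> number of batches with that count.
histogram = Counter(partitions_per_batch(batch) for batch in batches)

for partitions, count in sorted(histogram.items()):
    print(f"{partitions} partition(s): {count} batch(es)")
```

A distribution concentrated at 1 for unlogged batches would match the standard advice; a long tail of multi-partition unlogged batches would be the kind of usage the metric is meant to expose.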

> Add BATCH metrics
> -----------------
>
>                 Key: CASSANDRA-12649
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12649
>             Project: Cassandra
>          Issue Type: Wish
>            Reporter: Alwyn Davis
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 12649-3.x.patch, stress-batch-metrics.tar.gz, stress-trunk.tar.gz, trunk-12649.txt
>
>
> To identify causes of load on a cluster, it would be useful to have some additional metrics:
> * *Mutation size distribution:* I believe this would be relevant when tracking the performance
>   of unlogged batches.
> * *Logged / Unlogged Partitions per batch distribution:* This would also give a count
>   of batch types processed. Multiple distinct tables in a batch would just be considered as
>   separate partitions.



