cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13530) GroupCommitLogService
Date Thu, 14 Sep 2017 16:09:07 GMT


Ariel Weisberg commented on CASSANDRA-13530:

Why are you still testing with commitlog_sync_batch_window_in_ms set to 2? It shouldn't be
set to a low number. It shouldn't exist at all, but it does, and the only way to disable it
is to set it high enough that it doesn't get in the way and cause extra syncs with
smaller-than-desired batches.

bq. I drew a diagram to help you understand.
Thanks, I get the difference between batch and group; I just don't understand the result.

What I don't understand is why the batch size doesn't increase as operations queue up waiting
to get into the next batch. If an operation can't get into the current batch it should get
into the next batch and the size of that batch should naturally increase to whatever is necessary
and then that should repeat. What is going on there that prevents batches from being the proper
size filled with backed up operations? Is it doing a bunch of syncs with just one operation
and that is what makes the performance worse? Can we fix that instead by tracking arrival
rate and adapting? Is it possible for us to get good performance at both low and high concurrency?
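
The expectation described above can be sketched with a toy model. This is not Cassandra's commit log code: the function name, the fixed arrival interval, and the fixed sync latency are all assumptions made for illustration, and arrivals are open-loop (clients do not wait for their own sync before issuing more work). A single syncer takes every operation that has arrived so far, syncs them together, and repeats.

```python
def simulate_batches(arrival_interval_ms, sync_latency_ms, total_ops):
    """Toy model (hypothetical, not Cassandra code): return a list of
    (sync_start_ms, batch_size) records, one per sync."""
    arrivals = [i * arrival_interval_ms for i in range(total_ops)]
    batches = []
    now = 0
    next_op = 0
    while next_op < total_ops:
        # If nothing is waiting, idle until the next operation arrives.
        now = max(now, arrivals[next_op])
        # Take everything that has queued up while the last sync was running.
        size = 0
        while next_op < total_ops and arrivals[next_op] <= now:
            size += 1
            next_op += 1
        batches.append((now, size))
        # The sync itself takes a fixed amount of time in this model.
        now += sync_latency_ms
    return batches


if __name__ == "__main__":
    # One record per batch, rather than an average.
    for start, size in simulate_batches(1, 5, 21):
        print(f"sync at {start} ms, batch size {size}")
```

In this model, once arrivals outpace syncs the batch size settles at roughly the sync latency divided by the arrival interval, and when arrivals are slower than a sync every batch holds a single operation. Whether the real batch service behaves this way is exactly the open question.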

I asked to see this not as an average but as a log so I could see the flow of each individual
batch along with timing information so we could answer these questions.

> GroupCommitLogService
> ---------------------
>                 Key: CASSANDRA-13530
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Yuji Ito
>            Assignee: Yuji Ito
>             Fix For: 2.2.x, 3.0.x, 3.11.x
>         Attachments: groupAndBatch.png, groupCommit22.patch, groupCommit30.patch, groupCommit3x.patch,
groupCommitLog_noSerial_result.xlsx, groupCommitLog_result.xlsx,,
> I propose a new CommitLogService, GroupCommitLogService, to improve the throughput when
lots of requests are received.
> It improved the throughput by up to 94%.
> I'd like to discuss this CommitLogService.
> Currently, we can select either of 2 CommitLog services: Periodic and Batch.
> In Periodic, we might lose some commit log data that hasn't been written to the disk.
> In Batch, we can write the commit log to the disk every time. The size of each commit log write
is too small (< 4KB). Under high concurrency, these writes are gathered and persisted to
the disk at once. But under insufficient concurrency, many small writes are issued and the
performance decreases due to the latency of the disk. Even if you use an SSD, processing many
IO commands decreases the performance.
> GroupCommitLogService writes several commit log entries to the disk at once.
> The patch adds GroupCommitLogService (it is enabled by setting `commitlog_sync` and `commitlog_sync_group_window_in_ms`
in cassandra.yaml).
> The only difference from Batch is waiting for the semaphore.
> By waiting for the semaphore, several commit log writes are executed at the same time.
> In GroupCommitLogService, the latency becomes worse if there is no concurrency.
> I measured the performance with my microbench ( by increasing
the number of threads. The cluster has 3 nodes (replication factor: 3). Each node is an AWS EC2
m4.large instance with a 200 IOPS io1 volume.
> The result is as below. The GroupCommitLogService with 10ms window improved update with
Paxos by 94% and improved select with Paxos by 76%.
> h6. SELECT / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|192|103|
> |2|163|212|
> |4|264|416|
> |8|454|800|
> |16|744|1311|
> |32|1151|1481|
> |64|1767|1844|
> |128|2949|3011|
> |256|4723|5000|
> h6. UPDATE / sec
> ||\# of threads||Batch 2ms||Group 10ms||
> |1|45|26|
> |2|39|51|
> |4|58|102|
> |8|102|198|
> |16|167|213|
> |32|289|295|
> |64|544|548|
> |128|1046|1058|
> |256|2020|2061|
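
The 94% and 76% figures in the description can be checked against the two tables. The sketch below copies the throughput numbers as given and computes the relative change of Group 10ms over Batch 2ms at each thread count; the helper name `percent_change` is made up for this example.

```python
# Throughput figures copied from the SELECT / sec and UPDATE / sec tables.
threads = [1, 2, 4, 8, 16, 32, 64, 128, 256]
select_batch = [192, 163, 264, 454, 744, 1151, 1767, 2949, 4723]
select_group = [103, 212, 416, 800, 1311, 1481, 1844, 3011, 5000]
update_batch = [45, 39, 58, 102, 167, 289, 544, 1046, 2020]
update_group = [26, 51, 102, 198, 213, 295, 548, 1058, 2061]


def percent_change(batch, group):
    """Relative change of group over batch, in whole percent."""
    return [round(100 * (g - b) / b) for b, g in zip(batch, group)]


select_change = percent_change(select_batch, select_group)
update_change = percent_change(update_batch, update_group)

if __name__ == "__main__":
    for n, s, u in zip(threads, select_change, update_change):
        print(f"{n:>3} threads: SELECT {s:+d}%, UPDATE {u:+d}%")
```

The largest gains (94% for UPDATE, 76% for SELECT) appear at 8 threads. At 1 thread Group 10ms is slower than Batch 2ms in both tables, and from 64 threads upward the two are within a few percent of each other.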
