cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Piotr Kołaczkowski (JIRA) <>
Subject [jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes
Date Mon, 29 Dec 2014 09:28:14 GMT


Piotr Kołaczkowski commented on CASSANDRA-7937:

I like the idea of using parallelism level as a backpressure mechanism. That would have a
nice positive effect of automatically reducing the amount of memory used for queuing the requests.

However, my biggest concern is, that even limiting a single client to one write at a time
(window size = 1), might still be too fast, for some fast clients, if only row sizes are big
enough, particularly when writing big cells of data, where big = hundreds of kB / single MBs
per cell. Cassandra is extremely efficient at ingesting data into memtables. If it was faster
than we're able to flush, then we still have a problem. 

So I guess if, after going down to parallelism level = 1 and still being too fast (e.g. flush
queue full, last memtable almost full), we could tell the client a message "please do not
send data faster than X MB/s now" and the client (driver) could do some artificial delay before
processing the next request.

> Apply backpressure gently when overloaded with writes
> -----------------------------------------------------
>                 Key: CASSANDRA-7937
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.0
>            Reporter: Piotr Kołaczkowski
>              Labels: performance
> When writing huge amounts of data into C* cluster from analytic tools like Hadoop or
Apache Spark, we can see that often C* can't keep up with the load. This is because analytic
tools typically write data "as fast as they can" in parallel, from many nodes and they are
not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of
nodes doesn't really help, because in a collocated setup this also increases number of Hadoop/Spark
nodes (writers) and although possible write performance is higher, the problem still remains.
> We observe the following behavior:
> 1. data is ingested at an extreme fast pace into memtables and flush queue fills up
> 2. the available memory limit for memtables is reached and writes are no longer accepted
> 3. the application gets hit by "write timeout", and retries repeatedly, in vain 
> 4. after several failed attempts to write, the job gets aborted 
> Desired behaviour:
> 1. data is ingested at an extreme fast pace into memtables and flush queue fills up
> 2. after exceeding some memtable "fill threshold", C* applies adaptive rate limiting
to writes - the more the buffers are filled-up, the less writes/s are accepted, however writes
still occur within the write timeout.
> 3. thanks to slowed down data ingestion, now flush can finish before all the memory gets
> Of course the details how rate limiting could be done are up for a discussion.
> It may be also worth considering putting such logic into the driver, not C* core, but
then C* needs to expose at least the following information to the driver, so we could calculate
the desired maximum data rate:
> 1. current amount of memory available for writes before they would completely block
> 2. total amount of data queued to be flushed and flush progress (amount of data to flush
remaining for the memtable currently being flushed)
> 3. average flush write speed

This message was sent by Atlassian JIRA

View raw message