cassandra-user mailing list archives

From Terje Marthinussen <>
Subject Re: balance between concurrent_[reads|writes] and feeding/reading threads i clients
Date Fri, 01 Apr 2011 12:42:10 GMT
The reason I am asking is obviously that we saw a bunch of stability issues
for a while.
We had some periods with a lot of dropped messages, but also a bunch of
dead/UP messages without drops (followed by hinted handoffs) and loads of
read repairs.

This all seems to work a lot better after increasing to 100 read/mutation
threads per node (12 node cluster).
I am not entirely sure this is a "just forget" matter. I would much prefer that
cassandra was able to somehow tell the client that "this is too fast! Please
slow down" instead of having an infinitely large queue. Queuing data rarely
gets you out of trouble (it only gets you into more), so queues should in my
opinion never get longer than what is needed to smooth out peaks in

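To make the bounded-queue idea concrete, here is a minimal sketch in Python
(not Cassandra code). The class name BoundedStage and the max_pending
parameter are made up for illustration; the point is only that a full queue
rejects new work so the caller knows to slow down, instead of growing forever.

```python
import queue


class BoundedStage:
    """A work queue that rejects new work when full instead of growing without limit."""

    def __init__(self, max_pending):
        # max_pending is a hypothetical tuning knob, not a Cassandra setting.
        self._tasks = queue.Queue(maxsize=max_pending)

    def submit(self, task):
        """Return True if the task was queued, False if the caller should back off."""
        try:
            self._tasks.put_nowait(task)
            return True
        except queue.Full:
            return False

    def take(self):
        """Remove and return the oldest pending task (a worker would call this)."""
        return self._tasks.get_nowait()

    def pending(self):
        """Number of tasks currently waiting."""
        return self._tasks.qsize()


stage = BoundedStage(max_pending=2)
results = [stage.submit(f"write-{i}") for i in range(4)]
print(results)  # [True, True, False, False]
```

A client seeing False would be expected to retry later or reduce its send
rate; how that signal would travel back over the wire is left open here.
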
It may also be sane to make some thread pool allocation or priority handling
for reads/writes coming in through internal communication to avoid triggering
hints or read repairs without good reason.

I guess I could play around with throttle limit in the roundrobin scheduler
for this.


On Tue, Mar 29, 2011 at 7:42 PM, aaron morton <> wrote:

> The concurrent_reads and concurrent_writes set the number of threads in the
> relevant thread pools. You can view the number of active and queued tasks
> using nodetool tpstats.
> The thread pool uses a blocking linked list for its work queue with a max
> size of Integer.MAX_VALUE. So its size is essentially unbounded. When
> (internode) messages are received by a node they are queued into the
> relevant thread pool for processing. When (certain) messages are executed it
> checks the send time of the message and will not process it if it is more
> than rpc_timeout (typically 10 seconds) old. This is where the "dropped
> messages" log messages come from.
> The coordinator will wait up to rpc_timeout for the CL number of nodes to
> respond. So if say one node is under severe load and cannot process the read
> in time, but the others are ok a request at QUORUM would probably succeed.
> However if a number of nodes are getting a beating the coordinator may time
> out resulting in the client getting a TimedOutException.
> For the read path it's a little more touchy. Only the "nearest" node is
> sent a request for the actual data, the others are asked for a digest of the
> data they would return. So if the "nearest" node is the one under load and
> times out the request will time out even if CL nodes returned. That's what
> the DynamicSnitch is there for, a node under load would be less likely to be
> considered the "nearest" node.
> The read and write thread pools are really just dealing with reading and
> writing data on the local machine. Your request moves through several other
> threads / thread pools: connection thread, outbound TCP pool, inbound TCP
> pool and message response pool. The SEDA paper referenced on this page was
> the model for using thread pools to manage access to resources.
> In summary, don't worry about it unless you see the thread pools backing up
> and messages being dropped.
> Hope that helps
> Aaron
> On 28 Mar 2011, at 19:55, Terje Marthinussen wrote:
> Hi,
> I was pondering how the concurrent_reads and concurrent_writes settings balance
> against the max read/write threads in clients.
> Let's say we have 3 nodes, and concurrent read/write set to 8.
> That is, 8*3=24 threads for reading and writing.
> Replication factor is 3.
> Let's say we have clients that in total set up 16 connections to each node.
> Now all the clients write at the same time. Since the replication factor is
> 3, you could get up to 16*3=48 concurrent write requests per node (which
> need to be handled by 8 threads)?
> What is the result if this load continues?
> Could you see that replication of data fails (at least initially) causing
> all kinds of fun timeouts around in the system?
> Same on the read side.
> If all clients read at the same time with Consistency level QUORUM, you get
> 16*2 read requests in the best case (and more in the worst case)?
> Could you see that one node answers, but another one times out due to lack
> of read threads, causing read repair which again further degrades?
> How does this queue up internally between thrift, gossip and the threads
> doing the actual read and writes?
> Regards,
> Terje
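
For reference, the drop check Aaron describes in the quoted message (skip a
queued message if it is more than rpc_timeout old by the time a thread gets
to it) can be sketched roughly as below. This is an illustration in Python,
not the actual Cassandra implementation; the function names and the sample
message names are made up, and it simplifies by evaluating every queued
message at a single point in time.

```python
RPC_TIMEOUT_SECONDS = 10.0  # "typically 10 seconds" per the quoted message


def is_expired(sent_at, now, rpc_timeout=RPC_TIMEOUT_SECONDS):
    """True if the message waited longer than rpc_timeout before being picked up."""
    return now - sent_at > rpc_timeout


def drain(pending, now, rpc_timeout=RPC_TIMEOUT_SECONDS):
    """Split queued (name, sent_at) pairs into those processed and those dropped."""
    processed, dropped = [], []
    for name, sent_at in pending:
        if is_expired(sent_at, now, rpc_timeout):
            dropped.append(name)  # this would show up as a "dropped message"
        else:
            processed.append(name)
    return processed, dropped


# Hypothetical queue contents: (message name, send time in seconds).
pending = [("mutation-1", 0.0), ("mutation-2", 4.0), ("read-1", 9.5)]
processed, dropped = drain(pending, now=12.0)
print(processed)  # ['mutation-2', 'read-1']
print(dropped)    # ['mutation-1']
```

With an unbounded queue, the longer the backlog gets, the more messages have
already expired by the time a thread reaches them.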
