The reason I am asking is, obviously, that we saw a bunch of stability issues for a while.
The concurrent_reads and concurrent_writes settings set the number of threads in the relevant thread pools. You can view the number of active and queued tasks using nodetool tpstats. The thread pool uses a blocking linked list for its work queue with a max size of Integer.MAX_VALUE, so its size is essentially unbounded.

When (internode) messages are received by a node they are queued into the relevant thread pool for processing. When (certain) messages are executed, the send time of the message is checked, and the message will not be processed if it is more than rpc_timeout (typically 10 seconds) old. This is where the "dropped messages" log lines come from (see the first sketch below).

The coordinator will wait up to rpc_timeout for the CL number of nodes to respond. So if, say, one node is under severe load and cannot process the read in time, but the others are OK, a request at QUORUM would probably succeed. However, if a number of nodes are getting a beating, the coordinator may time out, resulting in the client getting a TimedOutException (see the second sketch below).

The read path is a little more touchy. Only the "nearest" node is sent a request for the actual data; the others are asked for a digest of the data they would return. So if the "nearest" node is the one under load and times out, the request will time out even if CL nodes returned. That's what the DynamicSnitch is there for: a node under load is less likely to be considered the "nearest" node.

The read and write thread pools are really just dealing with reading and writing data on the local machine. Your request moves through several other threads / thread pools: connection thread, outbound TCP pool, inbound TCP pool and message response pool. The SEDA paper referenced on this page was the model for using thread pools to manage access to resources: http://wiki.apache.org/cassandra/ArchitectureInternals

In summary, don't worry about it unless you see the thread pools backing up and messages being dropped.

Hope that helps
Aaron
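First sketch: to make the drop-if-stale behaviour concrete, here is a minimal, self-contained Java illustration of the pattern described above: a fixed pool fed by an effectively unbounded LinkedBlockingQueue, whose tasks check how long they sat in the queue and drop themselves if older than rpc_timeout. The class name and constants are illustrative assumptions, not Cassandra's actual internals.

import java.util.concurrent.*;

public class DroppingStageSketch {
    static final long RPC_TIMEOUT_MS = 10_000; // hypothetical rpc_timeout (10 seconds)

    public static void main(String[] args) {
        // concurrent_reads / concurrent_writes would set this pool size (8 here).
        ExecutorService stage = new ThreadPoolExecutor(
                8, 8, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>()); // capacity defaults to Integer.MAX_VALUE

        final long enqueuedAt = System.currentTimeMillis();
        stage.execute(() -> {
            long ageMs = System.currentTimeMillis() - enqueuedAt;
            if (ageMs > RPC_TIMEOUT_MS) {
                // This is the moment a "dropped messages" line would be logged.
                System.out.println("dropping message, " + ageMs + "ms old");
                return;
            }
            System.out.println("processing message (" + ageMs + "ms in queue)");
        });
        stage.shutdown();
    }
}

The point of the staleness check is that under sustained overload the unbounded queue absorbs work indefinitely, so dropping messages that are already older than the client's timeout is the main back-pressure the stage applies.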
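Second sketch: the coordinator-side wait. The coordinator blocks until CL responses arrive or rpc_timeout elapses, then fails in the way the client sees as a TimedOutException. Again, a sketch with illustrative names, not Cassandra's real API.

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class QuorumWaitSketch {
    public static void main(String[] args) throws Exception {
        // RF = 3 at QUORUM: the coordinator only needs 2 responses,
        // so one slow replica is tolerated.
        CountDownLatch responses = new CountDownLatch(2);
        long rpcTimeoutMs = 10_000; // hypothetical rpc_timeout

        responses.countDown(); // replica 1 answers in time
        responses.countDown(); // replica 2 answers in time; the slow third is ignored

        if (responses.await(rpcTimeoutMs, TimeUnit.MILLISECONDS)) {
            System.out.println("quorum reached, request succeeds");
        } else {
            // When too many replicas are overloaded, this is where
            // the client gets its timeout.
            throw new TimeoutException("CL nodes did not respond within rpc_timeout");
        }
    }
}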
On 28 Mar 2011, at 19:55, Terje Marthinussen wrote:

Hi,

I was pondering how the concurrent_read and concurrent_write settings balance against the maximum number of read/write threads in clients.

Let's say we have 3 nodes, and concurrent reads/writes set to 8. That is, 8*3 = 24 threads for reading and writing. Replication factor is 3.

Let's say we have clients that in total set up 16 connections to each node. Now all the clients write at the same time. Since the replication factor is 3, you could get up to 16*3 = 48 concurrent write requests per node (which need to be handled by 8 threads)?

What is the result if this load continues? Could you see that replication of data fails (at least initially), causing all kinds of fun timeouts around the system?

Same on the read side. If all clients read at the same time with consistency level QUORUM, you get 16*2 read requests in the best case (and more in the worst case)? Could you see that one node answers, but another one times out due to a lack of read threads, causing read repair which again further degrades things?

How does this queue up internally between Thrift, gossip and the threads doing the actual reads and writes?

Regards,
Terje
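To make the arithmetic in the question concrete, here is a back-of-envelope sketch using the hypothetical numbers above (3 nodes, 16 connections per node, RF 3, concurrent_writes 8); these are the question's numbers, not defaults.

public class LoadEstimateSketch {
    public static void main(String[] args) {
        int nodes = 3;
        int connectionsPerNode = 16;
        int concurrentWrites = 8; // concurrent_writes per node

        // With RF = 3 on a 3-node cluster, every write is replicated to every
        // node, so a node can receive writes from all client connections
        // cluster-wide at once.
        int peakWritesPerNode = connectionsPerNode * nodes;         // 16 * 3 = 48
        int queuedBehindPool = peakWritesPerNode - concurrentWrites; // 48 - 8 = 40

        System.out.println("peak concurrent writes per node: " + peakWritesPerNode);
        System.out.println("queued behind " + concurrentWrites
                + " write threads: " + queuedBehindPool);
    }
}

As Aaron's reply explains, the 40 queued requests are not an error by themselves; they only become dropped messages or client timeouts once they sit in the queue longer than rpc_timeout.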