From Philippe <>
Subject Re: Counter read requests spread across replicas ?
Date Wed, 21 Dec 2011 21:49:55 GMT
Hi Aaron,

> How many rows are you asking for in the multiget_slice and what thread
> pools are showing pending tasks ?
I am querying in batches of 256 keys max. Each batch may slice between 1
and 5 explicit super columns (I need all the columns in each super column,
there are at the very most a couple dozen columns per SC).

On the first replica, only ReadStage ever shows any pending. All the other
pools have 1 to 10 pending from time to time only. Here's a typical "high
pending count" reading on the first replica for the data hotspot.
ReadStage                        13      5238    10374301128         0
I've got a watch running every two seconds and I see the numbers vary every
time going from that high point to 0 active, 0 pending. The one thing I've
noticed is that I hardly ever see the Active count stay up at the current
2s sampling rate.
On the 2 other replicas, I hardly ever see any pendings on ReadStage and
Active hardly goes up to 1 or 2. But I do see a little PENDING
on RequestResponseStage, goes up in the tens or hundreds from time to time.
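
For reference, a minimal sketch of how a reading like the ReadStage line
above could be parsed when sampling tpstats from a script. It assumes the
four numeric columns are Active, Pending, Completed and Blocked, in that
order; PoolStats and parse_tpstats_line are hypothetical names, not part
of any Cassandra tool.

```python
from typing import NamedTuple


class PoolStats(NamedTuple):
    """One thread-pool row; the column order is an assumption."""
    name: str
    active: int
    pending: int
    completed: int
    blocked: int


def parse_tpstats_line(line: str) -> PoolStats:
    """Split one whitespace-separated row into a pool name and four counters."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected a name and 4 counters, got: {line!r}")
    name, *counters = parts
    active, pending, completed, blocked = (int(c) for c in counters)
    return PoolStats(name, active, pending, completed, blocked)


# The "high pending count" reading quoted above.
sample = parse_tpstats_line(
    "ReadStage                        13      5238    10374301128         0"
)
print(sample.name, "active:", sample.active, "pending:", sample.pending)
```

Logging these values on every sample, rather than eyeballing a 2s watch,
would make it easier to see how long Active actually stays above 0.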

If I'm flooding that one replica, shouldn't the ReadStage Active count be
at maximum capacity ?

I've already thought of CASSANDRA-2980 but I'm running 0.8.7 and 0.8.9.

> Also, what happens when you reduce the number of rows in the request?
I've reduced the requests to batches of 16. I've had to increase the
number of threads from 30 to 90 in order to get the same key throughput
because the throughput I measure goes down drastically on a per-thread basis.
What I see :
 - CPU utilization is lower on the first replica (why would that be if the
batches are smaller ?)
 - Pending ReadStage on first replica seems to be staying higher longer.
Still goes down to 0 regularly.
 - lowering to 60 client threads, I see non-zero active MutationStage and
ReplicateOnWriteStage more often
For our use-case, the higher the throughput per client thread, the less
rework will be done in our processing.
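
To make the batch-size experiment concrete, here is a minimal sketch of
splitting a key list into fixed-size batches (256 before, 16 now). The
helper name batch_keys and the sample keys are hypothetical; the actual
client uses Hector, which is not shown here.

```python
from typing import Iterator, List, Sequence


def batch_keys(keys: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size keys each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(keys), batch_size):
        yield list(keys[start:start + batch_size])


# Hypothetical row keys, for illustration only.
keys = [f"key-{i}" for i in range(600)]

large_batches = list(batch_keys(keys, 256))  # original batch size
small_batches = list(batch_keys(keys, 16))   # reduced batch size

print(len(large_batches), "requests at 256 keys per batch")
print(len(small_batches), "requests at 16 keys per batch")
```

Each batch would become one multiget request, so a smaller batch size
means many more round trips for the same number of keys.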

Another experiment : I stopped the process that does all the reading and a
little of the writing. All that's left is a single-threaded process that is
sending counter updates as fast as it can in batches of up to 50 mutations.
First replica : pending counts go up into the low hundreds and back to 0,
active up to 3 or 5 and that's a max. Some mutation stage active & pendings
=> the process is indeed faster at updating the counters so that doesn't
surprise me given that a counter write requires a read.
Second & third replicas : no read stage pendings at all. A
little RequestResponseStage as earlier.


> Cheers
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> On 21/12/2011, at 11:57 AM, Philippe wrote:
> Hello,
> 5 nodes running 0.8.7/0.8.9, RF=3, BOP, counter columns inside super
> columns. Read queries are multigetslices of super columns inside of which I
> read every column for processing (20-30 at most), using Hector with default
> settings.
> Watching tpstats on the 3 nodes holding the data most often queried,
> I see the pending count increase only on the "main replica" and I see heavy
> CPU load and network load only on that node. The other nodes seem to be
> doing very little.
> Aren't counter read requests supposed to be round-robin across replicas ?
> I'm confused as to why the nodes don't exhibit the same load.
> Thanks

