incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Counter read requests spread across replicas ?
Date Thu, 22 Dec 2011 09:20:16 GMT
> I am querying in batches of 256 keys max. Each batch may slice between 1 and 5 explicit super columns (I need all the columns in each super column; there are at the very most a couple dozen columns per SC).
That's a pretty high row count; bigger is not always better.

I just remembered you are using the BOP. Are the rows you are reading all on the same node? Is the load evenly distributed across the cluster? It sounds like a single node is getting overloaded while the others are doing little.

Regarding your isolated experiment:
> Another experiment: I stopped the process that does all the reading and a little of the writing. All that's left is a single-threaded process that sends counter updates as fast as it can in batches of up to 50 mutations.
> First replica: pending counts go up into the low hundreds and back to 0; active goes up to 3 or 5 at most. Some MutationStage active & pending => the process is indeed faster at updating the counters, which doesn't surprise me given that a counter write requires a read.
> Second & third replicas: no ReadStage pendings at all. A little RequestResponseStage as earlier.


What CL are you using?
Which thread pool is showing pending?
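If it helps, the same numbers nodetool tpstats prints can be polled over JMX; a minimal sketch, assuming the 0.8-era MBean name and default JMX port:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class WatchReadStage {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://host1:7199/jmxrmi");
            JMXConnector jmxc = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName readStage = new ObjectName("org.apache.cassandra.request:type=ReadStage");
            while (true) {
                System.out.println("active=" + mbs.getAttribute(readStage, "ActiveCount")
                        + " pending=" + mbs.getAttribute(readStage, "PendingTasks"));
                Thread.sleep(2000); // same 2s cadence as your watch
            }
        }
    }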

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/12/2011, at 11:15 AM, Philippe wrote:

> Along the same lines as the last experiment I did (the cluster is only being updated by a single-threaded batching process):
> All nodes have the same hardware & configuration. Why on earth would one node require disk IO and not the other 2 replicas?
> 
> The primary replica shows some disk activity (iostat shows about 40%):
> ----total-cpu-usage---- -dsk/total- 
> usr sys idl wai hiq siq| read  writ
> 67  10  19   2   0   3|4244k  364k|
> 
> whereas the 2nd & 3rd replicas do not:
> ----total-cpu-usage---- -dsk/total- 
> usr sys idl wai hiq siq| read  writ
> 42  13  41   0   0   3|   0     0 |
>  47  15  34   0   0   4|4096B  185k
>  49  14  35   0   0   3|   0  8192B
>  47  16  33   0   0   4|   0  4096B
>  44  13  41   0   0   3| 284k  112k
> 
> 3rd replica:
> 11   2  87   1   0   0|   0   136k|
>   0   0  99   0   0   0|   0     0 
>   9   1  90   0   0   0|4096B  128k
>   2   2  96   0   0   0|   0     0 
>   0   0  99   0   0   0|   0     0 
>  11   1  87   0   0   0|   0   128k
> 
> 
> Philippe
> 2011/12/21 Philippe <watcherfr@gmail.com>
> Hi Aaron,
> 
> > How many rows are you asking for in the multiget_slice and what thread pools are showing pending tasks?
> I am querying in batches of 256 keys max. Each batch may slice between 1 and 5 explicit super columns (I need all the columns in each super column; there are at the very most a couple dozen columns per SC).
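> Roughly, the query looks like this (simplified; CF and key names changed, and the Hector counter query type names are from memory, so treat them as approximate):

    import java.util.List;
    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.beans.CounterSuperRows;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.MultigetSuperSliceCounterQuery;
    import me.prettyprint.hector.api.query.QueryResult;

    public class BatchRead {
        static QueryResult<CounterSuperRows<String, String, String>> read(
                Keyspace ks, List<String> keys) { // up to 256 keys per call
            MultigetSuperSliceCounterQuery<String, String, String> q =
                    HFactory.createMultigetSuperSliceCounterQuery(ks,
                            StringSerializer.get(), StringSerializer.get(),
                            StringSerializer.get());
            q.setColumnFamily("CounterCF");
            q.setKeys(keys.toArray(new String[0]));
            q.setColumnNames("sc1", "sc2"); // 1-5 explicit super columns per batch
            return q.execute();
        }
    }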
> 
> On the first replica, only ReadStage ever shows any pending. All the others have 1 to 10 pending from time to time only. Here's a typical "high pending count" reading on the first replica for the data hotspot:
> ReadStage                        13      5238    10374301128         0              0
> I've got a watch running every two seconds and I see the numbers vary every time, going from that high point to 0 active, 0 pending. The one thing I've noticed is that I hardly ever see the Active count stay up at the current 2s sampling rate.
> On the 2 other replicas I hardly ever see any pendings on ReadStage, and Active rarely goes above 1 or 2. But I do see a little PENDING on RequestResponseStage; it goes up into the tens or hundreds from time to time.
> 
> 
> If I'm flooding that one replica, shouldn't the ReadStage Active count be at maximum capacity?
> 
> 
> I've already thought of CASSANDRA-2980 but I'm running 0.8.7 and 0.8.9.
> 
> > Also, what happens when you reduce the number of rows in the request?
> I've reduced the requests to batches of 16, and I've had to increase the number of threads from 30 to 90 in order to get the same key throughput, because the throughput I measure drops drastically on a per-thread basis (roughly the change sketched below).
> What I see:
>  - CPU utilization is lower on the first replica (why would that be if the batches are smaller?)
>  - Pending ReadStage on the first replica seems to stay higher for longer, but still drops to 0 regularly.
>  - Lowering to 60 client threads, I see non-zero active MutationStage and ReplicateOnWriteStage more often.
> For our use case, the higher the throughput per client thread, the less rework will be done in our processing.
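> The client-side change is basically this shape (readBatch stands in for the multiget call above; Guava's Lists.partition does the splitting):

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import com.google.common.collect.Lists;

    public class BatchDriver {
        // Hypothetical helper wrapping the multiget slice call.
        static void readBatch(List<String> keys) { /* ... */ }

        static void run(List<String> allKeys) {
            ExecutorService pool = Executors.newFixedThreadPool(90); // was 30 threads
            for (final List<String> batch : Lists.partition(allKeys, 16)) { // was 256 keys
                pool.submit(new Runnable() {
                    public void run() { readBatch(batch); }
                });
            }
            pool.shutdown();
        }
    }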
> 
> Another experiment: I stopped the process that does all the reading and a little of the writing. All that's left is a single-threaded process that sends counter updates as fast as it can in batches of up to 50 mutations.
> First replica: pending counts go up into the low hundreds and back to 0; active goes up to 3 or 5 at most. Some MutationStage active & pending => the process is indeed faster at updating the counters, which doesn't surprise me given that a counter write requires a read.
> Second & third replicas: no ReadStage pendings at all. A little RequestResponseStage as earlier.
> 
> Cheers
> Philippe 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 21/12/2011, at 11:57 AM, Philippe wrote:
> 
>> Hello,
>> 5 nodes running 0.8.7/0.8.9, RF=3, BOP, counter columns inside super columns. Read
queries are multigetslices of super columns inside of which I read every column for processing
(20-30 at most), using Hector with default settings.
>> Watching tpstat on the 3 nodes holding the data being most often queries, I see the
pending count increase only on the "main replica" and I see heavy CPU load and network load
only on that node. The other nodes seem to be doing very little.
>> 
>> Aren't counter read requests supposed to be round-robin across replicas ? I'm confused
as to why the nodes don't exhibit the same load.
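>> For reference, the client is set up with Hector defaults, along these lines (host names changed):

    import me.prettyprint.cassandra.service.CassandraHostConfigurator;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;

    public class ClientSetup {
        public static void main(String[] args) {
            // Default load balancing across the listed hosts; which node ends up
            // serving a read depends on the snitch's proximity scoring, not on
            // round-robin (as far as I can tell).
            CassandraHostConfigurator conf =
                    new CassandraHostConfigurator("host1:9160,host2:9160,host3:9160");
            Cluster cluster = HFactory.getOrCreateCluster("test-cluster", conf);
            Keyspace ks = HFactory.createKeyspace("MyKeyspace", cluster);
        }
    }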
>> 
>> Thanks