incubator-cassandra-user mailing list archives

From Philippe <watche...@gmail.com>
Subject Re: Counter read requests spread across replicas ?
Date Thu, 22 Dec 2011 10:10:46 GMT
>
> That's a pretty high row count, bigger is not always better.
>
Yes, I've learned that! However, in my case it is better for the
throughput per thread. The whole-cluster throughput may be
lower, but in my case higher throughput per thread matters more.


> I just remembered you are using the BOP. Are the rows you are reading all
> on the same node ? Is the load evenly distributed across the cluster ? it
> sounds like a single node is getting overloaded and the others are doing
> little.
>
No, the data being read/written most often is definitely on a single replica set.
That I understand, and I know I must rebalance...
My question is: why isn't the ReadStage showing similar activity across
all three replicas?


> In your isolated experiment.
>
> Another experiment : I stopped the process that does all the reading and a
>> little of the writing. All that's left is a single-threaded process that is
>> sending counter updates as fast as it can in batches of up to 50 mutations.
>> First replica : pending counts go up into the low hundreds and back to 0,
>> active up to 3 or 5 and that's a max. Some mutation stage active & pendings
>> => the process is indeed faster at updating the counters so that doesn't
>> surprise me given that a counter write requires a read.
>> Second & third replicas : no read stage pendings at all. A
>> little RequestResponseStage as earlier.
>>
> What CL are you using ?
>
Always forget that one... using QUORUM
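
For reference, with RF=3 (as stated in the original message), QUORUM means floor(RF / 2) + 1 = 2 replicas must respond before a request is acknowledged. A minimal sketch of that arithmetic follows; `quorum_size` is a hypothetical helper written for illustration, not part of Cassandra or Hector.

```python
def quorum_size(replication_factor: int) -> int:
    """Replicas that must respond at CL QUORUM: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be >= 1")
    return replication_factor // 2 + 1


# RF=3 is the replication factor mentioned in this thread.
print(quorum_size(3))
```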


> Which thread pool is showing pending ?
>
ReadStage is the one I'm talking about above whenever I don't mention the stage
explicitly.
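
To compare the pools across replicas without eyeballing a 2s watch, the tpstats lines can be parsed and logged. The sketch below is only illustrative: it assumes the numeric columns are Active, Pending, Completed (followed by blocked counters), which matches the shape of the ReadStage line quoted further down but is an assumption about this version's output. `PoolStats` and `parse_tpstats_line` are hypothetical names.

```python
from typing import NamedTuple


class PoolStats(NamedTuple):
    name: str
    active: int
    pending: int
    completed: int


def parse_tpstats_line(line: str) -> PoolStats:
    """Parse one thread-pool row; the column order is assumed, not verified."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"unexpected tpstats line: {line!r}")
    name, active, pending, completed = fields[:4]
    return PoolStats(name, int(active), int(pending), int(completed))


# Sample taken from the ReadStage reading quoted in this thread.
sample = "ReadStage                        13      5238    10374301128         0                 0"
stats = parse_tpstats_line(sample)
print(stats.name, stats.active, stats.pending)
```

Collecting these tuples from each of the three replicas at the same instant would make the difference in Active and Pending counts easier to compare.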

Thanks




>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22/12/2011, at 11:15 AM, Philippe wrote:
>
> Along the same lines as the last experiment I did (the cluster is only being
> updated by a single-threaded batching process):
> All nodes are the same hardware & configuration. Why on earth would one
> node require disk IO and not the other 2 replicas ?
>
> Primary replica shows some disk activity (iostat shows about 40%)
> ----total-cpu-usage---- -dsk/total-
> usr sys idl wai hiq siq| read  writ
>  67  10  19   2   0   3|4244k  364k
>
> whereas the 2nd & 3rd replicas do not
> 2nd
> ----total-cpu-usage---- -dsk/total-
> usr sys idl wai hiq siq| read  writ
>  42  13  41   0   0   3|   0     0
>  47  15  34   0   0   4|4096B  185k
>  49  14  35   0   0   3|   0  8192B
>  47  16  33   0   0   4|   0  4096B
>  44  13  41   0   0   3| 284k  112k
>
> 3rd
>  11   2  87   1   0   0|   0   136k
>   0   0  99   0   0   0|   0     0
>   9   1  90   0   0   0|4096B  128k
>   2   2  96   0   0   0|   0     0
>   0   0  99   0   0   0|   0     0
>  11   1  87   0   0   0|   0   128k
>
>
> Philippe
> 2011/12/21 Philippe <watcherfr@gmail.com>
>
>> Hi Aaron,
>>
>> >How many rows are you asking for in the multiget_slice and what thread
>> pools are showing pending tasks ?
>> I am querying in batches of 256 keys max. Each batch may slice between 1
>> and 5 explicit super columns (I need all the columns in each super column,
>> there are at the very most a couple dozen columns per SC).
>>
>> On the first replica, only ReadStage ever shows any pending. All the
>> others have 1 to 10 pending from time to time only. Here's a typical "high
>> pending count" reading on the first replica for the data hotspot.
>> ReadStage                        13      5238    10374301128         0                 0
>> I've got a watch running every two seconds and I see the numbers vary
>> every time, going from that high point to 0 active, 0 pending. The one thing
>> I've noticed is that I hardly ever see the Active count stay up at the
>> current 2s sampling rate.
>> On the 2 other replicas, I hardly ever see any pendings on ReadStage and
>> Active hardly goes up to 1 or 2. But I do see a little PENDING
>> on RequestResponseStage, goes up in the tens or hundreds from time to time.
>>
>>
>> If I'm flooding that one replica, shouldn't the ReadStage Active count be
>> at maximum capacity ?
>>
>>
>> I've already thought of CASSANDRA-2980 but I'm running 0.8.7 and 0.8.9.
>>
>> Also, what happens when you reduce the number of rows in the request?
>>>
>> I've reduced the requests to batches of 16. I've had to increase the
>> number of threads from 30 to 90 in order to get the same key throughput
>> because the throughput I measure goes down drastically on a per-thread
>> basis.
>> What I see :
>>  - CPU utilization is lower on the first replica (why would that be if
>> the batches are smaller ?)
>>  - Pending ReadStage on first replica seems to be staying higher longer.
>> Still goes down to 0 regularly.
>>  - lowering to 60 client threads, I see non-zero active MutationStage and
>> ReplicateOnWriteStage more often
>> For our use-case, the higher the throughput per client thread, the less
>> rework will be done in our processing.
>>
>> Another experiment : I stopped the process that does all the reading and
>> a little of the writing. All that's left is a single-threaded process that is
>> sending counter updates as fast as it can in batches of up to 50 mutations.
>> First replica : pending counts go up into the low hundreds and back to 0,
>> active up to 3 or 5 and that's a max. Some mutation stage active & pendings
>> => the process is indeed faster at updating the counters so that doesn't
>> surprise me given that a counter write requires a read.
>> Second & third replicas : no read stage pendings at all. A
>> little RequestResponseStage as earlier.
>>
>> Cheers
>> Philippe
>>
>>>
>>> Cheers
>>>
>>>   -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 21/12/2011, at 11:57 AM, Philippe wrote:
>>>
>>> Hello,
>>> 5 nodes running 0.8.7/0.8.9, RF=3, BOP, counter columns inside super
>>> columns. Read queries are multigetslices of super columns inside of which I
>>> read every column for processing (20-30 at most), using Hector with default
>>> settings.
>>> Watching tpstats on the 3 nodes holding the data most often
>>> queried, I see the pending count increase only on the "main replica" and I
>>> see heavy CPU load and network load only on that node. The other nodes seem
>>> to be doing very little.
>>>
>>> Aren't counter read requests supposed to be round-robin across replicas
>>> ? I'm confused as to why the nodes don't exhibit the same load.
>>>
>>> Thanks
>>>
>>>
>>>
>>
>
>
