incubator-cassandra-user mailing list archives

From srmore <comom...@gmail.com>
Subject Re: Write performance with 1.2.12
Date Thu, 12 Dec 2013 02:39:10 GMT
Thanks Aaron


On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton <aaron@thelastpickle.com> wrote:

> Changed memtable_total_space_in_mb to 1024 still no luck.
>
> Reducing memtable_total_space_in_mb will increase the frequency of
> flushing to disk, which will create more work for compaction to do and
> result in increased IO.
>
> You should return it to the default.
>

You are right, I had to revert it back to the default.
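For reference, here is a minimal cassandra.yaml sketch of the setting in question, assuming stock 1.2 behavior: leaving memtable_total_space_in_mb unset (commented out) makes Cassandra use one third of the JVM heap, roughly 2.7 GB with an 8 GB heap.

```yaml
# cassandra.yaml (1.2.x) sketch; the 2048 value is illustrative only.
# Leaving this commented out uses the default: one third of the heap
# (~2730 MB for an 8 GB heap).
# memtable_total_space_in_mb: 2048
```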


>
> when I send traffic to one node its performance is 2x more than when I
> send traffic to all the nodes.
>
>
>
> What are you measuring, request latency or local read/write latency ?
>
> If it’s write latency it’s probably GC; if it’s read latency it’s
> probably IO or the data model.
>

It is the write latency; read latency is OK. Interestingly, the latency is
low when there is only one node. When I join the other nodes the latency
degrades by about 1/3. To be specific, when I start sending traffic to the
other nodes the latency for all the nodes increases; if I stop traffic to
the other nodes the latency drops again. I checked, and this is not node
specific: it happens on any node.

I don't see any GC activity in the logs. I tried to control compaction by
reducing the number of threads, but it did not help much.


> Hope that helps.
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 7/12/2013, at 8:05 am, srmore <comomore@gmail.com> wrote:
>
> Changed memtable_total_space_in_mb to 1024 still no luck.
>
>
> On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>
>> Can you set the memtable_total_space_in_mb value? It defaults to 1/3 of
>> the heap, which is 8/3 ≈ 2.6 GB in your case.
>>
>> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>>
>> Flushing 2.6 GB to disk might slow performance if it happens
>> frequently; maybe you have lots of write operations going on.
>>
>>
>>
>> On Fri, Dec 6, 2013 at 10:06 PM, srmore <comomore@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>>>
>>>> You have passed the JVM configuration, not the cassandra
>>>> configuration, which is in cassandra.yaml.
>>>>
>>>
>>> Apologies, I was tuning the JVM and that's what was on my mind.
>>> Here are the cassandra settings http://pastebin.com/uN42GgYT
>>>
>>>
>>>
>>>> The spikes are not that significant in our case and we are running the
>>>> cluster with 1.7 gb heap.
>>>>
>>>> Are these spikes causing any issue at your end?
>>>>
>>>
>>> There are no big spikes; the overall performance seems to be about 40%
>>> lower.
>>>
>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Dec 6, 2013 at 9:10 PM, srmore <comomore@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>>>>>
>>>>>> Hard to say much without knowing about the cassandra configurations.
>>>>>>
>>>>>
>>>>> The cassandra configuration is
>>>>> -Xms8G
>>>>> -Xmx8G
>>>>> -Xmn800m
>>>>> -XX:+UseParNewGC
>>>>> -XX:+UseConcMarkSweepGC
>>>>> -XX:+CMSParallelRemarkEnabled
>>>>> -XX:SurvivorRatio=4
>>>>> -XX:MaxTenuringThreshold=2
>>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>>> -XX:+UseCMSInitiatingOccupancyOnly
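One way to confirm or rule out GC as the cause of the spikes would be to add HotSpot's GC logging flags alongside the settings above. A sketch; the log path is an assumption, not taken from the thread:

```
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-Xloggc:/var/log/cassandra/gc.log
```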
>>>>>
>>>>>
>>>>>
>>>>>> Yes, compactions/GCs could spike the CPU; I had similar behavior
>>>>>> with my setup.
>>>>>>
>>>>>
>>>>> Were you able to get around it ?
>>>>>
>>>>>
>>>>>>
>>>>>> -VK
>>>>>>
>>>>>>
>>>>>> On Fri, Dec 6, 2013 at 7:40 PM, srmore <comomore@gmail.com> wrote:
>>>>>>
>>>>>>> We have a 3 node cluster running cassandra 1.2.12. They are pretty
>>>>>>> big machines, 64G RAM with 16 cores, and the cassandra heap is 8G.
>>>>>>>
>>>>>>> The interesting observation is that when I send traffic to one node
>>>>>>> its performance is 2x better than when I send traffic to all the
>>>>>>> nodes. We ran 1.0.11 on the same box and observed a slight dip, but
>>>>>>> not half as much as seen with 1.2.12. In both cases we were writing
>>>>>>> with LOCAL_QUORUM. Changing the CL to ONE makes a slight improvement,
>>>>>>> but not much.
>>>>>>>
>>>>>>> The read_repair_chance is 0.1. We see some compactions running.
>>>>>>>
>>>>>>> Following is my iostat -x output; sda is the SSD (for the commit
>>>>>>> log) and sdb is the spinner.
>>>>>>>
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>           66.46    0.00    8.95    0.01    0.00   24.58
>>>>>>>
>>>>>>> Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
>>>>>>> sda        0.00   27.60  0.00   4.40    0.00  256.00    58.18     0.01   2.55   1.32   0.58
>>>>>>> sda1       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>>>> sda2       0.00   27.60  0.00   4.40    0.00  256.00    58.18     0.01   2.55   1.32   0.58
>>>>>>> sdb        0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>>>> sdb1       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>>>> dm-0       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>>>> dm-1       0.00    0.00  0.00   0.60    0.00    4.80     8.00     0.00   5.33   2.67   0.16
>>>>>>> dm-2       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>>>> dm-3       0.00    0.00  0.00  24.80    0.00  198.40     8.00     0.24   9.80   0.13   0.32
>>>>>>> dm-4       0.00    0.00  0.00   6.60    0.00   52.80     8.00     0.01   1.36   0.55   0.36
>>>>>>> dm-5       0.00    0.00  0.00   0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
>>>>>>> dm-6       0.00    0.00  0.00  24.80    0.00  198.40     8.00     0.29  11.60   0.13   0.32
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I can see I am CPU bound here but couldn't figure out exactly what
>>>>>>> is causing it. Is this caused by GC or compaction? I am thinking it
>>>>>>> is compaction; I see a lot of context switches and interrupts in my
>>>>>>> vmstat output.
>>>>>>>
>>>>>>> I don't see GC activity in the logs but do see some compaction
>>>>>>> activity. Has anyone seen this, or does anyone know what can be
>>>>>>> done to free up the CPU?
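One way to attribute the CPU between GC and compaction is per-thread sampling against the live JVM. A sketch, assuming a Linux box with sysstat installed and a JDK on the PATH; the process-name match is an assumption about the standard daemon class:

```shell
# Find the Cassandra JVM pid (assumes the standard CassandraDaemon class name).
PID=$(pgrep -f CassandraDaemon)

# Per-thread CPU usage; busy thread IDs can be matched against thread names
# in a jstack dump (decimal TID here corresponds to hex nid=0x... in jstack).
pidstat -t -p "$PID" 1 5

# GC activity sampled every second; rising FGC/GCT columns would implicate GC.
jstat -gcutil "$PID" 1000 10
```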
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sandeep
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
