incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Write performance with 1.2.12
Date Thu, 12 Dec 2013 04:49:25 GMT
> It is the write latency; the read latency is ok. Interestingly, the latency is low when there is one node. When I join other nodes the latency drops by about 1/3. To be specific, when I start sending traffic to the other nodes the latency for all the nodes increases; if I stop traffic to the other nodes the latency drops again. I checked, and this is not node specific: it happens to any node.
Is this the local write latency or the cluster-wide write request latency?

What sort of numbers are you seeing?
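One way to tell the two apart (a sketch; these nodetool subcommands exist in 1.2) is to compare coordinator-level and local latencies:

```
# Cluster-wide (coordinator) read/write request latency on this node:
nodetool proxyhistograms

# Local read/write latency, per column family:
nodetool cfstats
nodetool cfhistograms <keyspace> <column_family>
```

If the request latency climbs while the local write latency stays flat, the extra time is being spent in coordination (network, waiting on replicas) rather than in the local write path.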

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/12/2013, at 3:39 pm, srmore <comomore@gmail.com> wrote:

> Thanks Aaron
> 
> 
> On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton <aaron@thelastpickle.com> wrote:
>> Changed memtable_total_space_in_mb to 1024 still no luck.
> 
> Reducing memtable_total_space_in_mb will increase the frequency of flushing to disk, which will create more work for compaction and result in increased IO.
> 
> You should return it to the default.
> 
> You are right, I had to revert it back to the default.
>  
> 
>> when I send traffic to one node its performance is 2x more than when I send traffic
to all the nodes.
>>  
> What are you measuring, request latency or local read/write latency?
> 
> If it’s write latency it’s probably GC; if it’s read latency, it’s probably IO or the data model.
> 
> It is the write latency; the read latency is ok. Interestingly, the latency is low when there is one node. When I join other nodes the latency drops by about 1/3. To be specific, when I start sending traffic to the other nodes the latency for all the nodes increases; if I stop traffic to the other nodes the latency drops again. I checked, and this is not node specific: it happens to any node.
> 
> I don't see any GC activity in the logs. I tried to control compaction by reducing the number of threads, but it did not help much.
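Besides the thread count, compaction can also be throttled by throughput; a sketch of both knobs as they exist in 1.2 (the values are illustrative, not recommendations):

```
# Throttle compaction I/O at runtime (MB/s; 0 disables throttling):
nodetool setcompactionthroughput 8

# Or persistently in cassandra.yaml:
#   compaction_throughput_mb_per_sec: 8
#   concurrent_compactors: 2
```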
> 
> 
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
> 
> On 7/12/2013, at 8:05 am, srmore <comomore@gmail.com> wrote:
> 
>> Changed memtable_total_space_in_mb to 1024 still no luck.
>> 
>> 
>> On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>> Can you set the memtable_total_space_in_mb value? It defaults to 1/3 of the heap, which is 8/3 ≈ 2.6 GB here.
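As a quick check of that arithmetic (a sketch; the 8 GB heap is an assumption matching the -Xmx8G setting mentioned later in this thread, and the 1.x default is one third of the heap):

```python
# Cassandra 1.x defaults memtable_total_space_in_mb to 1/3 of the JVM heap.
# heap_mb = 8192 is an assumption matching the -Xmx8G flag in this thread.

def default_memtable_space_mb(heap_mb):
    """Default memtable space: one third of the heap, in MB."""
    return heap_mb // 3

print(default_memtable_space_mb(8192))  # 2730 MB, i.e. roughly 2.6-2.7 GB
```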
>> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>> 
>> Flushing 2.6 GB to disk might slow performance if it happens frequently; maybe you have lots of write operations going on.
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 10:06 PM, srmore <comomore@gmail.com> wrote:
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>> You have passed the JVM configurations and not the cassandra configurations, which are in cassandra.yaml.
>> 
>> Apologies, I was tuning the JVM and that's what was on my mind.
>> Here are the cassandra settings http://pastebin.com/uN42GgYT
>> 
>>  
>> The spikes are not that significant in our case, and we are running the cluster with a 1.7 GB heap.
>> 
>> Are these spikes causing any issue at your end?
>> 
>> There are no big spikes; the overall performance seems to be about 40% lower.
>>  
>> 
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 9:10 PM, srmore <comomore@gmail.com> wrote:
>> 
>> 
>> 
>> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak <vicky.kak@gmail.com> wrote:
>> Hard to say much without knowing about the cassandra configurations.
>>  
>> The cassandra configuration is 
>> -Xms8G
>> -Xmx8G
>> -Xmn800m
>> -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC
>> -XX:+CMSParallelRemarkEnabled
>> -XX:SurvivorRatio=4
>> -XX:MaxTenuringThreshold=2
>> -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly
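One way to confirm or rule out GC pauses (a sketch; these are standard HotSpot options, not settings from this thread) is to enable GC logging alongside the flags above, e.g. in cassandra-env.sh:

```
-Xloggc:/var/log/cassandra/gc.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
```

Long "Total time for which application threads were stopped" entries in that log would point at GC rather than compaction.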
>> 
>>  
>> Yes, compactions/GCs could spike the CPU; I had similar behavior with my setup.
>> 
>> Were you able to get around it ?
>>  
>> 
>> -VK
>> 
>> 
>> On Fri, Dec 6, 2013 at 7:40 PM, srmore <comomore@gmail.com> wrote:
>> We have a 3 node cluster running cassandra 1.2.12; they are pretty big machines, 64 GB RAM with 16 cores, and the cassandra heap is 8 GB.
>> 
>> The interesting observation is that when I send traffic to one node, its performance is 2x more than when I send traffic to all the nodes. We ran 1.0.11 on the same box and observed a slight dip, but not half as seen with 1.2.12. In both cases we were writing with LOCAL_QUORUM. Changing CL to ONE makes a slight improvement, but not much.
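For context on the LOCAL_QUORUM vs ONE comparison: quorum requires a majority of replicas to ack each write, so the coordinator waits on more nodes per request. A minimal sketch of the replica-count arithmetic (RF=3 is an assumption; the thread does not state the replication factor):

```python
# Replica acknowledgements a coordinator waits for, per consistency level.

def quorum_acks(rf):
    """(LOCAL_)QUORUM waits for a majority of the rf replicas."""
    return rf // 2 + 1

def one_acks(rf):
    """CL=ONE waits for a single replica, regardless of rf."""
    return 1

rf = 3  # assumed replication factor
print(quorum_acks(rf))  # 2
print(one_acks(rf))     # 1
```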
>> 
>> The read_repair_chance is 0.1. We see some compactions running.
>> 
>> Following is my iostat -x output; sda is the SSD (for the commit log) and sdb is the spinner.
>> 
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>           66.46    0.00    8.95    0.01    0.00   24.58
>> 
>> Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>> sda               0.00    27.60  0.00  4.40     0.00   256.00    58.18     0.01    2.55   1.32   0.58
>> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sda2              0.00    27.60  0.00  4.40     0.00   256.00    58.18     0.01    2.55   1.32   0.58
>> sdb               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sdb1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> dm-0              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> dm-1              0.00     0.00  0.00  0.60     0.00     4.80     8.00     0.00    5.33   2.67   0.16
>> dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> dm-3              0.00     0.00  0.00 24.80     0.00   198.40     8.00     0.24    9.80   0.13   0.32
>> dm-4              0.00     0.00  0.00  6.60     0.00    52.80     8.00     0.01    1.36   0.55   0.36
>> dm-5              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> dm-6              0.00     0.00  0.00 24.80     0.00   198.40     8.00     0.29   11.60   0.13   0.32
>> 
>> 
>> 
>> I can see I am CPU bound here but couldn't figure out exactly what is causing it. Is it GC or compaction? I am thinking it is compaction; I see a lot of context switches and interrupts in my vmstat output.
>> 
>> I don't see GC activity in the logs but do see some compaction activity. Has anyone seen this, or does anyone know what can be done to free up the CPU?
>> 
>> Thanks,
>> Sandeep

