incubator-cassandra-user mailing list archives

From Adeel Akbar <adeel.ak...@panasiangroup.com>
Subject Re: Cassandra 1.1.4 performance issue
Date Wed, 10 Oct 2012 04:41:03 GMT
Dear Aaron,

Thank you so much for your help in resolving this issue. I have found that
READ latency has increased significantly. Please find the attached OpsCenter
graphs for further clarification.
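
One way to narrow down where the extra read latency is coming from is to look
at it per column family with nodetool (a sketch; the keyspace and column
family names below are placeholders):

$ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01 cfstats
$ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01 cfhistograms MyKeyspace MyColumnFamily

cfstats reports recent read and write latency for every column family, and
cfhistograms prints the latency distribution for a single one, which helps
tell a few slow outliers apart from a general slowdown.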




Thanks & Regards

Adeel Akbar

On 10/10/2012 12:26 AM, aaron morton wrote:
>> RF=2
> I would recommend moving to RF 3; QUORUM with RF 2 is 2.
>
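
For reference, QUORUM is floor(RF / 2) + 1:

  RF = 2  ->  quorum = floor(2/2) + 1 = 2  (both replicas must respond; no node can be down)
  RF = 3  ->  quorum = floor(3/2) + 1 = 2  (one replica can be down and QUORUM still succeeds)
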
>> We can't find anything in the cassandra logs indicating that 
>> something's up (such as a slow GC or compaction), and there's no 
>> corresponding traffic spike in the application either
> Does the CPU load correlate with compaction or repair times ?
>
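
A quick way to check that correlation when the CPU spikes (the host name is a
placeholder):

$ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01 compactionstats
$ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01 tpstats

compactionstats lists any compactions currently running and how far along
they are, and tpstats shows active/pending tasks per thread pool; a growing
pending count on ReadStage or MutationStage usually lines up with latency
spikes.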
> The node is not waiting on IO and is using all the available CPU, 
> which is a good thing. Have you seen an increase in latency ?
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 8/10/2012, at 10:25 PM, Adeel Akbar <adeel.akbar@panasiangroup.com 
> <mailto:adeel.akbar@panasiangroup.com>> wrote:
>
>> Hi,
>>
>> We're running a small Cassandra cluster (1.1.4) with two nodes, serving
>> data to our web and Java applications. After upgrading Cassandra from
>> 1.0.8 to 1.1.4, we're starting to see some weird issues.
>>
>> If we run the 'ring' command from the second node, it reports that it
>> failed to connect to port 7199 on node 1.
>>
>> $ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01 ring
>> Failed to connect to 'XX.XX.XX.01:7199': Connection refused
>>
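
nodetool talks to the node over JMX, which listens on port 7199 by default,
so "Connection refused" means JMX on node 1 is not reachable from the second
node. A couple of quick checks to run on node 1 (a sketch, assuming the
default install path):

$ grep JMX_PORT /opt/apache-cassandra-1.1.4/conf/cassandra-env.sh
$ netstat -tlnp | grep 7199

If nothing is listening on 7199, or it is bound only to 127.0.0.1, or a
firewall sits between the two nodes, nodetool from the other node will be
refused.
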
>> We're using a network monitoring system (NMS) and Monit to monitor the
>> servers. In the NMS the average CPU usage has increased to around 500% on
>> our quad-core Xeon servers with 16 GB RAM, and occasionally through Monit
>> we can see the 1-minute load average go above 7. Is this common? Does this
>> happen to everyone else? And why is the load so spiky? We can't find
>> anything in the Cassandra logs indicating that something's up (such as a
>> slow GC or compaction), and there's no corresponding traffic spike in the
>> application either. Should we just add more nodes if any single one gets
>> CPU spikes?
>>
>> Another explanation could be that we've configured it wrong. We're running
>> pretty much the default config and each node has 16 GB of RAM.
>>
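
For what it's worth, with the stock cassandra-env.sh the heap size is derived
from system memory (roughly max(min(1/2 RAM, 1 GB), min(1/4 RAM, 8 GB)),
which works out to about 4 GB on a 16 GB box). If you want to rule the
defaults out, the heap can be pinned explicitly in conf/cassandra-env.sh; the
values below are only an illustration, not a recommendation:

MAX_HEAP_SIZE="4G"
HEAP_NEWSIZE="800M"

(The usual rule of thumb for HEAP_NEWSIZE is about 100 MB per physical core.)
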
>> We have a single keyspace with 15 to 20 column families, RF=2, and 260 GB
>> of actual data. Please find below the top and iostat output for further
>> reference:
>>
>> top - 14:21:51 up 29 days,  9:52,  1 user,  load average: 6.59, 3.16, 1.42
>> Tasks: 163 total,   2 running, 161 sleeping,   0 stopped,   0 zombie
>> Cpu0  : 29.0%us,  0.0%sy,  0.0%ni, 71.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Cpu1  : 28.0%us,  0.0%sy,  0.0%ni, 72.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Cpu2  : 13.3%us,  0.0%sy,  0.0%ni, 86.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Cpu3  : 23.5%us,  0.7%sy,  0.0%ni, 75.5%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
>> Cpu4  : 89.4%us,  0.3%sy,  0.0%ni, 10.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
>> Cpu5  : 29.2%us,  0.0%sy,  0.0%ni, 70.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Cpu6  : 25.1%us,  0.0%sy,  0.0%ni, 74.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Cpu7  : 24.3%us,  0.0%sy,  0.0%ni, 72.0%id,  0.0%wa,  2.3%hi,  1.3%si,  0.0%st
>> Mem:  16427844k total, 16317416k used,   110428k free, 128824k buffers
>> Swap:        0k total,        0k used,        0k free, 11344696k cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM TIME+ COMMAND
>>  5284 root      18   0  265g 7.7g 3.6g S 266.6 49.0 474:24.38 java -ea -javaagent:/opt/apache-cassandra-1.1.4/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:Thr
>>     1 root      15   0 10368  660  548 S  0.0  0.0 0:01.64 init [3]
>>
>> # iostat -xmn 2 10
>> -x and -n options are mutually exclusive
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            9.77    0.03    0.54    0.98    0.00   88.68
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>> sda               0.59     3.97    5.54    0.42     0.20     0.02    75.52     0.11   19.10   3.55   2.11
>> sda1              0.00     0.00    0.01    0.00     0.00     0.00    88.69     0.00    1.36   1.31   0.00
>> sda2              0.59     3.97    5.53    0.42     0.20     0.02    75.51     0.11   19.12   3.55   2.11
>> sdb               1.54     7.82   10.39    0.64     0.28     0.03    57.77     0.36   32.61   4.27   4.70
>> sdb1              1.54     7.82   10.39    0.64     0.28     0.03    57.77     0.36   32.61   4.27   4.70
>> dm-0              0.00     0.00    1.73    0.62     0.02     0.00    19.27     0.02    6.75   0.90   0.21
>> dm-1              0.00     0.00   16.32   12.23     0.46     0.05    36.47     0.50   17.67   2.07   5.92
>> dm-2              0.00     0.00    0.00    0.00     0.00     0.00     8.00     0.00    7.10   3.41   0.00
>>
>> Device:         rMB_nor/s  wMB_nor/s  rMB_dir/s  wMB_dir/s  rMB_svr/s  wMB_svr/s    ops/s   rops/s   wops/s
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>           12.46    0.00    0.00    0.19    0.00   87.35
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>> sda               0.00     2.50    0.00    1.00     0.00     0.01    28.00     0.00    0.00   0.00   0.00
>> sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sda2              0.00     2.50    0.00    1.00     0.00     0.01    28.00     0.00    0.00   0.00   0.00
>> sdb               0.00     4.50    0.50    1.50     0.00     0.02    28.00     0.01    6.00   6.00   1.20
>> sdb1              0.00     4.50    0.50    1.50     0.00     0.02    28.00     0.01    6.00   6.00   1.20
>> dm-0              0.00     0.00    0.50    4.50     0.00     0.02     8.80     0.04    8.00   2.40   1.20
>> dm-1              0.00     0.00    0.00    5.00     0.00     0.02     8.00     0.00    0.00   0.00   0.00
>> dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>
>> Device:         rMB_nor/s  wMB_nor/s  rMB_dir/s  wMB_dir/s  rMB_svr/s  wMB_svr/s    ops/s   rops/s   wops/s
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>           12.52    0.00    0.00    0.00    0.00   87.48
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
>> sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> sdb1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>> dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>
>> Device:         rMB_nor/s  wMB_nor/s  rMB_dir/s  wMB_dir/s  rMB_svr/s  wMB_svr/s    ops/s   rops/s   wops/s
>>
>> Please help us improve the performance of the Cassandra cluster and fix
>> these issues.
>> -- 
>>
>>
>> Thanks & Regards
>>
>> Adeel Akbar
>>
>

