incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Cassandra 1.1.4 performance issue
Date Tue, 09 Oct 2012 19:26:56 GMT
> RF=2
I would recommend moving to RF 3. The QUORUM for RF 2 is 2, so every QUORUM request needs both replicas to be available.
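
For reference, the quorum size is floor(RF / 2) + 1. A minimal Python sketch of that arithmetic follows; the function names are illustrative only and are not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM requests still succeed."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    # Compare the current setting (RF 2) with the suggested one (RF 3).
    for rf in (2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerated failures={tolerated_failures(rf)}")
```

With RF 2 no replica can be down for a QUORUM operation; with RF 3 one can.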

> We can't find anything in the cassandra logs indicating that something's up (such as
> a slow GC or compaction), and there's no corresponding traffic spike in the application either
Does the CPU load correlate with compaction or repair times?

The node is not waiting on IO and is using all the available CPU, which is a good thing. Have
you seen an increase in latency?
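
To check the IO-wait reading against the per-CPU lines in the `top` output quoted below, here is a rough Python sketch. The helper name `summarize_cpu_lines` and the regular expression are my own assumptions about that output format, not part of any tool.

```python
import re

# Matches per-CPU lines as printed by top, e.g.
# "Cpu0  : 29.0%us,  0.0%sy,  0.0%ni, 71.0%id,  0.0%wa, ..."
CPU_LINE = re.compile(
    r"Cpu(?P<cpu>\d+)\s*:\s*(?P<us>[\d.]+)%us,\s*(?P<sy>[\d.]+)%sy,.*?"
    r"(?P<id>[\d.]+)%id,\s*(?P<wa>[\d.]+)%wa"
)


def summarize_cpu_lines(text: str) -> dict:
    """Average user/idle time and the worst IO wait across the CPU lines found."""
    rows = [m.groupdict() for m in CPU_LINE.finditer(text)]
    if not rows:
        raise ValueError("no per-CPU lines found")
    count = len(rows)
    return {
        "cpus": count,
        "avg_user": sum(float(r["us"]) for r in rows) / count,
        "avg_idle": sum(float(r["id"]) for r in rows) / count,
        "max_iowait": max(float(r["wa"]) for r in rows),
    }


if __name__ == "__main__":
    # Two lines copied from the quoted top output; paste all of them in practice.
    sample = (
        "Cpu0  : 29.0%us,  0.0%sy,  0.0%ni, 71.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st\n"
        "Cpu4  : 89.4%us,  0.3%sy,  0.0%ni, 10.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st\n"
    )
    print(summarize_cpu_lines(sample))
```

A `max_iowait` near zero while user time is high suggests the work is CPU bound rather than disk bound.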

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 8/10/2012, at 10:25 PM, Adeel Akbar <adeel.akbar@panasiangroup.com> wrote:

> Hi,
> 
> We're running a small Cassandra cluster (1.1.4) with two nodes, serving data to our
> Web and Java applications. Since upgrading Cassandra from 1.0.8 to 1.1.4, we've started
> to see some weird issues.
> 
> If we run the 'ring' command from the second node, it shows that it failed to connect to
> port 7199 on node 1.
> 
> $ /opt/apache-cassandra-1.1.4/bin/nodetool -h XX.XX.XX.01  ring
> Failed to connect to 'XX.XX.XX.01:7199': Connection refused
> 
> We're using a Network Monitoring System (NMS) and Monit to monitor the servers. In the NMS,
> average CPU usage has increased to around 500% on our quad-core Xeon servers with 16 GB
> of RAM, and occasionally Monit shows the 1-min load average going above 7. Is
> this common? Does this happen to everyone else? And why is the load so spiky? We can't find
> anything in the cassandra logs indicating that something's up (such as a slow GC or compaction),
> and there's no corresponding traffic spike in the application either. Should we just add more
> nodes if any single one gets CPU spikes?
> 
> Another explanation could also be that we've configured it wrong. We're running pretty
much default config and each node has 16G of RAM.
> 
> We have a single keyspace with 15 to 20 column families, RF=2, and 260 GB of actual data.
> The top and I/O stats are below for reference:
> 
> top - 14:21:51 up 29 days,  9:52,  1 user,  load average: 6.59, 3.16, 1.42
> Tasks: 163 total,   2 running, 161 sleeping,   0 stopped,   0 zombie
> Cpu0  : 29.0%us,  0.0%sy,  0.0%ni, 71.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu1  : 28.0%us,  0.0%sy,  0.0%ni, 72.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu2  : 13.3%us,  0.0%sy,  0.0%ni, 86.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu3  : 23.5%us,  0.7%sy,  0.0%ni, 75.5%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Cpu4  : 89.4%us,  0.3%sy,  0.0%ni, 10.0%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Cpu5  : 29.2%us,  0.0%sy,  0.0%ni, 70.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu6  : 25.1%us,  0.0%sy,  0.0%ni, 74.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu7  : 24.3%us,  0.0%sy,  0.0%ni, 72.0%id,  0.0%wa,  2.3%hi,  1.3%si,  0.0%st
> Mem:  16427844k total, 16317416k used,   110428k free,   128824k buffers
> Swap:        0k total,        0k used,        0k free, 11344696k cached
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  5284 root      18   0  265g 7.7g 3.6g S 266.6 49.0 474:24.38 java -ea -javaagent:/opt/apache-cassandra-1.1.4/bin/../lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:Thr
>     1 root      15   0 10368  660  548 S  0.0  0.0   0:01.64 init [3]
> 
> # iostat -xmn 2 10
> -x and -n options are mutually exclusive
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            9.77    0.03    0.54    0.98    0.00   88.68
> 
> Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.59     3.97  5.54  0.42     0.20     0.02    75.52     0.11   19.10   3.55   2.11
> sda1              0.00     0.00  0.01  0.00     0.00     0.00    88.69     0.00    1.36   1.31   0.00
> sda2              0.59     3.97  5.53  0.42     0.20     0.02    75.51     0.11   19.12   3.55   2.11
> sdb               1.54     7.82 10.39  0.64     0.28     0.03    57.77     0.36   32.61   4.27   4.70
> sdb1              1.54     7.82 10.39  0.64     0.28     0.03    57.77     0.36   32.61   4.27   4.70
> dm-0              0.00     0.00  1.73  0.62     0.02     0.00    19.27     0.02    6.75   0.90   0.21
> dm-1              0.00     0.00 16.32 12.23     0.46     0.05    36.47     0.50   17.67   2.07   5.92
> dm-2              0.00     0.00  0.00  0.00     0.00     0.00     8.00     0.00    7.10   3.41   0.00
> 
> Device:                   rMB_nor/s    wMB_nor/s    rMB_dir/s    wMB_dir/s    rMB_svr/s    wMB_svr/s     ops/s    rops/s    wops/s
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           12.46    0.00    0.00    0.19    0.00   87.35
> 
> Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00     2.50  0.00  1.00     0.00     0.01    28.00     0.00    0.00   0.00   0.00
> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> sda2              0.00     2.50  0.00  1.00     0.00     0.01    28.00     0.00    0.00   0.00   0.00
> sdb               0.00     4.50  0.50  1.50     0.00     0.02    28.00     0.01    6.00   6.00   1.20
> sdb1              0.00     4.50  0.50  1.50     0.00     0.02    28.00     0.01    6.00   6.00   1.20
> dm-0              0.00     0.00  0.50  4.50     0.00     0.02     8.80     0.04    8.00   2.40   1.20
> dm-1              0.00     0.00  0.00  5.00     0.00     0.02     8.00     0.00    0.00   0.00   0.00
> dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> 
> Device:                   rMB_nor/s    wMB_nor/s    rMB_dir/s    wMB_dir/s    rMB_svr/s    wMB_svr/s     ops/s    rops/s    wops/s
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           12.52    0.00    0.00    0.00    0.00   87.48
> 
> Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> sdb               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> sdb1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> dm-0              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> dm-1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> dm-2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
> 
> Device:                   rMB_nor/s    wMB_nor/s    rMB_dir/s    wMB_dir/s    rMB_svr/s    wMB_svr/s     ops/s    rops/s    wops/s
> 
> Please help us improve the performance of our Cassandra cluster and fix these issues.

> -- 
> 
> Thanks & Regards
> 
> Adeel Akbar
> 

