incubator-cassandra-user mailing list archives

From Anthony Molinaro <antho...@alumni.caltech.edu>
Subject Re: what causes MESSAGE-DESERIALIZER-POOL to spike
Date Mon, 26 Jul 2010 19:32:22 GMT
It's usually I/O that causes a backup in MESSAGE-DESERIALIZER-POOL. You
should check iostat and see what it looks like. It may be that you
need more nodes to handle the read/write rate. You can also use JMX to
get latency values for reads and writes and see whether the backup
coincides with an increase in latency. You may be able to get more
out of your hardware and memory with row caching, but that really
depends on your data set.
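
To pick out the backed-up pools from tpstats, something like the
following could work. It is only a minimal sketch and not part of
Cassandra: the function name backed_up_pools and the pending_threshold
cutoff are hypothetical, and the column layout (pool name, Active,
Pending, Completed) is assumed from the tpstats output quoted below.

```python
import re

# Assumed row layout, taken from the tpstats output quoted in this thread:
# POOL-NAME  <active>  <pending>  <completed>, optionally wrapped in '*'.
ROW = re.compile(r"^\*?([A-Z][A-Z-]*)\s+(\d+)\s+(\d+)\s+(\d+)\*?$")


def backed_up_pools(tpstats_text, pending_threshold=1000):
    """Return (pool, active, pending, completed) tuples whose pending
    count exceeds pending_threshold, largest backlog first."""
    rows = []
    for line in tpstats_text.splitlines():
        # Drop mail quote markers and surrounding whitespace, if any.
        line = line.lstrip("> ").strip()
        match = ROW.match(line)
        if match is None:
            continue  # header line, blank line, or unrelated text
        pool = match.group(1)
        active, pending, completed = (int(match.group(i)) for i in (2, 3, 4))
        if pending > pending_threshold:
            rows.append((pool, active, pending, completed))
    # Sort by pending count so the worst backlog comes first.
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Fed the tpstats output below, this would report
MESSAGE-DESERIALIZER-POOL first and ROW-READ-STAGE second. The cutoff
of 1000 is arbitrary; pick whatever is normal for your cluster.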

-Anthony

On Mon, Jul 26, 2010 at 12:22:46PM -0700, Dathan Pattishall wrote:
> I have 4 nodes on enterprise-type hardware (lots of RAM: 12GB, 16 i7
> cores, RAID disks).
> 
> ~# /opt/cassandra/bin/nodetool --host=localhost --port=8181 tpstats
> Pool Name                    Active   Pending      Completed
> STREAM-STAGE                      0         0              0
> RESPONSE-STAGE                    0         0         516280
> ROW-READ-STAGE                    8      4096        1164326
> LB-OPERATIONS                     0         0              0
> *MESSAGE-DESERIALIZER-POOL         1    682008        1818682*
> GMFD                              0         0           6467
> LB-TARGET                         0         0              0
> CONSISTENCY-MANAGER               0         0         661477
> ROW-MUTATION-STAGE                0         0         998780
> MESSAGE-STREAMING-POOL            0         0              0
> LOAD-BALANCER-STAGE               0         0              0
> FLUSH-SORTER-POOL                 0         0              0
> MEMTABLE-POST-FLUSHER             0         0              4
> FLUSH-WRITER-POOL                 0         0              4
> AE-SERVICE-STAGE                  0         0              0
> HINTED-HANDOFF-POOL               0         0              3
> 
> EQX root@cass04:~# vmstat -n 1
> 
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>  6 10   7096 121816  16244 10375492    0    0     1     3    0    0  5  1 94  0  0
>  2 10   7096 116484  16248 10381144    0    0  5636     4 21210 9820  2  1 79 18  0
>  1  9   7096 108920  16248 10387592    0    0  6216     0 21439 9878  2  1 81 16  0
>  0  9   7096 129108  16248 10364852    0    0  6024     0 23280 8753  2  1 80 17  0
>  2  9   7096 122460  16248 10370908    0    0  6072     0 20835 9461  2  1 83 14  0
>  2  8   7096 115740  16260 10375752    0    0  5168   292 21049 9511  3  1 77 20  0
>  1 10   7096 108424  16260 10382300    0    0  6244     0 21483 8981  2  1 75 22  0
>  3  8   7096 125028  16260 10364104    0    0  5584     0 21238 9436  2  1 81 16  0
>  3  9   7096 117928  16260 10370064    0    0  5988     0 21505 10225  2  1 77 19  0
>  1  8   7096 109544  16260 10376640    0    0  6340    28 20840 8602  2  1 80 18  0
>  0  9   7096 127028  16240 10357652    0    0  5984     0 20853 9158  2  1 79 18  0
>  9  0   7096 121472  16240 10363492    0    0  5716     0 20520 8489  1  1 82 16  0
>  3  9   7096 112668  16240 10369872    0    0  6404     0 21314 9459  2  1 84 13  0
>  1  9   7096 127300  16236 10353440    0    0  5684     0 38914 10068  2  1 76 21  0
> 
> 
> *But the 16 cores are hardly utilized, which indicates to me that there
> is some bad thread thrashing. But why?*
> 
> 
> 
>   1  [|||||                                               8.3%]     Tasks: 1070 total, 1 running
>   2  [                                                    0.0%]     Load average: 8.34 9.05 8.82
>   3  [                                                    0.0%]     Uptime: 192 days(!), 15:29:52
>   4  [|||||||||||                                        17.9%]
>   5  [|||||                                               5.7%]
>   6  [||                                                  1.3%]
>   7  [||                                                  2.6%]
>   8  [|                                                   0.6%]
>   9  [|                                                   0.6%]
>   10 [||                                                  1.9%]
>   11 [||                                                  1.9%]
>   12 [||                                                  1.9%]
>   13 [||                                                  1.3%]
>   14 [|                                                   0.6%]
>   15 [||                                                  1.3%]
>   16 [|                                                   0.6%]
>   Mem[||||||||||||||||||||||||||||||||||||||||||||1791/12028MB]
>   Swp[|                                               6/1983MB]
> 
>   PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
> 30269 root      40   0 14100  2116   900 R  4.0  0.0  0:00.49 htop
> 24878 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:23.03 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24879 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24874 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24880 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24875 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.17 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24658 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.06 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24877 root      40   0 20.6G 8345M 6883M S  2.0 69.4  1:23.43 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24873 root      40   0 20.6G 8345M 6883M D  1.0 69.4  1:23.65 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24876 root      40   0 20.6G 8345M 6883M S  1.0 69.4  1:23.62 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24942 root      40   0 20.6G 8345M 6883M S  1.0 69.4  0:23.50 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24943 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:29.53 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24933 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:22.57 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 24939 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:12.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
> 25280 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:00.10 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <anthonym@alumni.caltech.edu>
