incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jonathan Ellis <jbel...@gmail.com>
Subject Re: what causes MESSAGE-DESERIALIZER-POOL to spike
Date Wed, 04 Aug 2010 18:46:14 GMT
No, MDP is backing up because Row-Read-Stage [the stage after MDP on
reads] is full at 4096, meaning you're not able to process reads as
quickly as the requests are coming in.

On Wed, Aug 4, 2010 at 2:21 PM, Mike Malone <mike@simplegeo.com> wrote:
> This may be your
> problem: https://issues.apache.org/jira/browse/CASSANDRA-1358
> The message deserializer executor is being created with a core pool size of
> 1. Since it uses a queue with unbounded capacity new requests are always
> queued and the thread pool never grows. So the message deserializer becomes
> a single-threaded bottleneck through which all traffic must pass. So your 16
> cores are reduced to one core for handling all inter-node communication (and
> any intra-node communication that's being passed through the messaging
> service).
> Mike
>
> On Tue, Aug 3, 2010 at 10:02 PM, Dathan Pattishall <dathanvp@gmail.com>
> wrote:
>>
>> The output of htop shows threads as procs with a breakdown of how much cpu
>> /etc per thread (in ncurses color!). All of these Java "procs" are just Java
>> threads of only 1 instance of Cassandra per Server.
>>
>> On Sat, Jul 31, 2010 at 3:45 PM, Benjamin Black <b@b3k.us> wrote:
>>>
>>> Sorry, I just noticed: are you running 14 instances of Cassandra on a
>>> single physical machine or are all those java processes something
>>> else?
>>>
>>> On Mon, Jul 26, 2010 at 12:22 PM, Dathan Pattishall <dathanvp@gmail.com>
>>> wrote:
>>> > I have 4 nodes on enterprise type hardware (Lots of Ram 12GB, 16 i7
>>> > cores,
>>> > RAID Disks).
>>> >
>>> > ~# /opt/cassandra/bin/nodetool --host=localhost --port=8181 tpstats
>>> > Pool Name                    Active   Pending     
Completed
>>> > STREAM-STAGE                      0        
0              0
>>> > RESPONSE-STAGE                    0         0        
516280
>>> > ROW-READ-STAGE                    8      4096       
1164326
>>> > LB-OPERATIONS                     0        
0              0
>>> > MESSAGE-DESERIALIZER-POOL         1    682008        1818682
>>> > GMFD                              0        
0           6467
>>> > LB-TARGET                         0        
0              0
>>> > CONSISTENCY-MANAGER               0         0        
661477
>>> > ROW-MUTATION-STAGE                0         0        
998780
>>> > MESSAGE-STREAMING-POOL            0         0             
0
>>> > LOAD-BALANCER-STAGE               0         0             
0
>>> > FLUSH-SORTER-POOL                 0         0             
0
>>> > MEMTABLE-POST-FLUSHER             0         0             
4
>>> > FLUSH-WRITER-POOL                 0         0             
4
>>> > AE-SERVICE-STAGE                  0         0             
0
>>> > HINTED-HANDOFF-POOL               0         0             
3
>>> >
>>> > EQX root@cass04:~# vmstat -n 1
>>> >
>>> > procs -----------memory---------- ---swap-- -----io---- --system--
>>> > -----cpu------
>>> >  r  b   swpd   free   buff  cache   si   so    bi   
bo   in   cs us sy
>>> > id
>>> > wa st
>>> >  6 10   7096 121816  16244 10375492    0    0     1    
3    0    0  5
>>> > 1
>>> > 94  0  0
>>> >  2 10   7096 116484  16248 10381144    0    0  5636     4
21210 9820
>>> > 2  1
>>> > 79 18  0
>>> >  1  9   7096 108920  16248 10387592    0    0  6216    
0 21439 9878
>>> > 2  1
>>> > 81 16  0
>>> >  0  9   7096 129108  16248 10364852    0    0  6024    
0 23280 8753
>>> > 2  1
>>> > 80 17  0
>>> >  2  9   7096 122460  16248 10370908    0    0  6072    
0 20835 9461
>>> > 2  1
>>> > 83 14  0
>>> >  2  8   7096 115740  16260 10375752    0    0  5168   292
21049 9511
>>> > 3  1
>>> > 77 20  0
>>> >  1 10   7096 108424  16260 10382300    0    0  6244     0
21483 8981
>>> > 2  1
>>> > 75 22  0
>>> >  3  8   7096 125028  16260 10364104    0    0  5584    
0 21238 9436
>>> > 2  1
>>> > 81 16  0
>>> >  3  9   7096 117928  16260 10370064    0    0  5988    
0 21505 10225
>>> > 2  1
>>> > 77 19  0
>>> >  1  8   7096 109544  16260 10376640    0    0  6340    28
20840 8602
>>> > 2  1
>>> > 80 18  0
>>> >  0  9   7096 127028  16240 10357652    0    0  5984    
0 20853 9158
>>> > 2  1
>>> > 79 18  0
>>> >  9  0   7096 121472  16240 10363492    0    0  5716    
0 20520 8489
>>> > 1  1
>>> > 82 16  0
>>> >  3  9   7096 112668  16240 10369872    0    0  6404    
0 21314 9459
>>> > 2  1
>>> > 84 13  0
>>> >  1  9   7096 127300  16236 10353440    0    0  5684    
0 38914 10068
>>> > 2  1
>>> > 76 21  0
>>> >
>>> >
>>> > But the 16 cores are hardly utilized. Which indicates to me there is
>>> > some
>>> > bad thread thrashing, but why?
>>> >
>>> >
>>> >
>>> >   1  [|||||                                              
8.3%]
>>> > Tasks:
>>> > 1070 total, 1 running
>>> >   2  [                                                   
0.0%]
>>> > Load
>>> > average: 8.34 9.05 8.82
>>> >   3  [                                                   
0.0%]
>>> > Uptime:
>>> > 192 days(!), 15:29:52
>>> >   4  [|||||||||||                                       
17.9%]
>>> >   5  [|||||                                              
5.7%]
>>> >   6  [||                                                 
1.3%]
>>> >   7  [||                                                 
2.6%]
>>> >   8  [|                                                  
0.6%]
>>> >   9  [|                                                  
0.6%]
>>> >   10 [||                                                 
1.9%]
>>> >   11 [||                                                 
1.9%]
>>> >   12 [||                                                 
1.9%]
>>> >   13 [||                                                 
1.3%]
>>> >   14 [|                                                  
0.6%]
>>> >   15 [||                                                 
1.3%]
>>> >   16 [|                                                  
0.6%]
>>> >   Mem[||||||||||||||||||||||||||||||||||||||||||||1791/12028MB]
>>> >   Swp[|                                              
6/1983MB]
>>> >
>>> >   PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+ 
Command
>>> > 30269 root      40   0 14100  2116   900 R  4.0  0.0  0:00.49
htop
>>> > 24878 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:23.03
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24879 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:22.93
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24874 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.73
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24880 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.93
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24875 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.17
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24658 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.06
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24877 root      40   0 20.6G 8345M 6883M S  2.0 69.4  1:23.43
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24873 root      40   0 20.6G 8345M 6883M D  1.0 69.4  1:23.65
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24876 root      40   0 20.6G 8345M 6883M S  1.0 69.4  1:23.62
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24942 root      40   0 20.6G 8345M 6883M S  1.0 69.4  0:23.50
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24943 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:29.53
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24933 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:22.57
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24939 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:12.73
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 25280 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:00.10
>>> > /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> >
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Mime
View raw message