So after 4096 messages get pushed onto the row-read-stage queue (or any other multiThreadedStage), the deserializer basically becomes a single-threaded blocking queue that prevents any other inter-node RPC from occurring? Sounds like it's a problem either way. If the row-read stage is what's backed up, why not have the messages stack up on that stage?
No, MDP is backing up because Row-Read-Stage [the stage after MDP on
reads] is full at 4096, meaning you're not able to process reads as
quickly as the requests are coming in.
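
To make the backup described above concrete, here is a rough tick-based model, not Cassandra code. It assumes (hypothetically) that the deserializer can only hand a message to the read stage when that stage's queue has room, and that anything it cannot hand off stays in its own unbounded queue. The class name, method name, and the per-tick arrival and service rates are made up for illustration; only the 4096 limit comes from the tpstats output in this thread.

```java
final class BackpressureSketch {
    /**
     * Simulates `ticks` time steps and returns {upstreamPending, downstreamPending}.
     * upstream   = unbounded queue in front of the single deserializer thread (assumed model)
     * downstream = bounded queue of the read stage (capacity taken from the thread: 4096)
     */
    public static long[] simulate(int ticks, int arrivalsPerTick,
                                  int servicedPerTick, int downstreamCapacity) {
        long upstream = 0;
        long downstream = 0;
        for (int t = 0; t < ticks; t++) {
            // New requests arrive from other nodes.
            upstream += arrivalsPerTick;
            // The read stage finishes as much work as it can this tick.
            downstream -= Math.min(downstream, servicedPerTick);
            // The deserializer hands off only what fits; the rest stays upstream.
            long moved = Math.min(upstream, downstreamCapacity - downstream);
            upstream -= moved;
            downstream += moved;
        }
        return new long[] {upstream, downstream};
    }

    public static void main(String[] args) {
        // Hypothetical rates: 100 requests arrive per tick, 60 reads complete per tick.
        long[] pending = simulate(1000, 100, 60, 4096);
        System.out.println("deserializer pending: " + pending[0]);
        System.out.println("read stage pending:   " + pending[1]);
    }
}
```

In this model the read-stage backlog stops at its capacity while the deserializer backlog keeps growing for as long as requests arrive faster than reads complete, which is the same shape as the ROW-READ-STAGE and MESSAGE-DESERIALIZER-POOL pending counts quoted below.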
On Wed, Aug 4, 2010 at 2:21 PM, Mike Malone <mike@simplegeo.com> wrote:
> This may be your problem:
> https://issues.apache.org/jira/browse/CASSANDRA-1358
> The message deserializer executor is being created with a core pool size of
> 1. Since it uses a queue with unbounded capacity, new requests are always
> queued and the thread pool never grows. So the message deserializer becomes
> a single-threaded bottleneck through which all traffic must pass, and your 16
> cores are reduced to one core for handling all inter-node communication (and
> any intra-node communication that's being passed through the messaging
> service).
> Mike
>
> On Tue, Aug 3, 2010 at 10:02 PM, Dathan Pattishall <dathanvp@gmail.com>
> wrote:
>>
>> The output of htop shows threads as procs, with a breakdown of CPU usage
>> etc. per thread (in ncurses color!). All of these Java "procs" are just
>> threads of a single Cassandra instance; there is only one instance per server.
>>
>> On Sat, Jul 31, 2010 at 3:45 PM, Benjamin Black <b@b3k.us> wrote:
>>>
>>> Sorry, I just noticed: are you running 14 instances of Cassandra on a
>>> single physical machine or are all those java processes something
>>> else?
>>>
>>> On Mon, Jul 26, 2010 at 12:22 PM, Dathan Pattishall <dathanvp@gmail.com>
>>> wrote:
>>> > I have 4 nodes on enterprise-type hardware (lots of RAM: 12GB, 16 i7
>>> > cores, RAID disks).
>>> >
>>> > ~# /opt/cassandra/bin/nodetool --host=localhost --port=8181 tpstats
>>> > Pool Name                   Active   Pending   Completed
>>> > STREAM-STAGE                     0         0           0
>>> > RESPONSE-STAGE                   0         0      516280
>>> > ROW-READ-STAGE                   8      4096     1164326
>>> > LB-OPERATIONS                    0         0           0
>>> > MESSAGE-DESERIALIZER-POOL        1    682008     1818682
>>> > GMFD                             0         0        6467
>>> > LB-TARGET                        0         0           0
>>> > CONSISTENCY-MANAGER              0         0      661477
>>> > ROW-MUTATION-STAGE               0         0      998780
>>> > MESSAGE-STREAMING-POOL           0         0           0
>>> > LOAD-BALANCER-STAGE              0         0           0
>>> > FLUSH-SORTER-POOL                0         0           0
>>> > MEMTABLE-POST-FLUSHER            0         0           4
>>> > FLUSH-WRITER-POOL                0         0           4
>>> > AE-SERVICE-STAGE                 0         0           0
>>> > HINTED-HANDOFF-POOL              0         0           3
>>> >
>>> > EQX root@cass04:~# vmstat -n 1
>>> >
>>> > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>>> >  r  b swpd   free  buff    cache si so   bi  bo    in    cs us sy id wa st
>>> >  6 10 7096 121816 16244 10375492  0  0    1   3     0     0  5  1 94  0  0
>>> >  2 10 7096 116484 16248 10381144  0  0 5636   4 21210  9820  2  1 79 18  0
>>> >  1  9 7096 108920 16248 10387592  0  0 6216   0 21439  9878  2  1 81 16  0
>>> >  0  9 7096 129108 16248 10364852  0  0 6024   0 23280  8753  2  1 80 17  0
>>> >  2  9 7096 122460 16248 10370908  0  0 6072   0 20835  9461  2  1 83 14  0
>>> >  2  8 7096 115740 16260 10375752  0  0 5168 292 21049  9511  3  1 77 20  0
>>> >  1 10 7096 108424 16260 10382300  0  0 6244   0 21483  8981  2  1 75 22  0
>>> >  3  8 7096 125028 16260 10364104  0  0 5584   0 21238  9436  2  1 81 16  0
>>> >  3  9 7096 117928 16260 10370064  0  0 5988   0 21505 10225  2  1 77 19  0
>>> >  1  8 7096 109544 16260 10376640  0  0 6340  28 20840  8602  2  1 80 18  0
>>> >  0  9 7096 127028 16240 10357652  0  0 5984   0 20853  9158  2  1 79 18  0
>>> >  9  0 7096 121472 16240 10363492  0  0 5716   0 20520  8489  1  1 82 16  0
>>> >  3  9 7096 112668 16240 10369872  0  0 6404   0 21314  9459  2  1 84 13  0
>>> >  1  9 7096 127300 16236 10353440  0  0 5684   0 38914 10068  2  1 76 21  0
>>> >
>>> >
>>> > But the 16 cores are hardly utilized, which indicates to me there is
>>> > some bad thread thrashing, but why?
>>> >
>>> >
>>> >
>>> > 1 [||||| 8.3%]          Tasks: 1070 total, 1 running
>>> > 2 [ 0.0%]               Load average: 8.34 9.05 8.82
>>> > 3 [ 0.0%]               Uptime: 192 days(!), 15:29:52
>>> > 4 [||||||||||| 17.9%]
>>> > 5 [||||| 5.7%]
>>> > 6 [|| 1.3%]
>>> > 7 [|| 2.6%]
>>> > 8 [| 0.6%]
>>> > 9 [| 0.6%]
>>> > 10 [|| 1.9%]
>>> > 11 [|| 1.9%]
>>> > 12 [|| 1.9%]
>>> > 13 [|| 1.3%]
>>> > 14 [| 0.6%]
>>> > 15 [|| 1.3%]
>>> > 16 [| 0.6%]
>>> > Mem[||||||||||||||||||||||||||||||||||||||||||||1791/12028MB]
>>> > Swp[| 6/1983MB]
>>> >
>>> > PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
>>> > 30269 root 40 0 14100 2116 900 R 4.0 0.0 0:00.49 htop
>>> > 24878 root 40 0 20.6G 8345M 6883M D 3.0 69.4 1:23.03 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24879 root 40 0 20.6G 8345M 6883M D 3.0 69.4 1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24874 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:22.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24880 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24875 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:23.17 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24658 root 40 0 20.6G 8345M 6883M D 2.0 69.4 1:23.06 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24877 root 40 0 20.6G 8345M 6883M S 2.0 69.4 1:23.43 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24873 root 40 0 20.6G 8345M 6883M D 1.0 69.4 1:23.65 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24876 root 40 0 20.6G 8345M 6883M S 1.0 69.4 1:23.62 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24942 root 40 0 20.6G 8345M 6883M S 1.0 69.4 0:23.50 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24943 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:29.53 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24933 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:22.57 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 24939 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:12.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> > 25280 root 40 0 20.6G 8345M 6883M S 0.0 69.4 0:00.10 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>>> >
>>
>
>
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
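
For anyone who wants to see the executor behavior Mike describes in the quoted message (core pool size of 1 plus an unbounded queue means the pool never grows), here is a small standalone sketch using the plain JDK `ThreadPoolExecutor`. It is not the Cassandra source; the class name, the maximum pool size of 16, and the task counts are assumptions chosen for illustration. Every task blocks on a latch so the pool state can be inspected before anything completes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {
    /** Submits `tasks` tasks that all block, then returns {pool size, queued tasks}. */
    public static int[] submitBlocked(int core, int max,
                                      BlockingQueue<Runnable> queue, int tasks) {
        ThreadPoolExecutor pool =
                new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // hold the worker so nothing drains the queue
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Snapshot while every worker is still blocked.
        int[] result = {pool.getPoolSize(), pool.getQueue().size()};
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        // Unbounded queue: offers never fail, so no thread beyond the core one is created.
        int[] unbounded = submitBlocked(1, 16, new LinkedBlockingQueue<>(), 100);
        System.out.println("unbounded: threads=" + unbounded[0] + " queued=" + unbounded[1]);

        // Bounded queue: once it is full, the executor adds threads up to the maximum.
        int[] bounded = submitBlocked(1, 16, new ArrayBlockingQueue<>(10), 20);
        System.out.println("bounded:   threads=" + bounded[0] + " queued=" + bounded[1]);
    }
}
```

With the unbounded queue the pool stays at one thread and the other 99 tasks sit in the queue; with the bounded queue of 10, the same executor grows to 10 threads once the queue fills. A `ThreadPoolExecutor` only creates threads beyond the core size when the queue rejects an offer, so raising the maximum pool size alone does nothing when the queue is unbounded.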