From: Thorvaldsson Justus <justus.thorvaldsson@svenskaspel.se>
To: user@cassandra.apache.org
Date: Tue, 27 Jul 2010 09:01:47 +0200
Subject: Re: what causes MESSAGE-DESERIALIZER-POOL to spike

AFAIK you could use more nodes and read from them in parallel, making your read rate go up. Also, not writing and reading to the same disk may help some. It's not so much about "Cassandra's" read rate as what your hardware can manage.

 

/Justus
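(For illustration only: a minimal sketch of the "more nodes, read in parallel" idea, fanning keys out over one connection per node with a bounded thread pool. NodeClient and its get() are placeholders for whatever client/Thrift calls the application really uses, not an actual Cassandra API.)

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

class NodeClient:
    """Placeholder for a real connection to one Cassandra node."""
    def __init__(self, host):
        self.host = host

    def get(self, key):
        # A real client would issue the read here (e.g. over Thrift).
        raise NotImplementedError

def parallel_read(keys, hosts, workers=16):
    # Round-robin keys over one client per node, bounded by a thread pool.
    clients = cycle([NodeClient(h) for h in hosts])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {key: pool.submit(next(clients).get, key) for key in keys}
        return {key: f.result() for key, f in futures.items()}
```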

 

From: Dathan Pattishall [mailto:dathanvp@gmail.com]
Sent: 27 July 2010 08:56
To: user@cassandra.apache.org
Subject: Re: what causes MESSAGE-DESERIALIZER-POOL to spike

 

 

On Mon, Jul 26, 2010 at 8:30 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

MDP is backing up because RRS is full at 4096.  This means you're not
able to process reads as quickly as the requests are coming in.  Make
whatever is doing those reads be less aggressive.


So, for Cassandra to function correctly I need to throttle my reads? What request rate is ideal? 100s of reads a second, 1000s? For me I would love to do 100s of thousands of reads a second. Is Cassandra not suited for this?
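("Less aggressive" usually just means bounding the read rate or concurrency on the client side. As a rough illustration, a tiny pacer like the sketch below caps requests per second; the 2000/s figure and read_key() are placeholders, not recommendations or a real API.)

```python
import time

class Pacer:
    """Tiny client-side throttle: spaces calls so at most `rate` happen per second."""
    def __init__(self, rate):
        self.min_interval = 1.0 / rate
        self.next_allowed = time.monotonic()

    def wait(self):
        now = time.monotonic()
        if now < self.next_allowed:
            time.sleep(self.next_allowed - now)
        self.next_allowed = max(now, self.next_allowed) + self.min_interval

# Usage sketch: cap reads at a placeholder 2000/s; read_key() stands in for the real call.
# pacer = Pacer(2000)
# for key in keys:
#     pacer.wait()
#     read_key(key)
```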

 

As to why the reads are slow in the first place, usually this means
you are disk i/o bound.  Posting your cfstats can help troubleshoot but is no substitute for thinking about your application workload.


How should I think about my application workload? I use Cassandra as a distributed hash table, accessing it by individual keys (O(1)). I randomly hit a node through an F5 load balancer, using the storage CF definition as defined in the sample storage-conf.xml. Each key is no more than 30 bytes, the value is a timestamp. I store a total of 20 million keys and update 1.5 million keys a day. Is there anything else I should really think about? What are the limitations in Cassandra that would affect this workload?
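(As a back-of-envelope on those numbers, assuming the updates are spread evenly over the day, which they may not be, the write load is tiny compared to the hoped-for read rate, so reads are the only thing worth sizing for:)

```python
total_keys    = 20_000_000           # keys stored
updates_daily = 1_500_000            # keys updated per day
key_bytes     = 30                   # max key size from the description

avg_writes_per_sec = updates_daily / 86_400        # seconds per day
raw_key_megabytes  = total_keys * key_bytes / 1e6  # keys only, before values/overhead

print(f"average write rate: ~{avg_writes_per_sec:.0f} writes/s")
print(f"raw key data: ~{raw_key_megabytes:.0f} MB (excluding values, indexes, overhead)")
```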
 



 


On Mon, Jul 26, 2010 at 12:32 PM, Anthony Molinaro
<anthonym@alumni.caltech.edu> wrote:
> It's usually I/O which causes backup in MESSAGE-DESERIALIZER-POOL.  You
> should check iostat and see what it looks like.  It may be that you
> need more nodes in order to deal with the read/write rate.  You can also
> use JMX to get latency values on reads and writes and see if the backup
> has a corresponding increase in latency.  You may be able to get more
> out of your hardware and memory with row caching but that really depends
> on your data set.
>
> -Anthony
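(To make "see if the backup has a corresponding increase" concrete: a small polling loop over the same nodetool tpstats output quoted below can track the Pending column for the interesting stages over time. The paths, host/port, and column layout are taken from this thread and may differ on other installs; this is only a sketch.)

```python
import subprocess, time

STAGES = ("ROW-READ-STAGE", "MESSAGE-DESERIALIZER-POOL")

def pending_counts():
    # Run the same command shown in this thread; adjust host/port/path as needed.
    out = subprocess.run(
        ["/opt/cassandra/bin/nodetool", "--host=localhost", "--port=8181", "tpstats"],
        capture_output=True, text=True, check=True).stdout
    counts = {}
    for line in out.splitlines():
        parts = line.split()
        if parts and parts[0] in STAGES:
            counts[parts[0]] = int(parts[2])  # columns: name, active, pending, completed
    return counts

if __name__ == "__main__":
    while True:
        print(time.strftime("%H:%M:%S"), pending_counts())
        time.sleep(10)
```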
>
> On Mon, Jul 26, 2010 at 12:22:46PM -0700, Dathan Pattishall wrote:
>> I have 4 nodes on enterprise-type hardware (lots of RAM: 12GB, 16 i7 cores,
>> RAID Disks).
>>
>> ~# /opt/cassandra/bin/nodetool --host=localhost --port=8181 tpstats
>> Pool Name                    Active   Pending      Completed
>> STREAM-STAGE                      0         0              0
>> RESPONSE-STAGE                    0         0         516280
>> ROW-READ-STAGE                    8      4096        1164326
>> LB-OPERATIONS                     0         0              0
>> *MESSAGE-DESERIALIZER-POOL        1    682008        1818682*
>> GMFD                              0         0           6467
>> LB-TARGET                         0         0              0
>> CONSISTENCY-MANAGER               0         0         661477
>> ROW-MUTATION-STAGE                0         0         998780
>> MESSAGE-STREAMING-POOL            0         0              0
>> LOAD-BALANCER-STAGE               0         0              0
>> FLUSH-SORTER-POOL                 0         0              0
>> MEMTABLE-POST-FLUSHER             0         0              4
>> FLUSH-WRITER-POOL                 0         0              4
>> AE-SERVICE-STAGE                  0         0              0
>> HINTED-HANDOFF-POOL               0         0              3
>>
>> EQX root@cass04:~# vmstat -n 1
>>
>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>>  r  b   swpd   free   buff    cache   si   so    bi    bo    in     cs us sy id wa st
>>  6 10   7096 121816  16244 10375492    0    0     1     3     0      0  5  1 94  0  0
>>  2 10   7096 116484  16248 10381144    0    0  5636     4 21210   9820  2  1 79 18  0
>>  1  9   7096 108920  16248 10387592    0    0  6216     0 21439   9878  2  1 81 16  0
>>  0  9   7096 129108  16248 10364852    0    0  6024     0 23280   8753  2  1 80 17  0
>>  2  9   7096 122460  16248 10370908    0    0  6072     0 20835   9461  2  1 83 14  0
>>  2  8   7096 115740  16260 10375752    0    0  5168   292 21049   9511  3  1 77 20  0
>>  1 10   7096 108424  16260 10382300    0    0  6244     0 21483   8981  2  1 75 22  0
>>  3  8   7096 125028  16260 10364104    0    0  5584     0 21238   9436  2  1 81 16  0
>>  3  9   7096 117928  16260 10370064    0    0  5988     0 21505  10225  2  1 77 19  0
>>  1  8   7096 109544  16260 10376640    0    0  6340    28 20840   8602  2  1 80 18  0
>>  0  9   7096 127028  16240 10357652    0    0  5984     0 20853   9158  2  1 79 18  0
>>  9  0   7096 121472  16240 10363492    0    0  5716     0 20520   8489  1  1 82 16  0
>>  3  9   7096 112668  16240 10369872    0    0  6404     0 21314   9459  2  1 84 13  0
>>  1  9   7096 127300  16236 10353440    0    0  5684     0 38914  10068  2  1 76 21  0
>>
>>
>> *But the 16 cores are hardly utilized, which indicates to me there is some
>> bad thread thrashing. But why?*
>>
>>
>>
>>   1  [|||||                                         8.3%]     Tasks: 1070 total, 1 running
>>   2  [                                              0.0%]     Load average: 8.34 9.05 8.82
>>   3  [                                              0.0%]     Uptime: 192 days(!), 15:29:52
>>   4  [|||||||||||                                  17.9%]
>>   5  [|||||                                         5.7%]
>>   6  [||                                            1.3%]
>>   7  [||                                            2.6%]
>>   8  [|                                             0.6%]
>>   9  [|                                             0.6%]
>>   10 [||                                            1.9%]
>>   11 [||                                            1.9%]
>>   12 [||                                            1.9%]
>>   13 [||                                            1.3%]
>>   14 [|                                             0.6%]
>>   15 [||                                            1.3%]
>>   16 [|                                             0.6%]
>>   Mem[||||||||||||||||||||||||||||||||||||||||||||1791/12028MB]
>>   Swp[|                                              6/1983MB]
>>
>>   PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
>> 30269 root      40   0 14100  2116   900 R  4.0  0.0  0:00.49 htop
>> 24878 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:23.03
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24879 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:22.93
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24874 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.73
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24880 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.93
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24875 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.17
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24658 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.06
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24877 root      40   0 20.6G 8345M 6883M S  2.0 69.4  1:23.43
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24873 root      40   0 20.6G 8345M 6883M D  1.0 69.4  1:23.65
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24876 root      40   0 20.6G 8345M 6883M S  1.0 69.4  1:23.62
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24942 root      40   0 20.6G 8345M 6883M S  1.0 69.4  0:23.50
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24943 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:29.53
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24933 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:22.57
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24939 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:12.73
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 25280 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:00.10
>> /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>
> --
> --------------------------------------------------------------------------
> Anthony Molinaro                          <anthonym@alumni.caltech.edu>
>


--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

 
