incubator-cassandra-user mailing list archives

From Thorvaldsson Justus <justus.thorvalds...@svenskaspel.se>
Subject Re: what causes MESSAGE-DESERIALIZER-POOL to spike
Date Tue, 27 Jul 2010 07:01:47 GMT
AFAIK you could use more nodes and read from them in parallel, which raises your read rate.
Also, not writing and reading to the same disk may help some. It's not so much about Cassandra's
read rate as what your hardware can manage.

/Justus

From: Dathan Pattishall [mailto:dathanvp@gmail.com]
Sent: 27 July 2010 08:56
To: user@cassandra.apache.org
Subject: Re: what causes MESSAGE-DESERIALIZER-POOL to spike


On Mon, Jul 26, 2010 at 8:30 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
MDP (MESSAGE-DESERIALIZER-POOL) is backing up because RRS (ROW-READ-STAGE) is full at 4096.
This means you're not able to process reads as quickly as the requests are coming in.
Make whatever is doing those reads less aggressive.

So, for Cassandra to function correctly I need to throttle my reads? What request rate is
ideal? 100s of reads a second? 1000s? I would love to do 100s of thousands of reads a
second. Is Cassandra not suited for this?
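
If throttling does turn out to be necessary, it has to live in the client rather than in Cassandra itself. Below is a minimal sketch of a token-bucket gate that could sit in front of each read call; the class and its names are illustrative only and not part of Cassandra or any client library.

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Token-bucket style gate: at most readsPerSecond beforeRead() calls pass per second. */
public class ReadThrottle {
    private final Semaphore permits;

    public ReadThrottle(final int readsPerSecond) {
        this.permits = new Semaphore(readsPerSecond);
        Thread refiller = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    TimeUnit.SECONDS.sleep(1);
                } catch (InterruptedException e) {
                    return;
                }
                // Top the bucket back up to the per-second budget.
                int missing = readsPerSecond - permits.availablePermits();
                if (missing > 0) {
                    permits.release(missing);
                }
            }
        });
        refiller.setDaemon(true);
        refiller.start();
    }

    /** Blocks until this read fits inside the configured rate. */
    public void beforeRead() throws InterruptedException {
        permits.acquire();
    }
}

Callers would invoke beforeRead() ahead of each fetch, so the outgoing request rate never exceeds the configured budget.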


As to why the reads are slow in the first place, usually this means
you are disk I/O bound.  Posting your cfstats can help troubleshoot
but is no substitute for thinking about your application workload.

How should I think about my application workload? I use Cassandra as a distributed hash table,
accessing it by individual keys (O(1) lookups). I randomly hit a node through an F5 load balancer,
using the CF definition from the sample storage-conf.xml. Each key is no
more than 30 bytes and the value is a timestamp. I store a total of 20 million keys and update
1.5 million keys a day. Is there anything else I should really think about? What are the limitations
in Cassandra that would affect this workload?
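
For a rough sense of scale, here is a back-of-the-envelope sizing of that workload; the per-row overhead figure is an assumption, everything else comes from the numbers in this thread.

/** Back-of-the-envelope sizing for the workload described above. */
public class WorkloadSizing {
    public static void main(String[] args) {
        long keys = 20_000_000L;      // "a total of 20 million keys"
        long keyBytes = 30;           // "each key is no more than 30 bytes"
        long valueBytes = 8;          // a timestamp value
        long overheadBytes = 60;      // ASSUMPTION: rough per-row storage overhead

        double rawGB = keys * (keyBytes + valueBytes + overheadBytes) / 1e9;
        double writesPerSec = 1_500_000.0 / 86_400;   // "update 1.5 million keys a day"

        System.out.printf("raw data   : ~%.1f GB%n", rawGB);                // ~2 GB
        System.out.printf("write rate : ~%.0f writes/sec%n", writesPerSec); // ~17/sec
    }
}

At roughly 2 GB of raw data and under 20 writes per second, the write path is negligible; the pressure is almost entirely random reads, which lines up with the I/O-bound diagnosis above and with the row-caching suggestion further down.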





On Mon, Jul 26, 2010 at 12:32 PM, Anthony Molinaro <anthonym@alumni.caltech.edu> wrote:
> It's usually I/O which causes backup in MESSAGE-DESERIALIZER-POOL.  You
> should check iostat and see what it looks like.  It may be that you
> need more nodes in order to deal with the read/write rate.   You can also
> use JMX to get latency values on reads and writes and see if the backup
> has a corresponding increase in latency.  You may be able to get more
> out of your hardware and memory with row caching but that really depends
> on your data set.
>
> -Anthony
>
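For the JMX route mentioned above, here is a minimal sketch of pulling a latency attribute over the same port nodetool uses. The MBean and attribute names are assumptions that differ between Cassandra versions, so confirm them in jconsole first.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ReadLatencyProbe {
    public static void main(String[] args) throws Exception {
        // Port 8181 matches the --port passed to nodetool in this thread.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8181/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // ASSUMPTION: MBean and attribute names vary by version; browse them first.
            ObjectName proxy = new ObjectName("org.apache.cassandra.service:type=StorageProxy");
            Object readLatency = mbs.getAttribute(proxy, "RecentReadLatency");
            System.out.println("recent read latency: " + readLatency);
        } finally {
            connector.close();
        }
    }
}
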
> On Mon, Jul 26, 2010 at 12:22:46PM -0700, Dathan Pattishall wrote:
>> I have 4 nodes on enterprise-type hardware (12 GB of RAM, 16 i7 cores, RAID disks).
>>
>> ~# /opt/cassandra/bin/nodetool --host=localhost --port=8181 tpstats
>> Pool Name                    Active   Pending      Completed
>> STREAM-STAGE                      0         0              0
>> RESPONSE-STAGE                    0         0         516280
>> ROW-READ-STAGE                    8      4096        1164326
>> LB-OPERATIONS                     0         0              0
>> *MESSAGE-DESERIALIZER-POOL         1    682008        1818682*
>> GMFD                              0         0           6467
>> LB-TARGET                         0         0              0
>> CONSISTENCY-MANAGER               0         0         661477
>> ROW-MUTATION-STAGE                0         0         998780
>> MESSAGE-STREAMING-POOL            0         0              0
>> LOAD-BALANCER-STAGE               0         0              0
>> FLUSH-SORTER-POOL                 0         0              0
>> MEMTABLE-POST-FLUSHER             0         0              4
>> FLUSH-WRITER-POOL                 0         0              4
>> AE-SERVICE-STAGE                  0         0              0
>> HINTED-HANDOFF-POOL               0         0              3
>>
>> EQX root@cass04:~# vmstat -n 1
>>
>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>  6 10   7096 121816  16244 10375492    0    0     1     3    0    0  5  1 94  0  0
>>  2 10   7096 116484  16248 10381144    0    0  5636     4 21210 9820  2  1 79 18  0
>>  1  9   7096 108920  16248 10387592    0    0  6216     0 21439 9878  2  1 81 16  0
>>  0  9   7096 129108  16248 10364852    0    0  6024     0 23280 8753  2  1 80 17  0
>>  2  9   7096 122460  16248 10370908    0    0  6072     0 20835 9461  2  1 83 14  0
>>  2  8   7096 115740  16260 10375752    0    0  5168   292 21049 9511  3  1 77 20  0
>>  1 10   7096 108424  16260 10382300    0    0  6244     0 21483 8981  2  1 75 22  0
>>  3  8   7096 125028  16260 10364104    0    0  5584     0 21238 9436  2  1 81 16  0
>>  3  9   7096 117928  16260 10370064    0    0  5988     0 21505 10225  2  1 77 19  0
>>  1  8   7096 109544  16260 10376640    0    0  6340    28 20840 8602  2  1 80 18  0
>>  0  9   7096 127028  16240 10357652    0    0  5984     0 20853 9158  2  1 79 18  0
>>  9  0   7096 121472  16240 10363492    0    0  5716     0 20520 8489  1  1 82 16  0
>>  3  9   7096 112668  16240 10369872    0    0  6404     0 21314 9459  2  1 84 13  0
>>  1  9   7096 127300  16236 10353440    0    0  5684     0 38914 10068  2  1 76 21  0
>>
>>
>> *But the 16 cores are hardly utilized, which indicates to me there is some
>> bad thread thrashing. But why?*
>>
>>
>>
>>   1  [|||||                                               8.3%]     Tasks: 1070 total, 1 running
>>   2  [                                                    0.0%]     Load average: 8.34 9.05 8.82
>>   3  [                                                    0.0%]     Uptime: 192 days(!), 15:29:52
>>   4  [|||||||||||                                        17.9%]
>>   5  [|||||                                               5.7%]
>>   6  [||                                                  1.3%]
>>   7  [||                                                  2.6%]
>>   8  [|                                                   0.6%]
>>   9  [|                                                   0.6%]
>>   10 [||                                                  1.9%]
>>   11 [||                                                  1.9%]
>>   12 [||                                                  1.9%]
>>   13 [||                                                  1.3%]
>>   14 [|                                                   0.6%]
>>   15 [||                                                  1.3%]
>>   16 [|                                                   0.6%]
>>   Mem[||||||||||||||||||||||||||||||||||||||||||||1791/12028MB]
>>   Swp[|                                               6/1983MB]
>>
>>   PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
>> 30269 root      40   0 14100  2116   900 R  4.0  0.0  0:00.49 htop
>> 24878 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:23.03 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24879 root      40   0 20.6G 8345M 6883M D  3.0 69.4  1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24874 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24880 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:22.93 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24875 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.17 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24658 root      40   0 20.6G 8345M 6883M D  2.0 69.4  1:23.06 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24877 root      40   0 20.6G 8345M 6883M S  2.0 69.4  1:23.43 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24873 root      40   0 20.6G 8345M 6883M D  1.0 69.4  1:23.65 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24876 root      40   0 20.6G 8345M 6883M S  1.0 69.4  1:23.62 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24942 root      40   0 20.6G 8345M 6883M S  1.0 69.4  0:23.50 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24943 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:29.53 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24933 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:22.57 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 24939 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:12.73 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>> 25280 root      40   0 20.6G 8345M 6883M S  0.0 69.4  0:00.10 /opt/java/bin/java -ea -Xms1G -Xmx7G -XX:+UseParNewGC -XX:+UseConcMark
>
> --
> ------------------------------------------------------------------------
> Anthony Molinaro                           <anthonym@alumni.caltech.edu>
>


--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

