incubator-cassandra-user mailing list archives

From Sameer Farooqui <cassandral...@gmail.com>
Subject Problems with Python Stress Test
Date Fri, 04 Feb 2011 01:02:45 GMT
Hi guys,

I was playing around with the stress.py test this week and noticed a few
things.

1) --progress-interval does not always work correctly. I set it to 5 in the
example below, but the rows are printed at varying intervals instead:

techlabs@cassandraN1:~/apache-cassandra-0.7.0-src/contrib/py_stress$ python
stress.py --num-keys=100000 --columns=5 --column-size=32 --operation=insert
--progress-interval=5 --threads=4 --nodes=170.252.179.222
Keyspace already exists.
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
6662,1332,1335,0.00307796342135,5
11607,989,988,0.00476862022199,12
20297,1738,1736,0.00273238550807,18
30631,2066,2068,0.00202261635614,24
37291,1332,1331,0.00325975901372,29
47514,2044,2044,0.00193106963725,35
56618,1820,1821,0.00276346638249,41
68652,2406,2406,0.00179436958884,47
77745,1818,1820,0.00220694060007,52
87351,1921,1918,0.00236015612201,58
97167,1963,1963,0.00230505042379,64
100000,566,566,0.00223569174853,66
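To make the drift easier to see, here is a small sketch that computes the gap between consecutive progress rows. It is not part of stress.py; the function name observed_intervals is made up for illustration, and the numbers are the elapsed_time column copied from the run above.

```python
# elapsed_time column (seconds), copied from the run above
elapsed = [5, 12, 18, 24, 29, 35, 41, 47, 52, 58, 64, 66]

def observed_intervals(elapsed_times):
    """Return the gap in seconds between consecutive progress rows."""
    previous = 0
    gaps = []
    for t in elapsed_times:
        gaps.append(t - previous)
        previous = t
    return gaps

gaps = observed_intervals(elapsed)
print(gaps)
```

Apart from the final partial row, every gap is between 5 and 7 seconds rather than a steady 5.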


2) The key_rate and op_rate don't seem to be calculated correctly. Also,
what is the difference between interval_key_rate and interval_op_rate?
In the example above, the first row shows 6662 keys inserted in 5 seconds,
and 6662 / 5 = 1332, which matches the interval_op_rate.

The second row took 7 seconds to update instead of the requested 5. However,
the interval_op_rate and interval_key_rate are being calculated based on my
requested 5 seconds instead of the actual observed 7 seconds.

(11607-6662)/5=989
(11607-6662)/7 = 706

Shouldn't it be basing the calculations off the 7 seconds?
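As a rough sketch of what I would expect, the snippet below recomputes the rate from the observed elapsed time instead of the requested interval. This is my own illustration, not the stress.py code; the helper name actual_rates is hypothetical, and the (total, elapsed_time) pairs are the first three rows of the output above.

```python
# (total, elapsed_time) pairs from the first three rows of the output above
rows = [(6662, 5), (11607, 12), (20297, 18)]

def actual_rates(rows):
    """Operations per second for each row, using the observed time delta."""
    prev_total, prev_elapsed = 0, 0
    rates = []
    for total, elapsed in rows:
        # Integer division, truncating like the figures worked out by hand above
        rates.append((total - prev_total) // (elapsed - prev_elapsed))
        prev_total, prev_elapsed = total, elapsed
    return rates

rates = actual_rates(rows)
print(rates)
```

With these inputs the second row comes out at 706 rather than the reported 989, and the third at 1448 rather than the reported 1738 (which is 8690 / 5).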


3) If I write x KB to Cassandra with py_stress, the used disk space doesn't
grow by x after the test. In the example below I tried to write 500,000 keys
* 32 bytes * 5 columns = 78,125 kilobytes of data to the database. When I
checked the disk space used after the test, it had actually grown by
2,684,920 - 2,515,864 = 169,056 kilobytes. Is this perhaps because the
commit log holds a duplicate copy of the data that is also in the SSTables?
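For reference, the arithmetic can be checked with a few lines of Python. This is only a sketch of my estimate: it assumes 32 bytes per column value and counts only the column values, not row keys, column names, or timestamps. The variable names are mine, and the before/after figures are the "Used" column of the df output for the root filesystem.

```python
num_keys = 500000
columns = 5
column_size = 32  # bytes per column value (assumed in the estimate)

# Expected payload in kilobytes (1 KB = 1024 bytes, matching df's 1K-blocks)
payload_kb = num_keys * columns * column_size / 1024.0

used_before_kb = 2515864  # df "Used" before the test
used_after_kb = 2684920   # df "Used" after the test
growth_kb = used_after_kb - used_before_kb

# How many times larger the on-disk growth is than the raw payload
ratio = growth_kb / payload_kb
print(payload_kb, growth_kb, round(ratio, 2))
```

So the disk usage grew by a bit more than twice the size of the raw column values.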

Also, notice how the progress interval got thrown off after 40 seconds.


techlabs@cassandraN1:~/apache-cassandra-0.7.0-src/contrib/py_stress$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/cassandra7rc4-root
                       7583436   2515864   4682344  35% /
none                    633244       208    633036   1% /dev
none                    640368         0    640368   0% /dev/shm
none                    640368        56    640312   1% /var/run
none                    640368         0    640368   0% /var/lock
/dev/sda1               233191     20601    200149  10% /boot

techlabs@cassandraN1:~/apache-cassandra-0.7.0-src/contrib/py_stress$ python
stress.py --num-keys=500000 --columns=5 --operation=insert
--progress-interval=5 --threads=1 --nodes=170.252.179.222
Keyspace already exists.
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
15562,3112,3112,0.000300011955333,5
31643,3216,3216,0.000290757187504,10
42968,2265,2265,0.000423845265875,15
54071,2220,2220,0.000430288759747,20
66491,2484,2484,0.000382423304897,25
79891,2680,2680,0.000351728307667,30
91758,2373,2373,0.000402696775367,35
102179,2084,2084,0.000461982612291,40
114003,2364,2364,0.000403893998092,46
126509,2501,2501,0.000379724634489,51
138047,2307,2307,0.000414365229356,56
150261,2442,2442,0.000390332772296,61
164019,2751,2751,0.000343320345113,66
175390,2274,2274,0.000421584286756,71
186564,2234,2234,0.000429319251473,76
198292,2345,2345,0.00040838057315,81
210186,2378,2378,0.000400560030882,87
225144,2991,2991,0.000314564943345,92
236474,2266,2266,0.000422214746265,97
249940,2693,2693,0.000349487200297,102
264410,2894,2894,0.00030166366303,107
275429,2203,2203,0.000464002475276,112
286430,2200,2200,0.00043832517821,117
299217,2557,2557,0.000371891478764,122
313800,2916,2916,0.000322412596002,128
325252,2290,2290,0.000417413284343,133
336031,2155,2155,0.000445155976201,138
347257,2245,2245,0.000426658924816,143
357493,2047,2047,0.000472509730556,148
372151,2931,2931,0.000321278794594,153
384655,2500,2500,0.000381667455343,158
395604,2189,2189,0.000439286896144,163
409713,2821,2821,0.000334938358759,168
423162,2689,2689,0.000351835071877,174
434276,2222,2222,0.000432009316829,179
444809,2106,2106,0.00045844612893,184
458190,2676,2676,0.000353130326037,189
470852,2532,2532,0.000374360740552,194
481333,2096,2096,0.000462788910416,199
492458,2225,2225,0.000431290422932,204
500000,1508,1508,0.000353647808408,207


techlabs@cassandraN1:~/apache-cassandra-0.7.0-src/contrib/py_stress$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/cassandra7rc4-root
                       7583436   2684920   4513288  38% /
none                    633244       208    633036   1% /dev
none                    640368         0    640368   0% /dev/shm
none                    640368        56    640312   1% /var/run
none                    640368         0    640368   0% /var/lock
/dev/sda1               233191     20601    200149  10% /boot



- Sameer
