cassandra-commits mailing list archives

From "Jon Hermes (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-1329) make multiget take a set of keys instead of a list
Date Fri, 03 Sep 2010 15:29:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936 ]

Jon Hermes edited comment on CASSANDRA-1329 at 9/3/10 11:27 AM:
----------------------------------------------------------------

Process was:
 - stress.py insert 1M rows
 - loop stress.py multiget 100k rows until values stabilized

Note: The first runs have a cold cache (I left the default 200k keys in), and 100k reads are
just enough to occasionally throw me into GC. Also, I'm only randomizing over the first 100k
block out of the 1M written, so everything should be key-cached and there are bound to be more
duplicates in the set than I planned for.
{noformat}
==PRE-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.19286876202,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.85971493721,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.95732964993,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65282338619,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.60082161903,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.15557718277,5                                           <-- GC

==POST-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.42788017273,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.03476555347,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,4.36921528816,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.69064975262,3
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.66355334282,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75016436577,5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.59240380764,4

==PRE-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.46314402103,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.97970569611,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.58520019531,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65835041046,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.71839766502,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.54171346188,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.54564589024,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.70630379677,4

==POST-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,6.26670286655,8                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.88729266167,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75670327663,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.99453821182,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.5942284441,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.67040606022,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.8302997303,5                                            <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.57685221672,3
{noformat}
Regardless of the vagaries, the numbers are still comparable, and it looks like there is no
significant difference in time to process a set versus a list.
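
For anyone who wants to check the comparison, here is a minimal sketch that averages the avg_latency column per group. The values are copied from the output above. Dropping the runs annotated as cold key cache or GC is an assumption of this sketch, not something stress.py does, and the names RUNS, steady_mean and summarize are hypothetical helpers introduced only for illustration.

```python
from statistics import mean

# (avg_latency, note) per run, copied from the output above.
# note is "cold" or "gc" for annotated runs, "" otherwise.
RUNS = {
    "PRE-PATCH1": [
        (5.19286876202, "cold"), (2.85971493721, ""), (3.95732964993, "gc"),
        (2.65282338619, ""), (2.60082161903, ""), (3.15557718277, "gc"),
    ],
    "POST-PATCH1": [
        (5.42788017273, "cold"), (3.03476555347, ""), (4.36921528816, "gc"),
        (2.69064975262, ""), (2.66355334282, ""), (2.75016436577, ""),
        (2.59240380764, ""),
    ],
    "PRE-PATCH2": [
        (5.46314402103, "cold"), (2.97970569611, ""), (3.58520019531, "gc"),
        (2.65835041046, ""), (2.71839766502, ""), (3.54171346188, "gc"),
        (2.54564589024, ""), (2.70630379677, ""),
    ],
    "POST-PATCH2": [
        (6.26670286655, "cold"), (2.88729266167, ""), (2.75670327663, ""),
        (2.99453821182, "gc"), (2.5942284441, ""), (2.67040606022, ""),
        (3.8302997303, "gc"), (2.57685221672, ""),
    ],
}


def steady_mean(runs):
    """Mean avg_latency over runs that carry no cold-cache or GC annotation."""
    return mean(latency for latency, note in runs if not note)


def summarize(all_runs):
    """Map each group label to its steady-state mean latency."""
    return {label: steady_mean(runs) for label, runs in all_runs.items()}


if __name__ == "__main__":
    for label, value in summarize(RUNS).items():
        print(f"{label}: {value:.3f}")
```

With the annotated runs excluded, all four groups average close to 2.7 (in whatever unit stress.py reports for avg_latency), which is in line with the conclusion above. The sample is small, so this is a sanity check rather than a statistical test.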

> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt
>
>
> this more correctly sets the expectation that the order of keys in that list doesn't matter, and duplicates don't make sense

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

