incubator-cassandra-user mailing list archives

From Pradeep Kumar Mantha <pradeep...@gmail.com>
Subject Re: Pycassa vs YCSB results.
Date Fri, 01 Feb 2013 00:49:23 GMT
Thanks. Please find the script attached.

Just to reiterate:
It's a simple Python script that starts 4 threads.
One instance of the script is pinned to each of 8 cores using the Unix
taskset command, giving 32 threads per node,
and this is then scaled to 16 nodes.
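
For readers without the attachment, here is a minimal sketch of the structure
described above. It is not the attached script: fake_read is a hypothetical
stand-in for the pycassa read call, and the function and constant names are
assumptions made for illustration. Only the counts (4 threads, 76896 queries
per thread, 8 cores, 16 nodes) come from this thread.

```python
import threading
import time

THREADS_PER_PROCESS = 4      # each script instance starts 4 threads
QUERIES_PER_THREAD = 76896   # queries issued by each thread
CORES_PER_NODE = 8           # one script instance pinned per core
CLIENT_NODES = 16            # number of client nodes


def fake_read(key):
    """Hypothetical stand-in for a pycassa read; returns a dummy row."""
    return {"key": key}


def worker(thread_id, n_queries, counts):
    """Issue n_queries reads and record how many completed."""
    done = 0
    for i in range(n_queries):
        fake_read("user%d" % i)
        done += 1
    counts[thread_id] = done


def run_process(n_threads=THREADS_PER_PROCESS, n_queries=QUERIES_PER_THREAD):
    """Start n_threads workers, wait for them, return (reads, seconds)."""
    counts = [0] * n_threads
    threads = [
        threading.Thread(target=worker, args=(t, n_queries, counts))
        for t in range(n_threads)
    ]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts), time.time() - start


def total_queries():
    """Total reads across all nodes, cores, and threads."""
    return (CLIENT_NODES * CORES_PER_NODE
            * THREADS_PER_PROCESS * QUERIES_PER_THREAD)


if __name__ == "__main__":
    reads, seconds = run_process(n_queries=1000)
    print("reads: %d in %.3fs" % (reads, seconds))
    print("threads per node: %d" % (CORES_PER_NODE * THREADS_PER_PROCESS))
    print("total queries: %d" % total_queries())
```

Each instance would be pinned to one core with something like
"taskset -c <core> python <script>", where the script name is whatever the
attachment is saved as.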

thanks
pradeep

On Thu, Jan 31, 2013 at 4:38 PM, Tyler Hobbs <tyler@datastax.com> wrote:

> Can you provide the python script that you're using?
>
> (I'm moving this thread to the pycassa mailing list (
> pycassa-discuss@googlegroups.com), which is a better place for this
> discussion.)
>
>
> On Thu, Jan 31, 2013 at 6:25 PM, Pradeep Kumar Mantha <
> pradeepm66@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to benchmark Cassandra on a 12-data-node cluster with 16
>> clients (each client uses 32 threads), using a custom pycassa client and YCSB.
>>
>> The maximum throughput I achieved with the pycassa client is roughly 70k
>> reads/second, whereas with YCSB it is ~120k reads/second.
>>
>> Any thoughts on why I see this huge difference in performance?
>>
>>
>> Here is a description of the setup.
>>
>> Pycassa client (a simple Python script):
>> 1. Each pycassa client starts 4 threads, where each thread issues 76896
>> queries.
>> 2. A shell script uses the taskset Unix command to launch one client
>> (4 threads) per core on a single 8-core node (8 * 4 * 76896 queries).
>> 3. Another shell script scales the single-node shell script to
>> 16 nodes (total queries: 16 * 8 * 4 * 76896).
>>
>> I tried to keep the YCSB configuration as similar as possible to my custom
>> pycassa benchmarking setup.
>>
>> YCSB:
>>
>> I launched 16 YCSB clients on 16 nodes, where each client uses 32 threads
>> and queries 32 * 76896 keys, i.e. 100% reads.
>>
>> The dataset is different in each case, but has:
>>
>> 1. The same total number of records.
>> 2. The same number of fields.
>> 3. Almost the same field length.
>>
>> Could you please let me know why I see this huge performance difference,
>> and whether there is any way I can improve the operations/second using the
>> pycassa client?
>>
>> thanks
>> pradeep
>>
>>
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
