cassandra-user mailing list archives

From Arya Goudarzi <agouda...@gaiaonline.com>
Subject Strange Read Performance: 1xN column slice or N column slice
Date Fri, 04 Jun 2010 22:52:20 GMT

Hi Fellows, 

I have the following design for a system that basically holds key->value pairs (aka Columns)
for each user (SuperColumn key) in different namespaces (SuperColumnFamily row key).

Like this:

Namespace->user->column_name = column_value;

keyspaces:
    - name: NKVP
      replica_placement_strategy: org.apache.cassandra.locator.RackUnawareStrategy
      replication_factor: 3
      column_families:
        - name: Namespaces
          column_type: Super
          compare_with: BytesType
          compare_subcolumns_with: BytesType
          rows_cached: 20000
          keys_cached: 100

The cluster uses the random partitioner.

I use multiget_slice() to fetch one or many columns inside the child supercolumn at the
same time. These are the odd performance results I get:

100 sequential reads completed in : 0.383 (this uses multiget_slice() with 1 key and 1 column
name inside predicate->column_names)
100 batch loaded completed in : 0.786 (this uses multiget_slice() with 1 key and multiple
column names inside predicate->column_names)
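
For anyone trying to reproduce the comparison, here is a minimal sketch of how the two measurements could be structured. It does not talk to Cassandra: fetch_columns() is a hypothetical in-memory stand-in for a multiget_slice() call with one key and a column_names predicate, and STORE, sequential_reads(), batch_read() and timed() are names made up for illustration. It also assumes that "batch" means a single call carrying all 100 column names, so the printed timings say nothing about a real cluster.

```python
import time

# Hypothetical in-memory stand-in for the super column family:
# namespace (row key) -> user (super column) -> {column_name: column_value}
STORE = {
    "ns1": {"user1": {f"col{i}": f"val{i}" for i in range(100)}},
}


def fetch_columns(namespace, user, column_names):
    """Stand-in for one multiget_slice() call: 1 key, N column names."""
    columns = STORE[namespace][user]
    return {name: columns[name] for name in column_names if name in columns}


def sequential_reads(namespace, user, column_names):
    """Pattern 1: one call per column, each with a single-name predicate."""
    result = {}
    for name in column_names:
        result.update(fetch_columns(namespace, user, [name]))
    return result


def batch_read(namespace, user, column_names):
    """Pattern 2: a single call with every column name in the predicate."""
    return fetch_columns(namespace, user, list(column_names))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    value = fn(*args)
    return value, time.perf_counter() - start


if __name__ == "__main__":
    names = [f"col{i}" for i in range(100)]
    _, seq_t = timed(sequential_reads, "ns1", "user1", names)
    _, bat_t = timed(batch_read, "ns1", "user1", names)
    print(f"100 sequential reads completed in : {seq_t:.6f}")
    print(f"100 batch loaded completed in : {bat_t:.6f}")
```

Swapping the body of fetch_columns() for a real client call would keep the two patterns and the timing code identical, which makes it easier to see whether the difference comes from the server or from the client side.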

Read and write consistency levels are both ONE.

Questions: 

Why are 100 sequential reads faster than reading 100 in a batch?
Is this a good design for my problem? 
Does my issue relate to https://issues.apache.org/jira/browse/CASSANDRA-598? 

Now on a single node with replication factor 1 I get this: 

100 sequential reads completed in : 0.438 
100 batch loaded completed in : 0.800 

Please advise as to why this is happening.

These nodes are VMs with 1 CPU and 1 GB each.

Best Regards, 
=Arya 







