incubator-cassandra-user mailing list archives

From Arya Goudarzi <agouda...@gaiaonline.com>
Subject Re: Strange Read Performance 1xN column slice or N column slice
Date Mon, 07 Jun 2010 18:23:31 GMT
But I am not comparing reading 1 column vs 100 columns. I am comparing reading 100 columns
one at a time in a loop (100 consecutive calls) vs reading all 100 in a single batch call.
The loop is faster than the batch call. Are you saying this is not surprising?
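
For reference, the comparison described above can be sketched as a small timing harness. This is only an illustration under stated assumptions: `fetch` is a hypothetical stand-in for a call such as multiget_slice() with 1 key and the given column names, and `make_fake_fetch` is an in-memory stub, not a real Cassandra client, so the timings it prints say nothing about a real cluster.

```python
import time
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical signature: takes column names, returns {column_name: value}.
# In the real test this would wrap one multiget_slice() call for a single key.
Fetch = Callable[[Sequence[str]], Dict[str, bytes]]


def time_sequential(fetch: Fetch, names: Sequence[str]) -> Tuple[Dict[str, bytes], float]:
    """Issue one call per column name (N calls) and time the whole loop."""
    start = time.perf_counter()
    result: Dict[str, bytes] = {}
    for name in names:
        result.update(fetch([name]))
    return result, time.perf_counter() - start


def time_batch(fetch: Fetch, names: Sequence[str]) -> Tuple[Dict[str, bytes], float]:
    """Issue a single call carrying all column names and time it."""
    start = time.perf_counter()
    result = dict(fetch(list(names)))
    return result, time.perf_counter() - start


def make_fake_fetch(store: Dict[str, bytes]) -> Fetch:
    """In-memory stub (assumption): replace with a real client call to measure."""
    def fetch(names: Sequence[str]) -> Dict[str, bytes]:
        return {n: store[n] for n in names if n in store}
    return fetch


if __name__ == "__main__":
    store = {f"col{i:03d}": str(i).encode() for i in range(100)}
    fetch = make_fake_fetch(store)
    names = sorted(store)

    seq_result, seq_t = time_sequential(fetch, names)
    batch_result, batch_t = time_batch(fetch, names)

    # Both strategies must return the same columns for the comparison to be fair.
    assert seq_result == batch_result
    print(f"sequential: {seq_t:.6f}s, batch: {batch_t:.6f}s")
```

Timing both paths against the same key and the same column names, and checking that they return identical data, rules out the case where one path is silently fetching less.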

----- Original Message -----
From: "Jonathan Ellis" <jbellis@gmail.com>
To: user@cassandra.apache.org
Sent: Saturday, June 5, 2010 6:26:46 AM
Subject: Re: Strange Read Performance 1xN column slice or N column slice

Reading 1 column is faster than reading lots of columns. This
shouldn't be surprising.

On Fri, Jun 4, 2010 at 3:52 PM, Arya Goudarzi <agoudarzi@gaiaonline.com>
wrote:
> Hi Fellows,
>
> I have the following design for a system that basically holds
> key->value pairs (aka Columns) for each user (SuperColumn key) in
> different namespaces (SuperColumnFamily row key).
>
> Like this:
>
> Namespace->user->column_name = column_value;
>
> keyspaces:
>     - name: NKVP
>       replica_placement_strategy: org.apache.cassandra.locator.RackUnawareStrategy
>       replication_factor: 3
>       column_families:
>         - name: Namespaces
>           column_type: Super
>           compare_with: BytesType
>           compare_subcolumns_with: BytesType
>           rows_cached: 20000
>           keys_cached: 100
>
> The cluster uses the random partitioner.
>
> I use multiget_slice() to fetch 1 or many columns inside the child
> supercolumn at the same time. This is the odd performance result I
> get:
>
> 100 sequential reads completed in: 0.383 (this uses multiget_slice()
> with 1 key and 1 column name inside predicate->column_names)
> 100 batch-loaded reads completed in: 0.786 (this uses multiget_slice()
> with 1 key and multiple column names inside predicate->column_names)
>
> Read and write consistency levels are both ONE.
>
> Questions:
>
> Why is doing 100 sequential reads faster than doing 100 in batch?
> Is this a good design for my problem?
> Does my issue relate to
> https://issues.apache.org/jira/browse/CASSANDRA-598?
>
> Now on a single node with replication factor 1 I get this:
>
> 100 sequential reads completed in: 0.438
> 100 batch-loaded reads completed in: 0.800
>
> Please advise as to why this is happening.
>
> These nodes are VMs with 1 CPU and 1 GB each.
>
> Best Regards,
> =Arya



-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
