cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Strage Read Perfoamnce 1xN column slice or N column slice
Date Tue, 08 Jun 2010 02:26:30 GMT
That would be surprising (and it is not what you said in the first
message).  I suspect something is wrong with your test methodology.
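
One way to rule out methodology problems is to time both access patterns with the same harness: warm up first, repeat each measurement, and check that both paths return the same data. The following is a minimal sketch, not the test used in this thread. `fake_fetch` and `_FAKE_ROW` are hypothetical in-memory stand-ins for the real client call (for example a wrapper around multiget_slice()), and `time_call` and `compare` are illustrative helper names.

```python
import statistics
import time


def time_call(fn, repeats=5, warmup=1):
    """Return the median wall-clock seconds of fn() over `repeats` runs."""
    for _ in range(warmup):
        fn()  # warm caches so the first measured run is not penalized
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(fetch, column_names, repeats=5):
    """Time N one-column calls against a single N-column call."""

    def sequential():
        return [fetch([name]) for name in column_names]

    def batch():
        return fetch(list(column_names))

    # Make sure both paths return the same data before trusting the timings.
    merged = {}
    for part in sequential():
        merged.update(part)
    assert merged == batch(), "sequential and batch results differ"

    return {
        "sequential": time_call(sequential, repeats),
        "batch": time_call(batch, repeats),
    }


# Hypothetical stand-in for the real read; swap in the actual client call.
_FAKE_ROW = {"col%03d" % i: "value%d" % i for i in range(100)}


def fake_fetch(names):
    return {n: _FAKE_ROW[n] for n in names}


result = compare(fake_fetch, sorted(_FAKE_ROW))
print(result)
```

Running the two variants in alternating order and on a cold versus warm row cache would also help show whether caching, rather than the call shape, explains the difference.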

On Mon, Jun 7, 2010 at 11:23 AM, Arya Goudarzi <agoudarzi@gaiaonline.com> wrote:
> But I am not comparing reading 1 column vs 100 columns. I am comparing reading 100
> columns in loop iterations (100 consecutive calls) vs reading all 100 in batch in one call.
> Doing the loop is faster than doing the batch call. Are you saying this is not surprising?
>
> ----- Original Message -----
> From: "Jonathan Ellis" <jbellis@gmail.com>
> To: user@cassandra.apache.org
> Sent: Saturday, June 5, 2010 6:26:46 AM
> Subject: Re: Strage Read Perfoamnce 1xN column slice or N column slice
>
> Reading 1 column is faster than reading lots of columns. This
> shouldn't be surprising.
>
> On Fri, Jun 4, 2010 at 3:52 PM, Arya Goudarzi <agoudarzi@gaiaonline.com>
> wrote:
>> Hi Fellows,
>>
>> I have the following design for a system that basically holds
>> key->value pairs (aka Columns) for each user (SuperColumn key) in
>> different namespaces (SuperColumnFamily row key).
>>
>> Like this:
>>
>> Namespace->user->column_name = column_value;
>>
>> keyspaces:
>>     - name: NKVP
>>       replica_placement_strategy: org.apache.cassandra.locator.RackUnawareStrategy
>>       replication_factor: 3
>>       column_families:
>>         - name: Namespaces
>>           column_type: Super
>>           compare_with: BytesType
>>           compare_subcolumns_with: BytesType
>>           rows_cached: 20000
>>           keys_cached: 100
>>
>> The cluster uses the random partitioner.
>>
>> I use multiget_slice() to fetch one or many columns inside the child
>> supercolumn at the same time. This is the odd performance result I
>> get:
>>
>> 100 sequential reads completed in : 0.383 (each call uses multiget_slice()
>> with 1 key and 1 column name in predicate->column_names)
>> 100 batch loaded completed in : 0.786 (one multiget_slice() call with
>> 1 key and multiple column names in predicate->column_names)
>>
>> Read and write consistency levels are both ONE.
>>
>> Questions:
>>
>> Why is doing 100 sequential reads faster than reading 100 columns in one batch?
>> Is this a good design for my problem?
>> Does my issue relate to
>> https://issues.apache.org/jira/browse/CASSANDRA-598?
>>
>> Now on a single node with replication factor 1 I get this:
>>
>> 100 sequential reads completed in : 0.438
>> 100 batch loaded completed in : 0.800
>>
>> Please advise as to why this is happening.
>>
>> These nodes are VMs with 1 CPU and 1 GB each.
>>
>> Best Regards,
>> =Arya
>>
>
> -- Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
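
For reference, the figures quoted in the original message can be reduced to per-column costs and batch-to-loop ratios. This is a small sketch of that arithmetic only. The unit of the timings is not stated in the thread, so the results are in the same unspecified unit, and the dictionary keys are labels chosen here for illustration.

```python
# Timings as quoted in the original message (unit not stated in the thread).
COLUMNS = 100
timings = {
    "3 replicas": {"sequential": 0.383, "batch": 0.786},
    "single node": {"sequential": 0.438, "batch": 0.800},
}

summary = {}
for setup, t in timings.items():
    summary[setup] = {
        # Average cost of one of the 100 single-column calls.
        "per_call_sequential": t["sequential"] / COLUMNS,
        # Cost of the single batch call spread over its 100 columns.
        "per_column_batch": t["batch"] / COLUMNS,
        # How many times slower the batch call was than the whole loop.
        "batch_over_sequential": t["batch"] / t["sequential"],
    }

for setup, row in summary.items():
    print(setup, row)
```

In both setups the single batch call took roughly twice as long as the entire loop, which is the result the reply above calls surprising.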



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
