cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Cassandra Secondary index/Twissandra
Date Sun, 10 Jul 2011 17:31:24 GMT
> Can you recommend a better way of doing that or a way to tune Cassandra to support
> those 2 CFs?
A slice query with no start or finish column name, with a column count, and not in reversed
order is about the fastest read query.

To get the most recent columns first you will need to do a reversed query, which will be a
little slower. It may still be plenty fast, depending on scale, throughput and all those
other things. See http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
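
To illustrate the two query shapes, here is a toy model in plain Python. It is not the
Cassandra or Thrift API: the function slice_columns, its parameters and the sample timeline
row are hypothetical names made up for this sketch. It only mimics the semantics of a slice
with an optional start/finish, a column count and a reversed flag over columns kept sorted
by name, and it says nothing about the relative speed of the two reads.

```python
from bisect import bisect_left, bisect_right


def slice_columns(row, start=None, finish=None, count=100, reversed_=False):
    """Toy column slice over a row stored as a name-sorted list of (name, value) pairs.

    Hypothetical helper for illustration only; not a Cassandra client call.
    """
    names = [name for name, _ in row]
    if not reversed_:
        # Forward slice: start is the lower bound, finish the upper bound.
        lo = 0 if start is None else bisect_left(names, start)
        hi = len(row) if finish is None else bisect_right(names, finish)
        return row[lo:hi][:count]
    # Reversed slice: start is the upper bound, finish the lower bound,
    # and columns come back in descending name order.
    hi = len(row) if start is None else bisect_right(names, start)
    lo = 0 if finish is None else bisect_left(names, finish)
    return row[lo:hi][::-1][:count]


# A made-up timeline row whose column names are integer timestamps.
timeline = [(ts, "tweet-%d" % ts) for ts in range(1000, 1010)]

# No start, no finish, a count, not reversed: the oldest columns.
oldest_three = slice_columns(timeline, count=3)

# Same query but reversed: the newest columns, newest first.
newest_three = slice_columns(timeline, count=3, reversed_=True)

# Reversed with bounds: start must be the larger name.
bounded = slice_columns(timeline, start=1005, finish=1003, count=10, reversed_=True)
```

For a timeline where the newest tweets are wanted first, the reversed form with a count is
the one that matches the slice_range described in the question.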

Cheers


-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 10 Jul 2011, at 00:14, Eldad Yamin wrote:

> Aaron - Thank you for the fast response!
> 
>> Does performance decrease (significantly) if the uniqueness of the column’s name
>> is high when comparator is LONG_TYPE/TimeUUID and each row has lots of columns?
> 
> > Depends on what sort of operations you are doing. Some read operations have to pay
> > a constant cost to decode the row-level column index, though this can be tuned. AFAIK the
> > comparator type has very little to do with performance.
> 
> In Twissandra, the columns are used as an "alternative" index for the Userline/Timeline,
> so the operation I'm going to do is slice_range.
> I'm going to get (for example) the first 50 columns (using a comparator of TimeUUID/LONG).
> Can you recommend a better way of doing that or a way to tune Cassandra to support
> those 2 CFs?
> 
> 
> Thanks!
> 
> On Sun, Jul 10, 2011 at 3:26 AM, aaron morton <aaron@thelastpickle.com> wrote:
>> Is there a limit on the number of columns in a single column family that serve as
>> secondary indexes?
> 
> AFAIK there is no coded limit. However, every index is implemented as another (hidden)
> Column Family that inherits the settings of the parent CF, so under 0.7 you may run out of
> memory and under 0.8 you may flush a lot. Also, when an indexed column is updated there are
> potentially 3 operations that have to happen: read the old value, delete the old value, write
> the new value. More indexes == more index updating, just like any other database.
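
The three steps can be sketched with a toy model in plain Python. This is only an
illustration under stated assumptions, not how Cassandra implements its indexes: the class
IndexedColumnFamily, its methods and the index_ops counter are hypothetical names, and the
hidden index CF is modelled as a dict from indexed value to the set of row keys.

```python
class IndexedColumnFamily:
    """Toy model: a data CF plus one hidden index per indexed column (illustration only)."""

    def __init__(self, indexed_columns):
        self.rows = {}  # row key -> {column name: value}
        # One hidden "index CF" per indexed column: value -> set of row keys.
        self.indexes = {col: {} for col in indexed_columns}
        self.index_ops = 0  # counts index maintenance steps

    def insert(self, key, column, value):
        if column in self.indexes:
            index = self.indexes[column]
            # 1. Read the old value of the indexed column.
            old = self.rows.get(key, {}).get(column)
            self.index_ops += 1
            if old is not None:
                # 2. Delete the old index entry.
                index[old].discard(key)
                self.index_ops += 1
            # 3. Write the new index entry.
            index.setdefault(value, set()).add(key)
            self.index_ops += 1
        self.rows.setdefault(key, {})[column] = value

    def get_indexed(self, column, value):
        """Return the row keys whose indexed column currently has this value."""
        return sorted(self.indexes[column].get(value, set()))


users = IndexedColumnFamily(["state"])
users.insert("u1", "state", "CA")  # first write: read + write
users.insert("u2", "state", "CA")  # first write: read + write
users.insert("u1", "state", "NY")  # update: read + delete + write
users.insert("u1", "name", "Ann")  # not indexed: no index work
```

Every additional indexed column repeats this work on each write to that column, which is
the "more indexes == more index updating" cost.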
>> Does performance decrease (significantly) if the uniqueness of the column’s values
>> is high?
> Low cardinality is recommended
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Secondary-indices-Why-low-cardinality-td6160509.html
> 
>> The CF for "Userline"/"Timeline" - have comparator of "LONG_TYPE" and not TimeUUID?
> 
> Probably just to make the demo easier. The long value is used to order tweets in the user
> and public timelines by the current time.
> https://github.com/twissandra/twissandra/blob/master/cass.py#L204
> 
>> Does performance decrease (significantly) if the uniqueness of the column’s name
>> is high when comparator is LONG_TYPE/TimeUUID and each row has lots of columns?
> 
> Depends on what sort of operations you are doing. Some read operations have to pay a
> constant cost to decode the row-level column index, though this can be tuned. AFAIK the
> comparator type has very little to do with performance.
> 
> Hope that helps. 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 9 Jul 2011, at 12:15, Eldad Yamin wrote:
> 
>> Hi,
>> I have a few questions:
>> 
>> Secondary index
>> Is there a limit on the number of columns in a single column family that serve as
>> secondary indexes?
>> Does performance decrease (significantly) if the uniqueness of the column’s values
>> is high?
>> 
>> Twissandra
>> Why, in the source (or any tutorial I've read), does
>> the CF for "Userline"/"Timeline" have a comparator of "LONG_TYPE" and not TimeUUID?
>> https://github.com/twissandra/twissandra/blob/master/tweets/management/commands/sync_cassandra.py
>> Does performance decrease (significantly) if the uniqueness of the column’s name
>> is high when comparator is LONG_TYPE/TimeUUID and each row has lots of columns?
>> 
>> Thanks!
>> Eldad
> 
> 

