incubator-cassandra-user mailing list archives

From Janne Jalkanen <Janne.Jalka...@ecyrd.com>
Subject Re: Minor question on index design
Date Wed, 15 Sep 2010 18:56:16 GMT

Ok, thanks.  I'm going with Option 1, and I'll try to steer away from
SuperColumns. That also gives me the option to tune the caches
depending on the usage pattern (the User CF will be accessed in a lot of
different ways, not just in relation to Objects).

/Janne

On Sep 14, 2010, at 23:46 , Aaron Morton wrote:

> I've been doing option 1 under 0.6. As usual in Cassandra, though, a
> lot depends on how you access the data.
>
> - If you often want to get the user and all of the objects they  
> have, use option 2. It's easier to have one read from one CF to  
> answer your query.
> - If the user has potentially >10k objects go with option 2. AFAIK
> large super columns are still inefficient:
>   https://issues.apache.org/jira/browse/CASSANDRA-674
>   https://issues.apache.org/jira/browse/CASSANDRA-598
> - In your OwnerIndex CF consider making the column name something  
> meaningful such as the Object Name or Timestamp (if it has one) so  
> you can slice against it, e.g. to support paging operations. Make  
> the column value the key for the object.
>
> Aaron
>
>
> On 15 Sep, 2010, at 02:41 AM, Janne Jalkanen
> <janne.jalkanen@ecyrd.com> wrote:
>
>> Hi all!
>>
>> I'm pondering between a couple of alternatives here: I've got two
>> CFs, one which contains Objects, and one which contains Users. Now,
>> each Object has an owner associated with it, so obviously I need some
>> sort of an index to point from Users to Objects. This would of
>> course be the perfect use case for secondary indices on 0.7, but I'm
>> still on 0.6.x.
>>
>> So, esteemed Cassandra-heads, I'm pondering what would be a better  
>> design here:
>>
>> 1) I can create a separate CF "OwnerIdx" which has user IDs as
>> keys, and then each of the columns points at an object (with a
>> dummy value, since I just need a list). This would add a new CF,
>> but on the other hand, this would be easy to drop once 0.7 comes
>> along and I can just make an index query to the Objects CF, OR
>>
>> 2) Put the index inside the Users CF, with "object:<id>" for column
>> name and a dummy value, and then get slices as necessary? This
>> would mean fewer CFs (and hence no schema modification), but might
>> mean that I have to clean it up at some point.
>>
>> I don't yet have a lot of CFs, so I'm not really worried about memory
>> consumption. The Users CF is very read-heavy as-is, but the
>> index and Objects will be a bit more balanced.
>>
>> Experiences? Recommendations? Tips? Other possibilities? What other  
>> considerations should I take into account?
>>
>> /Janne
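
For readers comparing the two layouts discussed in this thread, here is a minimal in-memory sketch, not code from either poster. It models a column family as a plain dict (row key -> {column name: value}) and does not call any Cassandra client API. The function names, the timestamp-style column names, and the object keys are all hypothetical. The slice helper treats the start column as exclusive for simplicity; a real slice call may treat it as inclusive, in which case paging code would need to skip the repeated first column.

```python
from bisect import bisect_right

# Hypothetical stand-in for a column family: row key -> {column name: value}.
# Columns are sorted by name on read to mimic name-ordered storage.


def add_option1(owner_idx, user_id, column_name, object_key):
    """Option 1: separate OwnerIdx CF, one row per user.

    Following the suggestion in the thread, the column name is something
    sliceable (e.g. a timestamp) and the column value is the object's key.
    """
    owner_idx.setdefault(user_id, {})[column_name] = object_key


def add_option2(users, user_id, object_id):
    """Option 2: an "object:<id>" column with a dummy value in the Users row."""
    users.setdefault(user_id, {})["object:" + object_id] = ""


def objects_option2(users, user_id):
    """Return the object ids stored as "object:<id>" columns of a Users row."""
    prefix = "object:"
    row = users.get(user_id, {})
    return [name[len(prefix):] for name in sorted(row) if name.startswith(prefix)]


def slice_columns(row, start, count):
    """Return up to `count` (name, value) pairs whose name sorts after `start`."""
    names = sorted(row)
    first = bisect_right(names, start)  # exclusive start (an assumption)
    return [(name, row[name]) for name in names[first:first + count]]


def pages(row, page_size):
    """Yield successive pages of columns, resuming after the last name seen."""
    start = ""
    while True:
        page = slice_columns(row, start, page_size)
        if not page:
            return
        yield page
        start = page[-1][0]


if __name__ == "__main__":
    owner_idx = {}
    add_option1(owner_idx, "janne", "2010-09-14T10:00", "obj-a")
    add_option1(owner_idx, "janne", "2010-09-14T11:00", "obj-b")
    add_option1(owner_idx, "janne", "2010-09-15T09:00", "obj-c")
    for page in pages(owner_idx["janne"], 2):
        print(page)
```

With Option 1 the index row holds only index columns, so it can be paged or dropped without touching user data. With Option 2 the index columns share a row with the user's other columns and have to be picked out by their "object:" prefix.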

