cassandra-user mailing list archives

From Aaron Morton <>
Subject Re: Minor question on index design
Date Tue, 14 Sep 2010 20:46:18 GMT
I've been doing option 1 under 0.6. As usual in Cassandra, though, a lot depends on how you
access the data.

- If you often want to get the user and all of the objects they have, use option 2. It's easier
to have one read from one CF to answer your query. 
- If the user has potentially >10k objects, go with option 2. AFAIK large super columns
are still inefficient.
- In your OwnerIdx CF, consider making the column name something meaningful, such as the
object's name or timestamp (if it has one), so you can slice against it, e.g. to support
paging operations. Make the column value the key for the object.
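
To make the last point concrete, here is a rough sketch of paging over an index row by column name. It does not talk to Cassandra at all: `OwnerIndexRow`, `get_slice` and `page_through` are hypothetical names for an in-memory stand-in that only mimics the fact that columns in a row are kept sorted by name. The idea is that the last column name of one page becomes the slice start of the next.

```python
import bisect


class OwnerIndexRow:
    """In-memory stand-in (hypothetical) for one row of an owner index CF.

    Column names are kept sorted, which is what makes slicing possible.
    """

    def __init__(self):
        self._names = []    # sorted column names, e.g. timestamps or object names
        self._values = {}   # column name -> key of the object in the Objects CF

    def insert(self, column_name, object_key):
        if column_name not in self._values:
            bisect.insort(self._names, column_name)
        self._values[column_name] = object_key

    def get_slice(self, start="", count=100):
        """Return up to `count` (name, value) pairs whose name is >= start."""
        i = bisect.bisect_left(self._names, start)
        return [(n, self._values[n]) for n in self._names[i:i + count]]


def page_through(row, page_size):
    """Yield pages of (column_name, object_key) pairs.

    The slice start is inclusive, so every page after the first asks for
    one extra column and drops the one already seen.
    """
    start = ""
    first = True
    while True:
        cols = row.get_slice(start, page_size if first else page_size + 1)
        if not first:
            cols = cols[1:]
        if not cols:
            return
        yield cols
        start = cols[-1][0]
        first = False
```

With a dummy value and an opaque id as the column name you can still page this way, but a name or timestamp gives the pages an order that means something to the user.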


On 15 Sep 2010, at 02:41 AM, Janne Jalkanen <> wrote:

Hi all!

I'm pondering between a couple of alternatives here: I've got two CFs, one which contains
Objects, and one which contains Users. Now, each Object has an owner associated with it, so
obviously I need some sort of an index to point from Users to Objects. This would of course
be the perfect use case for secondary indices in 0.7, but I'm still on 0.6.x.

So, esteemed Cassandra-heads, I'm pondering what would be a better design here:

1) I can create a separate CF "OwnerIdx" which has user IDs as keys, and then each of the
columns points at an object (with a dummy value, since I just need a list). This would add
a new CF, but on the other hand, this would be easy to drop once 0.7 comes along and I can
just make an index query to the Objects CF, OR

2) Put the index inside the Users CF, with "object:<id>" for column name and a dummy
value, and then get slices as necessary. This would mean fewer CFs (and hence no schema
modification), but might mean that I have to clean it up at some point.
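
For illustration only, the two layouts could be pictured roughly as below, with plain dicts standing in for CFs. The user id, object ids and the `email` column are made up, and the helper names are hypothetical; the only thing taken from the description above is the "object:<id>" column naming and the dummy values.

```python
# Option 1: separate index CF, row key = user id, column name = object id.
owner_idx = {
    "user42": {"obj-1": "", "obj-2": ""},
}

# Option 2: index columns live in the Users row next to the regular columns.
users = {
    "user42": {
        "email": "someone@example.invalid",
        "object:obj-1": "",
        "object:obj-2": "",
    },
}


def objects_option1(owner_idx, user_id):
    """Every column in the index row is an object id."""
    return sorted(owner_idx.get(user_id, {}))


def objects_option2(users, user_id, prefix="object:"):
    """Only the columns carrying the prefix belong to the index."""
    row = users.get(user_id, {})
    return sorted(name[len(prefix):] for name in row if name.startswith(prefix))
```

Both helpers return the same list of object ids for the sample data. The difference is that option 2 has to separate index columns from ordinary user columns on every read, while option 1 costs an extra CF.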

I don't yet have a lot of CFs, so I'm not worried about mem consumption really. The Users
CF is very read-heavy as-is, but the index and Objects will be a bit more balanced.

Experiences? Recommendations? Tips? Other possibilities? What other considerations should
I take into account?
