I've been doing option 1 under 0.6. As usual in Cassandra, though, a lot depends on how you access the data.
- If you often want to get the user and all of the objects they have, use option 2. It's easier to have one read from one CF to answer your query.
- In your OwnerIndex CF consider making the column name something meaningful such as the Object Name or Timestamp (if it has one) so you can slice against it, e.g. to support paging operations. Make the column value the key for the object.
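To make the paging idea concrete, here is a rough sketch in Python. It is only an in-memory stand-in for the index CF, not Cassandra client code: the `OwnerIndex` class, its `insert` and `get_slice` methods, and the `pages` helper are all hypothetical names made up for illustration. It assumes columns within a row are kept sorted by name and that a slice start is inclusive, with an empty start meaning "from the first column".

```python
from bisect import bisect_left


class OwnerIndex:
    """In-memory stand-in for the index CF (hypothetical, not a Cassandra API).

    Row key = user id; each column is (column_name, object_key), and the
    columns of a row are kept sorted by column_name.
    """

    def __init__(self):
        self._rows = {}  # user_id -> sorted list of (column_name, object_key)

    def insert(self, user_id, column_name, object_key):
        cols = self._rows.setdefault(user_id, [])
        names = [name for name, _ in cols]
        i = bisect_left(names, column_name)
        if i < len(cols) and cols[i][0] == column_name:
            cols[i] = (column_name, object_key)  # overwrite existing column
        else:
            cols.insert(i, (column_name, object_key))

    def get_slice(self, user_id, start="", count=100):
        """Return up to `count` columns whose name is >= `start`."""
        cols = self._rows.get(user_id, [])
        names = [name for name, _ in cols]
        i = bisect_left(names, start)
        return cols[i:i + count]


def pages(index, user_id, page_size):
    """Yield pages of columns, using the column name as the paging cursor."""
    start = ""
    while True:
        # Fetch one extra column; its name becomes the start of the next page.
        chunk = index.get_slice(user_id, start, page_size + 1)
        if not chunk:
            return
        yield chunk[:page_size]
        if len(chunk) <= page_size:
            return
        start = chunk[page_size][0]


index = OwnerIndex()
index.insert("janne", "banana", "obj-1")
index.insert("janne", "cherry", "obj-2")
index.insert("janne", "apple", "obj-3")
result = list(pages(index, "janne", 2))
```

The column name (here an object name) does the sorting and serves as the cursor, while the column value carries the key you then use to read from the Objects CF. With a dummy value and an opaque id as the column name you can still list everything, but the page order means nothing to the user.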
On 15 Sep, 2010, at 02:41 AM, Janne Jalkanen <firstname.lastname@example.org> wrote:
I'm pondering between a couple of alternatives here: I've got two CFs, one which contains Objects, and one which contains Users. Now, each Object has an owner associated with it, so obviously I need some sort of an index to point from Users to Objects. This would of course be the perfect use case for secondary indices on 0.7, but I'm still on 0.6.x.
So, esteemed Cassandra-heads, I'm pondering what would be a better design here:
1) I can create a separate CF "OwnerIdx" which has user IDs as keys, and then each of the columns points at an object (with a dummy value, since I just need a list). This would add a new CF, but on the other hand, this would be easy to drop once 0.7 comes along and I can just make an index query to the Objects CF, OR
2) Put the index inside the Users CF, with "object:<id>" as the column name and a dummy value, and then get slices as necessary. This would mean fewer CFs (and hence no schema modification), but might mean that I have to clean it up at some point.
I don't yet have a lot of CFs, so I'm not worried about mem consumption really. The Users CF is very read-heavy as-is, but the index and Objects will be a bit more balanced.
Experiences? Recommendations? Tips? Other possibilities? What other considerations should I take into account?