cassandra-user mailing list archives

From gabriele renzi <>
Subject Re: Modeling question
Date Thu, 26 Nov 2009 21:34:26 GMT
On Thu, Nov 26, 2009 at 7:12 PM, Anthony Molinaro
<> wrote:

> Unless you are using order preserving partitioning which might or might not
> be what you want, you won't be able to do a full scan.  Instead you should
> probably have two column families, one keyed by primary, one by secondary,
> each with a column for the other, then you can do your operations.  It
> uses more space, but disk is cheap so probably not a big deal.

Yes, that is what we had in mind: the second column family only keeps a
list of the keys in the first one, without the data.
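
A minimal sketch of the two-column-family layout described above. It uses plain Python dicts as a stand-in for the column families, not a real Cassandra client, and the names `primary_cf`, `secondary_cf`, `insert`, and `primaries_for` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# In-memory stand-ins for two column families (not a real Cassandra client).
# primary_cf:   row key = primary id,   column name = secondary id, value = data
# secondary_cf: row key = secondary id, column name = primary id,   value = empty
primary_cf = defaultdict(dict)
secondary_cf = defaultdict(dict)


def insert(primary, secondary, data):
    """Write the data column and the reverse-lookup column together."""
    primary_cf[primary][secondary] = data
    secondary_cf[secondary][primary] = b""  # only the column name matters


def primaries_for(secondary):
    """Return every primary id whose row contains the given secondary id."""
    return sorted(secondary_cf.get(secondary, {}))


insert("p1", "s1", b"data_0")
insert("p2", "s1", b"data_1")
insert("p2", "s2", b"data_2")

print(primaries_for("s1"))
```

The reverse lookup never needs a full scan of `primary_cf`, which is the point of paying for the second column family.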

> If you
> have to model a many-to-many relationship you can use super columns.

For now we are only storing a single attribute per entry, so we used normal
columns instead of super columns. In the end our schema is
{ 'primary' => {'secondary'=>'data_0'} }

I believe that using a SuperColumn in PrimaryCF would be necessary
only when using more than one attribute, or are there other
implications I'm not seeing?
As for the secondary, I don't like the idea of storing a dummy value
(new byte[0]) when I only need the name. Is that a smell that I should
be using something else?
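
To make the two shapes concrete, here is a rough sketch in Python dict literals of the flat schema, a super-column variant, and the name-only index row. The attribute names `attr_a` and `attr_b` and the helper `column_names` are made up for illustration; this models only the nesting, not Cassandra's storage or API.

```python
# Normal columns: row key -> {column name: value}.
# One value per secondary id, as in the schema above.
primary_cf_flat = {
    "primary": {"secondary": "data_0"},
}

# Super columns: row key -> {super column name: {column name: value}}.
# The extra level gives room for several attributes per secondary id.
primary_cf_super = {
    "primary": {
        "secondary": {"attr_a": "data_0", "attr_b": "data_1"},
    },
}

# Index row where only the column names carry information;
# the empty value plays the role of `new byte[0]`.
secondary_cf = {
    "secondary": {"primary": b""},
}


def column_names(cf, row_key):
    """Return the column names of a row, ignoring the values."""
    return sorted(cf.get(row_key, {}))


print(column_names(secondary_cf, "secondary"))
```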


> You do your inserts into both, and for deletes you do a get_slice for the
> secondary id, which will give you all primary ids which contain the
> secondary id.  Then you can delete everything.

Yes, we actually did it a bit "smarter": we query first and keep a list
of only the diff between the first and second insert. Thanks a lot for
your answer, it's been very useful.
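
The delete and the "diff" update discussed in this message could look roughly like the sketch below. It again uses in-memory dicts rather than a real client, and it assumes one possible reading of the diff approach: read the existing columns first, then write or remove only what changed. All function names are hypothetical.

```python
from collections import defaultdict

# In-memory stand-ins for the two column families (not a real client).
primary_cf = defaultdict(dict)    # primary id -> {secondary id: data}
secondary_cf = defaultdict(dict)  # secondary id -> {primary id: b""}


def insert(primary, secondary, data):
    primary_cf[primary][secondary] = data
    secondary_cf[secondary][primary] = b""


def delete_secondary(secondary):
    """Remove a secondary id everywhere, using its index row to find it."""
    # Reading the index row stands in for a get_slice on the secondary id.
    for primary in list(secondary_cf.get(secondary, {})):
        primary_cf[primary].pop(secondary, None)
    secondary_cf.pop(secondary, None)


def replace_columns(primary, new_columns):
    """Make the primary row equal to new_columns, writing only the diff."""
    old = dict(primary_cf.get(primary, {}))  # query first
    removed = set(old) - set(new_columns)
    for secondary in removed:
        del primary_cf[primary][secondary]
        index_row = secondary_cf.get(secondary, {})
        index_row.pop(primary, None)
        if not index_row:
            secondary_cf.pop(secondary, None)  # drop empty index rows
    written = set()
    for secondary, data in new_columns.items():
        if old.get(secondary) != data:  # skip unchanged columns
            insert(primary, secondary, data)
            written.add(secondary)
    return removed, written


insert("p1", "s1", b"data_0")
insert("p2", "s1", b"data_1")
insert("p2", "s2", b"data_2")
delete_secondary("s1")
print(dict(primary_cf), dict(secondary_cf))
```

The diff costs one extra read per update but avoids rewriting columns that did not change.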
