cassandra-user mailing list archives

From Aaron Morton <>
Subject Re: N to N relationships
Date Thu, 09 Dec 2010 20:26:21 GMT
I am assuming you have one matrix and you know its dimensions. Also, as you say, the most important
queries are to get an entire column or an entire row.

I would consider using one standard CF for the columns and one for the rows. The key for each
would be the col / row number, each Cassandra column name would be the id of the other dimension,
and the value whatever you want.

- when storing the data, update both the Column and Row CFs
- reading a whole row/col is simply a read from the appropriate CF
- reading an intersection is a get_slice to either the col or row CF, using the column_names field
of the slice predicate to identify the other dimension
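
As a rough illustration of the layout above, here is a minimal in-memory sketch in Python. It is not
Cassandra client code: the two dicts only stand in for the Row and Column CFs, and the names
MatrixStore, put, get_row, get_col and get_cell are made up for this example.

```python
from collections import defaultdict


class MatrixStore:
    """In-memory stand-in for the two column families (hypothetical, not a Cassandra client)."""

    def __init__(self):
        # Models the Row CF: key = row number, column name = column id.
        self.row_cf = defaultdict(dict)
        # Models the Column CF: key = column number, column name = row id.
        self.col_cf = defaultdict(dict)

    def put(self, row_id, col_id, value):
        # Every write goes to both CFs so either dimension can be read directly.
        self.row_cf[row_id][col_id] = value
        self.col_cf[col_id][row_id] = value

    def get_row(self, row_id):
        # Whole matrix row: one read against the Row CF.
        return dict(self.row_cf.get(row_id, {}))

    def get_col(self, col_id):
        # Whole matrix column: one read against the Column CF.
        return dict(self.col_cf.get(col_id, {}))

    def get_cell(self, row_id, col_id):
        # Intersection: read one named column from one key (here via the Row CF).
        return self.row_cf.get(row_id, {}).get(col_id)


store = MatrixStore()
store.put(1, 2, "a")
store.put(1, 5, "b")
store.put(3, 2, "c")
print(store.get_row(1))
print(store.get_col(2))
print(store.get_cell(3, 2))
```

Only the cells that exist are stored, so a mostly empty matrix costs nothing for the missing
elements. The price is that each write is done twice.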

You would not need secondary indexes to serve these queries. 

Hope that helps.

On 10 Dec 2010, at 07:02 AM, Sébastien Druon <> wrote:

I mean if I have secondary indexes. Apparently they are calculated in the background...

On 9 December 2010 18:33, David Boxenhorn <> wrote:
What do you mean by indexing? 

On Thu, Dec 9, 2010 at 7:30 PM, Sébastien Druon <> wrote:
Thanks a lot for the answer

What about the indexing when adding a new element? Is it incremental?

Thanks again

On 9 December 2010 14:38, David Boxenhorn <> wrote:
How about a regular CF where keys are N@N ?

Then, getting a matrix row would be the same cost as getting a matrix column (N gets), and
it would be very easy to add element N+1. 
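
One possible reading of this suggestion, sketched in Python with a plain dict standing in for the CF.
It assumes that "N@N" means a key of the form "<row>@<col>" per cell. The names cell_key and CellStore
are made up for this example, and none of this is Cassandra client code.

```python
def cell_key(row_id, col_id):
    # Assumed key format: "<row>@<col>", one key per matrix cell.
    return f"{row_id}@{col_id}"


class CellStore:
    """In-memory stand-in for a single CF keyed by cell (hypothetical)."""

    def __init__(self, n):
        self.n = n        # current matrix dimension N
        self.cells = {}   # only non-empty cells are stored

    def put(self, row_id, col_id, value):
        self.cells[cell_key(row_id, col_id)] = value

    def get_row(self, row_id):
        # N point lookups, one per possible column.
        found = {}
        for col_id in range(self.n):
            value = self.cells.get(cell_key(row_id, col_id))
            if value is not None:
                found[col_id] = value
        return found

    def get_col(self, col_id):
        # Same cost as a row: N point lookups, one per possible row.
        found = {}
        for row_id in range(self.n):
            value = self.cells.get(cell_key(row_id, col_id))
            if value is not None:
                found[row_id] = value
        return found


store = CellStore(n=4)
store.put(1, 2, "a")
store.put(3, 2, "c")
print(store.get_row(1))
print(store.get_col(2))
```

In this sketch, adding element N+1 only means raising n and writing the new cells. No existing key
changes.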

On Thu, Dec 9, 2010 at 1:48 PM, Sébastien Druon <> wrote:

For a specific case, we are thinking about representing an N-to-N relationship with an NxN matrix
in Cassandra.
The relations will only exist between a subset of the elements, so the matrix will mostly contain
empty elements.

We have a set of questions concerning this:
- what is the best way to represent this matrix? What would give the best performance for reading?
For writing?
   a super column family with n column families, with n columns each
   a column family with n columns and n lines

In the second case, we would need to extract 2 kinds of information:
- all the relations for a line: this should pose no specific problem;
- all the relations for a column: in that case we would need an index for the columns, right?
And then get all the lines where the value of the column in question is not null... is that
the correct way to do it?
When using indexes, say we want to add another element N+1. What impact would it have on the
indexing job in terms of time?

Thanks a lot for the answers,

Best regards,

Sébastien Druon
