incubator-cassandra-user mailing list archives

From Sébastien Druon <sdr...@spotuse.com>
Subject Re: N to N relationships
Date Mon, 13 Dec 2010 11:23:12 GMT
Thanks a lot for the support!

On Thu, 2010-12-09 at 19:50 -0600, Nick Bailey wrote:
> I would also recommend two column families. Storing the key as NxN
> would require you to hit multiple machines to query for an entire row
> or column with RandomPartitioner. Even with OPP you would need to pick
> rows or columns to order by, and the other would require hitting
> multiple machines. Two column families avoid this and avoid any
> problems with choosing OPP.
> 
> On Thu, Dec 9, 2010 at 2:26 PM, Aaron Morton <aaron@thelastpickle.com>
> wrote:
>         I am assuming you have one matrix and you know the dimensions.
>         Also, as you say, the most important queries are to get an
>         entire column or an entire row.
>         
>         
>         I would consider using a standard CF for the Columns and one
>         for the Rows. The key for each would be the col / row number,
>         each Cassandra column name would be the id of the other
>         dimension, and the value whatever you want.
>         
>         
>         - when storing the data, update both the Column and Row CFs
>         - reading a whole row/col would simply be reading from the
>         appropriate CF.
>         - reading an intersection is a get_slice to either the col or
>         row CF, using the column_names field to identify the other
>         dimension.
>         
>         
>         You would not need secondary indexes to serve these queries. 
>         
>         
>         Hope that helps.
>         Aaron
>         
>         
>         
>         On 10 Dec, 2010, at 07:02 AM, Sébastien Druon
>         <sdruon@spotuse.com> wrote:
>         
>         
>         > I mean if I have secondary indexes. Apparently they are
>         > calculated in the background...
>         > 
>         > On 9 December 2010 18:33, David Boxenhorn
>         > <david@lookin2.com> wrote:
>         >         What do you mean by indexing? 
>         >         
>         >         
>         >         
>         >         On Thu, Dec 9, 2010 at 7:30 PM, Sébastien Druon
>         >         <sdruon@spotuse.com> wrote:
>         >                 Thanks a lot for the answer
>         >                 
>         >                 
>         >                 What about the indexing when adding a new
>         >                 element? Is it incremental?
>         >                 
>         >                 
>         >                 Thanks again
>         >                 
>         >                 
>         >                 
>         >                 
>         >                 On 9 December 2010 14:38, David Boxenhorn
>         >                 <david@lookin2.com> wrote:
>         >                         How about a regular CF where keys
>         >                         are N@N ?
>         >                         
>         >                         Then, getting a matrix row would be
>         >                         the same cost as getting a matrix
>         >                         column (N gets), and it would be
>         >                         very easy to add element N+1. 
>         >                         
>         >                         
>         >                         
>         >                         
>         >                         On Thu, Dec 9, 2010 at 1:48 PM,
>         >                         Sébastien Druon <sdruon@spotuse.com>
>         >                         wrote:
>         >                                 Hello,
>         >                                 
>         >                                 
>         >                                 For a specific case, we are
>         >                                 thinking about representing
>         >                                 a N to N relationship with a
>         >                                 NxN Matrix in Cassandra.
>         >                                 The relations will be only
>         >                                 between a subset of
>         >                                 elements, so the Matrix will
>         >                                 mostly contain empty
>         >                                 elements.
>         >                                 
>         >                                 
>         >                                 We have a set of questions
>         >                                 concerning this:
>         >                                 - what is the best way to
>         >                                 represent this matrix? what
>         >                                 would have the best
>         >                                 performance in reading? in
>         >                                 writing?
>         >                                   . a super column family
>         >                                 with n super columns, with
>         >                                 n columns each
>         >                                   . a column family with n
>         >                                 columns and n lines
>         >                                 
>         >                                 
>         >                                 In the second case, we would
>         >                                 need to extract 2 kinds of
>         >                                 information:
>         >                                 - all the relations for a
>         >                                 line: this should be no
>         >                                 specific problem;
>         >                                 - all the relations for a
>         >                                 column: in that case we
>         >                                 would need an index for the
>         >                                 columns, right? and then get
>         >                                 all the lines where the
>         >                                 value of the column in
>         >                                 question is not null... is
>         >                                 that the correct way to do it?
>         >                                 When using indexes, say we
>         >                                 want to add another element
>         >                                 N+1. What impact in terms of
>         >                                 time would it have on the
>         >                                 indexing job?
>         >                                 
>         >                                 
>         >                                 Thanks a lot for the
>         >                                 answers,
>         >                                 
>         >                                 
>         >                                 Best regards,
>         >                                 
>         >                                 
>         >                                 Sébastien Druon
>         >                         
>         >                         
>         >                 
>         >                 
>         >         
>         >         
>         > 
>         > 
> 

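The two-column-family layout recommended in the thread can be sketched as follows. This is a minimal in-memory illustration in Python, not Cassandra client code: the class SparseMatrixStore and its method names are hypothetical, and plain dictionaries stand in for the two column families. It only shows how a write goes to both CFs and how a row, a column, or an intersection is then read from a single key.

```python
from collections import defaultdict


class SparseMatrixStore:
    """In-memory stand-in for the "Rows" and "Columns" column families.

    Hypothetical illustration only; no Cassandra API is used here.
    """

    def __init__(self):
        # "Rows" CF: key = matrix row id, column name = matrix column id.
        self.rows_cf = defaultdict(dict)
        # "Columns" CF: key = matrix column id, column name = matrix row id.
        self.cols_cf = defaultdict(dict)

    def put(self, row, col, value):
        # Every write updates both CFs so either dimension can be read by key.
        self.rows_cf[row][col] = value
        self.cols_cf[col][row] = value

    def get_row(self, row):
        # A whole matrix row is a single-key read from the "Rows" CF.
        return dict(self.rows_cf.get(row, {}))

    def get_col(self, col):
        # A whole matrix column is a single-key read from the "Columns" CF.
        return dict(self.cols_cf.get(col, {}))

    def get_cells(self, row, cols):
        # An intersection: read one key and name the wanted columns,
        # similar in spirit to a get_slice with column_names.
        stored = self.rows_cf.get(row, {})
        return {c: stored[c] for c in cols if c in stored}


store = SparseMatrixStore()
store.put("a", "b", 1)
store.put("a", "c", 2)
store.put("d", "b", 3)
```

Empty matrix elements are simply never stored, which suits the mostly empty matrix described in the original question. The cost is that each relation is written twice, once per CF.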

