incubator-cassandra-user mailing list archives

From Bartosz Kołodziej <bartosz.kolodz...@gmail.com>
Subject Re: Need a little help with data model design
Date Mon, 05 Jul 2010 16:42:22 GMT
I have a large and dynamic number of loggers.

According to https://issues.apache.org/jira/browse/CASSANDRA-16, the 2GB
row size limit is no longer an issue in 0.7 (btw Mnesia has a similar issue ;-) ).
I think I can go with the svn release for the moment.

Solving this with a composite key (logger+timestamp) would require
OrderPreservingPartitioner to make range queries efficient, while with the
first approach (one row per logger) I can go with RandomPartitioner (data
would be partitioned by logger - simple and effective).
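
To illustrate the partitioning argument, here is a minimal sketch, assuming
that RandomPartitioner places a row on the ring by an MD5 hash of its row
key. The helpers random_token and composite_key and the "logger:timestamp"
key layout are hypothetical names made up for this example, not Cassandra
API, and the real token computation differs in detail.

```python
import hashlib


def random_token(row_key: str) -> int:
    """Simplified stand-in for RandomPartitioner: hash the row key with MD5."""
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)


def composite_key(logger: str, timestamp: int) -> str:
    """Hypothetical logger+timestamp row key; zero-padded so string order matches time order."""
    return f"{logger}:{timestamp:010d}"


timestamps = [1278009131 + i for i in range(8)]

# Composite-key model: one row per sample, so each sample gets its own ring position.
composite_keys = [composite_key("logger2", ts) for ts in timestamps]
composite_tokens = [random_token(k) for k in composite_keys]

# Row-per-logger model: every sample shares one row key, hence one ring position.
row_tokens = {random_token("logger2") for _ in timestamps}

# Ordering the composite keys by token generally does not reproduce time order,
# which is why a key range scan needs an order-preserving partitioner.
ring_order = sorted(composite_keys, key=random_token)
print(ring_order)
print(len(set(composite_tokens)), len(row_tokens))
```

With hashed tokens, consecutive timestamps of one logger end up scattered
around the ring, while the row-per-logger layout keeps them together under a
single key.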

Btw, which model provides faster queries?
(I only need to get a slice (timestamp1 to timestamp2) of the data for logger X.)
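
For reference, the access pattern I mean can be sketched as below. This is
only an in-memory model of one row per logger with columns kept sorted by
timestamp, not Cassandra client code and not a benchmark; LoggerRow and
multi_slice are hypothetical names, and I assume both slice bounds are
inclusive.

```python
from bisect import bisect_left, bisect_right


class LoggerRow:
    """In-memory model of one logger's row: columns kept sorted by timestamp."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def insert(self, timestamp, value):
        # Keep columns ordered by timestamp; a repeated timestamp overwrites.
        i = bisect_left(self._timestamps, timestamp)
        if i < len(self._timestamps) and self._timestamps[i] == timestamp:
            self._values[i] = value
        else:
            self._timestamps.insert(i, timestamp)
            self._values.insert(i, value)

    def slice(self, start, end):
        # Binary-search both bounds (inclusive), then return the contiguous run.
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[lo:hi], self._values[lo:hi]))


def multi_slice(rows, loggers, start, end):
    """The multi-logger query is just N single-logger slices."""
    return {name: rows[name].slice(start, end) for name in loggers}


rows = {"logger2": LoggerRow(), "logger10": LoggerRow()}
rows["logger2"].insert(1278109131, {"v": 300, "c": 123, "s": 12.13})
rows["logger2"].insert(1278009131, {"v": 300, "c": 123, "s": 12.13})
rows["logger2"].insert(1278209131, {"v": 301, "c": 124, "s": 12.14})
rows["logger10"].insert(1278050000, {"value": 10})

result = multi_slice(rows, ["logger2", "logger10"], 1278009131, 1278109131)
print(result)
```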

On Mon, Jul 5, 2010 at 6:23 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> You don't want to have all the data from a single logger in a single
> row b/c of the 2GB size limit.
>
> If you have a small, static number of loggers you could create one CF
> per logger and use timestamp as the row key.  Otherwise use a
> composite key (logger+timestamp) as the key in a single CF.
>
> 2010/7/2 Bartosz Kołodziej <bartosz.kolodziej@gmail.com>:
> > I'm new to Cassandra, and I want to use it to store:
> > loggers = { // (super)ColumnFamily ?
> >     logger1 : { // row inside super CF ?
> >         timestamp1 : {
> >             value : 10
> >         },
> >         timestamp2 : {
> >             value : 12
> >         }
> >         (many many many more)
> >     }
> >     logger2 : { // logger of a different type (in this example it logs
> >                 // 3 values instead of 1)
> >         timestamp1 : {
> >             v : 300,
> >             c : 123,
> >             s : 12.13
> >         },
> >         timestamp2 : {
> >             v : 300,
> >             c : 123,
> >             s : 12.13
> >         }
> >         (many many many more)
> >     }
> >     (many many many more)
> > }
> > The only ways I will be accessing this data are:
> > - example: fetch a slice of data from logger2 ( start = 1278009131
> > (timestamp), end = 1278109131 ),
> >      expecting a sorted array of data.
> > - example: fetch a slice of data from (logger2 and logger10 and logger20
> > and logger1234) ( start = 1278009131 (timestamp), end = 1278109131 ),
> >      expecting a map of sorted arrays of data. [it is basically N queries
> > of the first type]
> > Is this the right definition for the above: <ColumnFamily
> > CompareWith="TimeUUIDType" ColumnType="Super"
> >     CompareSubcolumnsWith="BytesType" Name="loggers"/> ?
> > What's the best way to model this data in Cassandra (keeping in mind
> > partitioning and other important stuff)?
> >
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
