incubator-cassandra-user mailing list archives

From Tim Wintle <timwin...@gmail.com>
Subject Data modeling advice (time series)
Date Tue, 01 May 2012 17:20:02 GMT
I believe that the general design for time-series schemas looks
something like this (correct me if I'm wrong):

(storing time series for X dimensions for Y different users)

Row Keys:  "{USER_ID}_{TIMESTAMP/BUCKETSIZE}"
Columns: "{DIMENSION_ID}_{TIMESTAMP%BUCKETSIZE}" -> {Counter}
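
To make the layout above concrete, here is a minimal sketch of how a single data point would map to a row key and a column name. The function name `bucket_keys`, the use of integer Unix timestamps in seconds, and the sample values are assumptions for illustration only; the division is taken to be integer division.

```python
def bucket_keys(user_id, dimension_id, timestamp, bucket_size):
    """Map one data point to a (row key, column name) pair.

    Assumes `timestamp` and `bucket_size` are integers in the same unit
    (seconds here), so TIMESTAMP/BUCKETSIZE is an integer division.
    """
    # divmod returns (timestamp // bucket_size, timestamp % bucket_size)
    bucket, offset = divmod(timestamp, bucket_size)
    row_key = f"{user_id}_{bucket}"
    column_name = f"{dimension_id}_{offset}"
    return row_key, column_name


# Hypothetical example: one-day buckets, timestamp 2012-05-01 17:20:02 GMT
print(bucket_keys("u42", "clicks", 1335892802, 86400))
# ('u42_15461', 'clicks_62402')
```

All columns for one user within one bucket land in the same row, so BUCKETSIZE directly controls how wide each row can grow.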

But I've not found much advice on calculating optimal bucket sizes (i.e.
optimal number of columns per row), and how that decision might be
affected by compression (or how significant the performance differences
between the two options might be).
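
As a rough way to relate bucket size to columns per row, the sketch below computes an upper bound on row width. It assumes every dimension writes at most one column per fixed sample interval; `sample_interval` and the function name are hypothetical, not part of the schema above, and the result says nothing about on-disk size or compression.

```python
def max_columns_per_row(num_dimensions, bucket_size, sample_interval):
    """Upper bound on columns in one row.

    Assumes each dimension writes at most one column per sample interval
    and that bucket_size and sample_interval use the same time unit.
    """
    return num_dimensions * (bucket_size // sample_interval)


# Hypothetical example: 10 dimensions, one-day buckets, one sample per minute
print(max_columns_per_row(10, 86400, 60))  # 14400
```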

Are the calculations here still considered valid (proportionally) in
1.X, given the changes to SSTables, or is it significantly different?

<http://btoddb-cass-storage.blogspot.co.uk/2011/07/column-overhead-and-sizing-every-column.html>



Thanks,

Tim

