incubator-cassandra-user mailing list archives

From Robert Važan <robert.va...@gmail.com>
Subject Minimum row size / minimum data point size
Date Thu, 03 Oct 2013 20:31:26 GMT
I need to store one trillion data points. The data is highly compressible
down to 1 byte per data point using simple custom compression combined with
standard dictionary compression. What's the most space-efficient way to
store the data in Cassandra? How much per-row overhead is there if I store
one data point per row?
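
For concreteness, the naive layout I have in mind is roughly the
following (a CQL sketch; table and column names are placeholders):

    -- One data point per CQL row. The question is how much per-row
    -- and per-cell overhead Cassandra adds on top of the ~1-byte
    -- compressed payload.
    CREATE TABLE points (
        series_id bigint,
        ts        timestamp,
        value     blob,       -- ~1 byte after custom compression
        PRIMARY KEY (series_id, ts)
    );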

The data is particularly hard to group. It's a large number of time series
with highly variable density. That makes it hard to pack subsets of the
data into meaningful column families / wide rows. Is there a table layout
scheme that would let me approach 1 byte per data point without
forcing me to implement a complex abstraction layer at the application level?
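
The obvious packing scheme I keep coming back to is sketched below
(hypothetical names; it assumes fixed time buckets, which is exactly
what the variable density undermines):

    -- Pack all points of one (series, bucket) pair into a single
    -- pre-compressed blob, amortizing row overhead across many points.
    -- With highly variable density a bucket holds anywhere from zero
    -- points to far too many, and the pack/unpack logic becomes the
    -- application-level abstraction layer I'd rather avoid.
    CREATE TABLE point_buckets (
        series_id bigint,
        bucket    timestamp,  -- e.g. one bucket per hour or day
        points    blob,       -- all bucket points, custom-compressed
        PRIMARY KEY (series_id, bucket)
    );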
