incubator-cassandra-user mailing list archives

From Jonathan Shook <>
Subject Re: Giant sets of ordered data
Date Wed, 02 Jun 2010 15:45:05 GMT
Either use the order-preserving partitioner (OPP) and order by key, or
order within a row by column name. I'd suggest the latter.
If you have structured data to stick under a column (named by the
timestamp), then you can serialize and deserialize it yourself, or you
can use a supercolumn. It's effectively the same thing. As currently
implemented, Cassandra provides supercolumn support only as a
convenience layer. That may change in the future.
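
A rough sketch of the "serialize it yourself" option, using a plain dict
as a stand-in for one row. The helper names (column_name, pack_value,
unpack_value) and the choice of JSON are assumptions for illustration;
none of this is a Cassandra client API. The point is only that a
fixed-width big-endian column name sorts bytewise in timestamp order.

```python
import json
import struct


def column_name(ts_millis: int) -> bytes:
    # 8-byte big-endian unsigned integer: for non-negative timestamps,
    # lexicographic byte order is the same as numeric order.
    return struct.pack(">Q", ts_millis)


def pack_value(record: dict) -> bytes:
    # Serialize the structured payload into a single column value.
    return json.dumps(record, sort_keys=True).encode("utf-8")


def unpack_value(raw: bytes) -> dict:
    # Reverse of pack_value.
    return json.loads(raw.decode("utf-8"))


# In-memory stand-in for one row: column name -> serialized value.
row = {}
row[column_name(1275493505000)] = pack_value({"sensor": "a", "reading": 42})
row[column_name(1275493504000)] = pack_value({"sensor": "a", "reading": 41})

# Sorting by column name yields the records in timestamp order.
ordered = [unpack_value(row[name]) for name in sorted(row)]
```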

You didn't make clear in your question why a standard column would be
less suitable. I presumed you had layered structure within the
timestamp, hence my response.
How would you logically partition your dataset according to natural
application boundaries? This will answer most of your question.
If you have a dataset which can't be partitioned into reasonably
sized rows, then you may want to use OPP and key concatenation.
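
To make the partitioning idea concrete, here is a minimal in-memory
sketch of one way it might look: each row holds one source's data for one
time bucket, the row key is the source name concatenated with the bucket
start, and columns inside the row are named by timestamp. The names
(BUCKET_SECONDS, row_key, Store) and the one-hour bucket are hypothetical,
and the dict only models the layout; it is not a Cassandra client.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

BUCKET_SECONDS = 3600  # assumed bucket size: one row per source per hour


def row_key(source: str, ts: int) -> str:
    # Key concatenation: source plus zero-padded bucket start, so that
    # keys for one source sort in time order when compared as strings.
    bucket = ts - ts % BUCKET_SECONDS
    return f"{source}:{bucket:012d}"


class Store:
    """Toy model: row key -> {timestamp column name -> value}."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def insert(self, source: str, ts: int, value) -> None:
        self.rows[row_key(source, ts)][ts] = value

    def range(self, source: str, start: int, end: int):
        """Yield (ts, value) with start <= ts <= end, in timestamp order."""
        bucket = start - start % BUCKET_SECONDS
        while bucket <= end:
            # Only one bounded row is examined at a time.
            row = self.rows.get(row_key(source, bucket), {})
            names = sorted(row)
            lo = bisect_left(names, start)
            hi = bisect_right(names, end)
            for ts in names[lo:hi]:
                yield ts, row[ts]
            bucket += BUCKET_SECONDS
```

Because a range query can compute the bucket keys it needs, this layout
does not depend on key ordering across the cluster; the zero-padded
concatenated keys only matter if you do choose OPP and scan by key range.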

What do you mean by giant?

On Wed, Jun 2, 2010 at 10:32 AM, David Boxenhorn <> wrote:
> How do I handle giant sets of ordered data, e.g. by timestamps, which I want
> to access by range?
> I can't put all the data into a supercolumn, because it's loaded into memory
> at once, and it's too much data.
> Am I forced to use an order-preserving partitioner? I don't want the
> headache. Is there any other way?
