cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: Does Cassandra support range queries on keys ?
Date Mon, 24 Jan 2011 08:26:22 GMT
> Following your suggestion of using the super column key as the range
> token, won't I have a storage problem?

You won't get me to proclaim that you won't have a storage problem ;)

If you're going to deploy this at scale, I'm sure you'll have problems
whatever you do...

> I couldn't find information about this so I'll just ask: If I have a
> (Super/)ColumnFamily that contains 1 "key" for the row but that row contains
> millions of k:v entries, would that be an efficient Cassandra design?
> Does Cassandra store a CF row on a single node or can it / should it
> distribute this data?
> Would having millions of k:v entries in a single row of a CF be
> considered good practice? (in terms of query time, range scans and co ?)

Replication and distribution happen on a per-row basis: an entire row
lives on one replica set and is not split across nodes. So you
generally don't want individual rows to be a significant part of the
entire data set.
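
To illustrate the per-row placement, here is a minimal Python sketch. It
is not Cassandra's actual code: it assumes a RandomPartitioner-style token
(MD5 of the row key) and a simplified "next N nodes on the ring" replica
choice. The names token_for, Ring and replicas are made up for this example.

```python
import bisect
import hashlib


def token_for(row_key: bytes) -> int:
    # Simplified RandomPartitioner-style token: MD5 of the row key,
    # reduced into the range [0, 2**127). Only the row key is hashed;
    # the columns inside the row play no part in placement.
    return int.from_bytes(hashlib.md5(row_key).digest(), "big") % (2 ** 127)


class Ring:
    def __init__(self, node_tokens, replication_factor):
        # node_tokens: dict of node name -> token owned by that node
        self.ring = sorted((tok, name) for name, tok in node_tokens.items())
        self.tokens = [tok for tok, _ in self.ring]
        self.rf = replication_factor

    def replicas(self, row_key: bytes):
        # The first node whose token is >= the key's token owns the row
        # (wrapping around the ring); the following nodes hold the copies.
        start = bisect.bisect_left(self.tokens, token_for(row_key))
        n = len(self.ring)
        return [self.ring[(start + i) % n][1] for i in range(self.rf)]


if __name__ == "__main__":
    ring = Ring({"a": 0, "b": 2 ** 125, "c": 2 ** 126, "d": 3 * 2 ** 125}, 2)
    # Whether this row holds ten columns or ten million, it maps to the
    # same replica set, so those nodes carry the whole row.
    print(ring.replicas(b"one-huge-row"))
```

The point of the sketch is that the number of columns never enters the
placement decision, so a handful of very large rows cannot be spread
evenly over the cluster.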

You definitely don't want super columns that are huge; the columns
within an individual super column aren't indexed on disk, for one thing.

Having large rows with lots of columns... maybe. In general it's
certainly supported, but as for the overall impact if you intend to have
relatively few rows, all of them very large - I don't want to say too
much here. Anyone else? (anti-entropy granularity, compaction
in-memory thresholds, GC tweaking, etc.)

-- 
/ Peter Schuller
