Date: Mon, 24 Jan 2011 09:26:22 +0100
Message-ID: <AANLkTim-D9moRtoD9xm+WbWd7uHXgO4Q+32ggmtFE1CB@mail.gmail.com>
Subject: Re: Does Cassandra support range queries on keys ?
From: Peter Schuller <peter.schuller@infidyne.com>
To: user@cassandra.apache.org

> Following your suggestion of using the super column key as a range token,
> won't I have a storage problem?

You won't get me to proclaim that you won't have a storage problem ;)

If you're going to deploy this at scale, I'm sure you'll have problems
whatever you do...

> I couldn't find information about this so I'll just ask: If I have a
> (Super/)ColumnFamily that contains 1 "key" for the row but that row contains
> millions of k:v entries. Would that be an efficient Cassandra design?
> Does Cassandra store a CF row on a single node or can it / should it
> distribute this data?
> Would having millions of k:v entries in a single row of a CF be
> considered good practice? (in terms of query time, range scans and co?)

The replication set/distribution is on a per-row basis, so you
generally don't want individual rows to be a significant part of the
entire data set.
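
To make the per-row point concrete, here is a rough sketch of how a row
key could map to a replica set on a hash-partitioned ring. It is a toy
model, not Cassandra's actual code: the `Ring` class, the node names and
the evenly spaced tokens are made up for illustration, and it assumes an
MD5-based partitioner with replicas placed on the next nodes clockwise.

```python
import bisect
import hashlib


def token(row_key: str) -> int:
    """Hash a row key to a 128-bit token (assumes an MD5-style partitioner)."""
    return int.from_bytes(hashlib.md5(row_key.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical token ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens, replication_factor=3):
        # node_tokens maps a node name to the token assigned to that node.
        self.entries = sorted((t, name) for name, t in node_tokens.items())
        self.tokens = [t for t, _ in self.entries]
        self.rf = replication_factor

    def replicas(self, row_key):
        """Return the nodes holding this row: the owner plus the next ones clockwise."""
        size = len(self.entries)
        # First node whose token is >= the key's token, wrapping around the ring.
        first = bisect.bisect_left(self.tokens, token(row_key)) % size
        count = min(self.rf, size)
        return [self.entries[(first + k) % size][1] for k in range(count)]


# Six made-up nodes with evenly spaced tokens.
nodes = {f"node{i}": i * (2**128 // 6) for i in range(6)}
ring = Ring(nodes, replication_factor=3)

# Placement depends only on the row key, never on the columns inside the row,
# so a row with millions of columns still lives on this one replica set.
print(ring.replicas("user42"))
```

In this model a single enormous row puts all of its data on the same
few nodes, which is why you want many moderately sized rows rather than
a handful of giant ones.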

You definitely don't want super columns that are huge; the columns
within an individual super column aren't indexed on disk, for one thing.
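
As a toy illustration of why the missing index matters (this is not how
the on-disk format actually looks; the function names and data layout
are invented for the example), compare looking up one column through a
sorted index with looking up one sub-column when the whole super column
has to be decoded first:

```python
import bisect


def read_indexed(sorted_names, values, wanted):
    """Find one column via a sorted index: a binary search, O(log n) steps."""
    i = bisect.bisect_left(sorted_names, wanted)
    if i < len(sorted_names) and sorted_names[i] == wanted:
        return values[i]
    return None


def read_unindexed(serialized, wanted):
    """Find one sub-column with no index: every entry is decoded along the way.

    Returns the value (or None) and the number of entries that were decoded.
    """
    decoded = 0
    found = None
    for name, value in serialized:  # stands in for deserializing the whole blob
        decoded += 1
        if name == wanted:
            found = value
    return found, decoded


n = 100_000
names = [f"c{i:07d}" for i in range(n)]   # zero-padded, so already sorted
values = list(range(n))

value_a = read_indexed(names, values, "c0000042")
value_b, touched = read_unindexed(list(zip(names, values)), "c0000042")
print(value_a, value_b, touched)
```

Both lookups return the same value, but the second one touches every
entry, so the cost of reading a single sub-column grows with the size of
the whole super column.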

Having large rows with lots of columns... maybe. In general it's
certainly supported, but as for the overall impact if you intend to
have relatively few rows, all of them very large: I don't want to say
too much here. Anyone else? (Anti-entropy granularity, compaction
in-memory thresholds, GC tweaking, etc.)

-- 
/ Peter Schuller