cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Philippe <watche...@gmail.com>
Subject Re: RE : Re: RE : Re: Two dimensional matrices
Date Wed, 14 Apr 2010 11:43:20 GMT
> I'm confused : don't range queries such as the ones we've been

> > discussing require using an orderedpartitionner ?
>
> Alright, so distribution depends on your choice of token.
>
Ah yes, I get it now : with a naive orderedpartitioner, the key is
associated with the node whose token is the closest numerically-wise and
that is where the "master" replica is located. Yes ?

Now let's assume I am using super columns as {X} and columns as {timeFrame}.
In time each row will grow very large because X can (very sparsly) go to
2^28
i) does cassandra load all columns everytime it reads a row ? Same question
for super column
ii) Similarly does it cache all columns in memory ?

Now some order of magnitudes, let's say a row is about 20KB and the cluster
is running smoothly on low-end servers. There are millions of rows per node.
i) If I were to only issue gets on the key, what is the order of magnitude I
can expect to reach : 10/s, 100/s, 1000/s or 10.000/s ?
ii) If I were to issue a slice on just the keys, does cassandra optimize the
gets or does it run every get on the server and then concatenate to send to
the client ?
iii) is slicing on the columns going to improve the time to get the data on
the server side or does it just cut down on network traffic ?

Thanks
Philippe

Mime
View raw message