cassandra-user mailing list archives

From <Jeremey.Barr...@nokia.com>
Subject Re: Cassandra range scans
Date Mon, 22 Feb 2010 20:23:42 GMT
On Feb 22, 2010, at 12:19 AM, ext Jonathan Ellis wrote:

>>  2) is the row key model I suggested above the best approach in Cassandra, or is
>> there something better? My testing so far has been using get_range_slice with a
>> ColumnParent of just the CF and SlicePredicate listing the columns I want (though
>> really I want all columns, is there a shorthand for that?)
> 
> Cassandra deals fine with millions of columns per row, and allows
> prefix queries on columns too.  So an alternate model would be to have
> userX as row key, and column keys "A:1, A:2, A:3, ..., B:1, B:2, B:3,
> ...".  This will be marginally faster than splitting by row, and has
> the added advantage of not requiring OPP.
> 
> You could use supercolumns here too (where the supercolumn name is the
> thing type).  If you always want to retrieve all things of type A at a
> time per user, then that is a more natural fit.  (Otherwise, the lack
> of subcolumn indexing could be a performance gotcha for you:
> http://issues.apache.org/jira/browse/CASSANDRA-598).
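
[Illustration of the "prefix queries on columns" model quoted above. This is a minimal in-memory sketch in Python, not the Cassandra client API: the helper prefix_slice and the sample row are hypothetical, and a plain sorted list stands in for a row whose columns are kept in name order. In the Thrift API of that era the equivalent request would presumably be a column slice with start and finish names, which is an assumption about how you would map it, not something stated in this thread.]

```python
from bisect import bisect_left


def prefix_slice(sorted_names, prefix):
    """Return every name in sorted_names that starts with prefix.

    sorted_names must already be in ascending order, mirroring a row
    whose columns are stored sorted by name. prefix must be non-empty.
    """
    start = bisect_left(sorted_names, prefix)
    # Smallest string that sorts after every string carrying this prefix:
    # bump the last character of the prefix by one code point ("A:" -> "A;").
    finish = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    end = bisect_left(sorted_names, finish)
    return sorted_names[start:end]


# Hypothetical row for "userX": column names are "<type>:<id>".
row = sorted(["A:1", "A:2", "A:3", "B:1", "B:2", "B:3"])

type_a = prefix_slice(row, "A:")  # all things of type A for this user
type_b = prefix_slice(row, "B:")  # all things of type B for this user
```

[Because the slice is a contiguous range within one row, it does not depend on how row keys are ordered across the cluster, which is the "not requiring OPP" point above. One caveat with string column names: they compare lexically, so "A:10" sorts before "A:2" unless the numeric part is zero-padded.]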

Would you say the supercolumn approach is faster than scanning rows? Any particular advantages
or disadvantages to writing to a bunch of supercolumns at once (e.g. in one user row), vs.
writing to a bunch of rows at once (with the same key prefix, i.e. close together in an order-preserved
store)?

> 
>>  3) schema changes (i.e. adding a new CF)... seems like currently you take the whole
>> cluster down to accomplish this... is that likely to change in the future?
> 
> You have to take each node down, but a rolling restart is fine.  No
> reason for the whole cluster to be down at once.

OK, that's not a big deal.

Extremely helpful... thanks for the response!

Jeremey.

