cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Boxenhorn <da...@lookin2.com>
Subject Distribution Factor: part of the solution to many-CF problem?
Date Mon, 21 Feb 2011 12:28:19 GMT
Cassandra is both distributed and replicated. We have Replication Factor but
no Distribution Factor!

Distribution Factor would define over how many nodes a CF should be
distributed.

Say you want to support millions of multi-tenant users in clusters with
thousands of nodes, where you don't know the user's schema in advance, so
you can't have users share CFs.

In this case you wouldn't want to spread out each user's Column Families
over thousands of nodes! You would want something like: RF=3, DF=10 i.e.
distribute each CF over 10 nodes, within those nodes replicate 3 times.

One implementation of DF would be to hash the CF name, and use the same
strategies defined for RF to choose the N nodes in DF=N.

Mime
View raw message