We have the same setup: one keyspace per client, and currently about 300 keyspaces. nodetool repair takes a long time: about 4 hours with -pr on a single node. We have a 4-node cluster with about 10 GB per node. Unfortunately, we haven't been tracking how the running time changes as the number of keyspaces, or the load, increases.
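
For anyone who wants to start tracking this, here is a minimal sketch that times a primary-range repair for each keyspace in turn. It assumes nodetool is on the PATH and talks to the local node; the function name and the keyspace names in the usage comment are hypothetical, and the run and clock parameters exist only so the loop can be exercised without a live cluster.

```python
import subprocess
import time


def time_repairs(keyspaces, run=subprocess.run, clock=time.monotonic):
    """Run `nodetool repair -pr <keyspace>` per keyspace; return seconds taken for each."""
    durations = {}
    for keyspace in keyspaces:
        start = clock()
        # Assumption: nodetool is on PATH and targets the local node.
        # check=True raises CalledProcessError if the repair command fails.
        run(["nodetool", "repair", "-pr", keyspace], check=True)
        durations[keyspace] = clock() - start
    return durations


# Hypothetical usage (keyspace names are made up):
#   for name, seconds in time_repairs(["client_a", "client_b"]).items():
#       print(f"{name}: {seconds / 60:.1f} min")
```

Logging these numbers after each repair cycle would show whether the per-keyspace cost stays flat as keyspaces are added.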

On Wed, Nov 20, 2013 at 6:53 AM, John Pyeatt <john.pyeatt@singlewire.com> wrote:
We have an application that has been designed to use potentially hundreds of keyspaces (one for each company).

One thing we are noticing is that the time nodetool repair takes across all of the keyspaces seems to increase linearly with the number of keyspaces. For example, if we have a 6-node EC2 (m1.large) cluster across 3 Availability Zones and create 20 keyspaces, a nodetool repair -pr on one node takes 3 hours, even with no data in any of the keyspaces. If I bump that up to 40 keyspaces, it takes 6 hours.
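
As a rough sketch of the arithmetic: 3 hours for 20 keyspaces works out to 9 minutes per keyspace, and the 40-keyspace run matches that rate. The function below is hypothetical and assumes the linear trend simply continues, which two data points cannot confirm.

```python
# 3 hours for 20 keyspaces = 180 minutes / 20 = 9 minutes per keyspace.
MINUTES_PER_KEYSPACE = 180 // 20


def estimate_repair_minutes(keyspace_count, minutes_per_keyspace=MINUTES_PER_KEYSPACE):
    """Extrapolate `nodetool repair -pr` time on one node, assuming linear growth."""
    return keyspace_count * minutes_per_keyspace


print(estimate_repair_minutes(40))   # 360 minutes, i.e. the observed 6 hours
print(estimate_repair_minutes(100))  # 900 minutes, i.e. 15 hours if the trend holds
```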

Is this the behaviour you would expect?

Is there anything you can think of (short of redesigning the cluster to limit keyspaces) to increase the performance of the nodetool repairs?

My obvious concern is that as this application grows and more companies start using it, we will eventually have too many keyspaces to perform repairs on the cluster.

