Date: Thu, 19 Jan 2012 22:02:48 -0800
Subject: Re: ideal cluster size
From: Peter Schuller <peter.schuller@infidyne.com>
To: user@cassandra.apache.org

> We're embarking on a project where we estimate we will need on the order
> of 100 cassandra nodes. The data set is perfectly partitionable, meaning
> we have no queries that need to have access to all the data at once. We
> expect to run with RF=2 or =3. Is there some notion of ideal cluster
> size? Or perhaps asked differently, would it be easier to run one large
> cluster or would it be easier to run a bunch of, say, 16 node clusters?
> Everything we've done to date has fit into 4-5 node clusters.

Certain things certainly become harder with many nodes simply because
of the sheer number of them: an increased need to automate
administrative tasks, etc. But mostly this applies equally to, e.g.,
10 clusters of 10 nodes as it does to one cluster of 100 nodes.

I'd prefer running one cluster unless there is a specific reason to do
otherwise, just because it means you have "one" thing to keep track of
both mentally and in terms of e.g. monitoring/alerting instead of
having another level of grouping applied to your hosts.

I can't think of significant benefits to small clusters that still
hold true when you have many of them, as opposed to a correspondingly
big single cluster.

It is probably more useful to select hardware such that you have a
greater number of smaller nodes than it is to focus on node count
(although once you start reaching the "few hundreds" level, you're
entering territory that has seen less real-life production testing).

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)