incubator-cassandra-user mailing list archives

From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: Linear scalability problems
Date Fri, 05 Apr 2013 12:51:37 GMT
If you double your nodes, you should double your webservers too (that is, if you are trying
to prove it scales linearly). We had to spend time finding the correct ratio for our application
(it ended up being 19 webservers to 20 data nodes, so now we just assume 1 to 1). You can use
Amazon to find that ratio very cheaply.

Dean

From: Anand Somani <meatforums@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, April 4, 2013 1:05 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Linear scalability problems

RF=3.

On Thu, Apr 4, 2013 at 7:08 AM, Cem Cayiroglu <cayiroglu@gmail.com>
wrote:
What was the RF before adding nodes?

Sent from my iPhone

On 04 Apr 2013, at 15:12, Anand Somani <meatforums@gmail.com>
wrote:

We are using a single process with multiple threads. I will look at client-side delays.

Thanks

On Wed, Apr 3, 2013 at 9:30 AM, Tyler Hobbs <tyler@datastax.com>
wrote:
If I had to guess, I would say that your client is the bottleneck, not the cluster.  Are you
inserting data with multiple threads or processes?
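
One way to check whether the client could be the limit is Little's Law: a client that
issues blocking requests from N threads cannot exceed N / (average latency) requests per
second, no matter how many nodes the cluster has. The sketch below is illustrative only.
The function names, the thread count, and the latency value are assumptions, not
measurements from this thread.

```python
import math


def max_client_throughput(threads: int, avg_latency_s: float) -> float:
    """Upper bound on requests/second for a client whose threads each
    issue one blocking request at a time (Little's Law)."""
    if threads <= 0 or avg_latency_s <= 0:
        raise ValueError("threads and avg_latency_s must be positive")
    return threads / avg_latency_s


def threads_needed(target_rps: float, avg_latency_s: float) -> int:
    """Minimum number of blocking threads needed to reach target_rps."""
    if target_rps <= 0 or avg_latency_s <= 0:
        raise ValueError("target_rps and avg_latency_s must be positive")
    return math.ceil(target_rps * avg_latency_s)


if __name__ == "__main__":
    # Hypothetical numbers: 8 client threads, 250 ms average request latency.
    print(max_client_throughput(threads=8, avg_latency_s=0.25))
    # Threads needed to reach 82 req/second at the same hypothetical latency.
    print(threads_needed(target_rps=82, avg_latency_s=0.25))
```

If the measured request rate sits close to this bound, adding nodes will not help until
the client gets more threads or processes, or per-request latency drops.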


On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani <meatforums@gmail.com>
wrote:
Hi,

I am running some tests trying to scale out our application from a 3 node cluster to a
6 node cluster. With the 3 node cluster I was able to handle about 41 req/second, so I
added 3 more nodes thinking throughput should come close to doubling, but instead it only
goes up to about 47 req/second! I must be doing something wrong and it is not obvious, so
I wanted some help with which stats I could or should monitor to tell me things like
whether one node is getting more requests or whether the load distribution is not random enough.

Note I am using direct Thrift (old code base) and Cassandra 1.1.6. The data model is for storing
blobs (split across columns) and has around 6 CFs, RF=3, and all operations are at quorum. Also,
at the end of the run nodetool ring reports the same data size.
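
For reference, the gap between the observed and the ideal result can be put in numbers.
This is a minimal sketch using only the figures quoted above (41 req/second on 3 nodes,
about 47 req/second on 6 nodes). The function name is an assumption made for illustration,
and "ideal" here simply means throughput proportional to node count.

```python
def scaling_efficiency(base_nodes: int, base_rps: float,
                       new_nodes: int, new_rps: float) -> float:
    """Fraction of ideal linear scaling achieved after resizing a cluster.

    1.0 means throughput grew in proportion to node count.
    """
    if min(base_nodes, new_nodes) <= 0 or base_rps <= 0:
        raise ValueError("node counts and base_rps must be positive")
    ideal_rps = base_rps * new_nodes / base_nodes
    return new_rps / ideal_rps


if __name__ == "__main__":
    # Figures from the message: 3 nodes at 41 req/s, 6 nodes at about 47 req/s.
    ideal = 41 * 6 / 3  # 82 req/s if scaling were perfectly linear
    eff = scaling_efficiency(3, 41, 6, 47)
    print(f"ideal: {ideal:.0f} req/s, observed: 47 req/s, efficiency: {eff:.0%}")
```

An efficiency this far below 1.0 after doubling the node count suggests looking for a limit
outside the cluster, or for uneven load across nodes, before concluding that the cluster
itself does not scale.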

Thanks
Anand



--
Tyler Hobbs
DataStax<http://datastax.com/>


