incubator-cassandra-user mailing list archives

From "Stu Hood" <stu.h...@rackspace.com>
Subject Re: Cassandra benchmarking on Rackspace Cloud
Date Mon, 19 Jul 2010 17:22:46 GMT
How many physical client machines are running stress.py?

-----Original Message-----
From: "David Schoonover" <david.schoonover@gmail.com>
Sent: Monday, July 19, 2010 12:11pm
To: user@cassandra.apache.org
Subject: Re: Cassandra benchmarking on Rackspace Cloud

Hello all, I'm Oren's partner in crime on all this. I've got a few more numbers to add.

In an effort to eliminate everything but the scaling issue, I set up a cluster on dedicated
hardware (non-virtualized; 8-core, 16G RAM). 

No data was loaded into Cassandra -- 100% of requests were misses. As far as we can reason
about the problem, this is as fast as the database can perform: disk is out of the picture,
and the hardware is certainly more than sufficient.
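
(For reference, the reads were driven with stress.py roughly as below; the flag names are
from memory of the contrib/py_stress tool of that era and the key/thread counts are
illustrative, so treat this as a sketch of the command rather than the exact one:

    python stress.py -o read -r -n 10000000 -t 50 -d node1,node2,node3,node4

With nothing loaded, every get comes back empty, so the figures reflect request handling
rather than disk.)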

nodes	reads/sec
1	53,000
2	37,000
4	37,000

I ran this test previously on the cloud, with similar results:

nodes	reads/sec
1	24,000
2	21,000
3	21,000
4	21,000
5	21,000
6	21,000

In fact, I ran it twice out of disbelief (on different nodes the second time), with essentially
identical results.

Other Notes:
 - stress.py was run in both random and gaussian mode; there was no difference. 
 - Runs were 10+ minutes; the numbers above are averages that exclude the beginning and end
of each run.
 - Supplied node lists covered all boxes in the cluster. 
 - Data and commitlog directories were deleted between each run.
 - Tokens were evenly spaced across the ring and changed to match the cluster size before each
run (see the sketch below).
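
For concreteness, "evenly spaced" means the usual RandomPartitioner spacing over [0, 2**127);
a quick sketch of the calculation (the node count and print format here are just illustrative):

    # Evenly spaced tokens for RandomPartitioner, whose token space is [0, 2**127).
    def even_tokens(node_count):
        return [i * (2 ** 127 // node_count) for i in range(node_count)]

    for i, token in enumerate(even_tokens(4)):
        print('node %d: %d' % (i, token))

Each value went into the node's initial token setting before the run (storage-conf.xml on the
0.6 line; the exact config location depends on the version).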

If anyone has explanations or suggestions, they would be quite welcome. This is surprising
to say the least.

Cheers,

Dave



On Jul 19, 2010, at 11:42 AM, Stu Hood wrote:

> Hey Oren,
> 
> The Cloud Servers REST API returns a "hostId" for each server that indicates which physical
> host you are on. I'm not sure if you can see it from the control panel, but a quick curl
> session should get you the answer.
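
A minimal sketch of the lookup Stu describes (here in Python rather than curl), assuming the
Cloud Servers v1.0 API; the auth endpoint, header names, and response fields below are from
memory and may differ, and USERNAME and API_KEY are placeholders:

    import json
    import urllib2

    # Legacy v1.0 auth: trade username/API key for a token and a management URL.
    auth = urllib2.Request('https://auth.api.rackspacecloud.com/v1.0')
    auth.add_header('X-Auth-User', 'USERNAME')  # placeholder
    auth.add_header('X-Auth-Key', 'API_KEY')    # placeholder
    resp = urllib2.urlopen(auth)
    token = resp.info().getheader('X-Auth-Token')
    mgmt = resp.info().getheader('X-Server-Management-Url')

    # /servers/detail lists every server with its hostId; two cloud servers that
    # share a hostId are sitting on the same physical machine.
    req = urllib2.Request(mgmt + '/servers/detail.json')
    req.add_header('X-Auth-Token', token)
    servers = json.loads(urllib2.urlopen(req).read())['servers']
    for s in servers:
        print('%s -> hostId %s' % (s['name'], s['hostId']))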
> 
> Thanks,
> Stu
> 
> -----Original Message-----
> From: "Oren Benjamin" <oren@clearspring.com>
> Sent: Monday, July 19, 2010 10:30am
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Cassandra benchmarking on Rackspace Cloud
> 
> Certainly I'm using multiple cloud servers for the multiple client tests.  Whether or
> not they are resident on the same physical machine, I just don't know.
> 
>   -- Oren
> 
> On Jul 18, 2010, at 11:35 PM, Brandon Williams wrote:
> 
> On Sun, Jul 18, 2010 at 8:45 PM, Oren Benjamin <oren@clearspring.com> wrote:
> Thanks for the info.  Very helpful in validating what I've been seeing.  As for the scaling
> limit...
> 
>>> The above was single node testing.  I'd expect to be able to add nodes and scale
>>> throughput.  Unfortunately, I seem to be running into a cap of 21,000 reads/s regardless of
>>> the number of nodes in the cluster.
>> 
>> This is what I would expect if a single machine is handling all the
>> Thrift requests.  Are you spreading the client connections to all the
>> machines?
> 
> Yes - in all tests I add all nodes in the cluster to the --nodes list.  The client requests
> are in fact being dispersed among all the nodes, as evidenced by the intermittent
> TimedOutExceptions in the log, which show up against the various nodes in the input list.
> Could it be a result of all the virtual nodes being hosted on the same physical hardware?
> Am I running into some connection limit?  I don't see anything pegged in the JMX stats.
> 
> It's unclear if you're using multiple client machines for stress.py or not; a limitation
> of 24k/21k reads/s for a single quad-proc client machine is normal in my experience.
> 
> -Brandon
> 
> 
> 



