cassandra-user mailing list archives

From "Flachbart, Dirk (HP Software - TransactionVision)" <dirk.flachb...@hp.com>
Subject RE: Question about insert performance in multiple node cluster
Date Mon, 28 Feb 2011 22:05:18 GMT
Replication factor is set to 1, and I'm using ConsistencyLevel.ANY. And yep, I tried doubling
the threads from 16 to 32 when running with the second server; it didn't make a difference.

Regarding the ring balancing - I assume it should be balanced. I'm using RandomPartitioner,
and the keys are generated by simply incrementing an Integer counter value, so they should
be spread fairly evenly across the two servers (at least that is my understanding based on
the Wiki documentation).
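
To illustrate the balancing assumption, here is a minimal sketch (not Cassandra's actual code) that hashes sequential integer keys with MD5 and counts how many land on each of two nodes. The class name TokenSpread, the token approximation (absolute value of the MD5 digest read as a signed BigInteger), and the node tokens 0 and 2^126 are all assumptions made for illustration; the real tokens would have to be read from the running ring.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class TokenSpread {
    // Assumed approximation of a RandomPartitioner token:
    // the MD5 digest of the key, read as a signed BigInteger, made non-negative.
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns the index of the owning node: the first node whose token is
    // >= the key token, wrapping around to node 0 if none is.
    // nodeTokens must be sorted in ascending order.
    static int owner(BigInteger keyToken, BigInteger[] nodeTokens) {
        for (int i = 0; i < nodeTokens.length; i++) {
            if (keyToken.compareTo(nodeTokens[i]) <= 0) {
                return i;
            }
        }
        return 0;
    }

    // Counts how many of the keys "0", "1", ..., numKeys-1 each node owns.
    static int[] countPerNode(int numKeys, BigInteger[] nodeTokens) {
        int[] counts = new int[nodeTokens.length];
        for (int i = 0; i < numKeys; i++) {
            counts[owner(token(Integer.toString(i)), nodeTokens)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical evenly spaced tokens for a two-node ring.
        BigInteger[] balanced = { BigInteger.ZERO, BigInteger.ONE.shiftLeft(126) };
        int[] counts = countPerNode(100000, balanced);
        System.out.println("node0=" + counts[0] + " node1=" + counts[1]);
    }
}
```

With evenly spaced node tokens the two counts should come out close to equal. If the node tokens are not evenly spaced, the same keys will land unevenly no matter how uniform the hashes are, so the key scheme alone does not guarantee a balanced ring.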


Regards,
Dirk


-----Original Message-----
From: Ryan King [mailto:ryan@twitter.com] 
Sent: Monday, February 28, 2011 12:30 PM
To: user@cassandra.apache.org
Cc: Flachbart, Dirk (HP Software - TransactionVision)
Subject: Re: Question about insert performance in multiple node cluster

On Mon, Feb 28, 2011 at 9:24 AM, Flachbart, Dirk (HP Software -
TransactionVision) <dirk.flachbart@hp.com> wrote:
> Hi,
>
>
>
> We are trying to use Cassandra for high-performance insertion of simple
> key/value records. I have set up Cassandra on two machines in my local
> network (Windows 2008 server), using pretty much the default configuration.
> I created a test driver in Java (using Thrift) which inserts a single 1K
> data column (keys are unique strings of integer values) with multiple
> threads. On each machine I am able to achieve around 9,000 inserts/sec when
> running the test driver with the local Cassandra server.
>
>
>
> Then I set up a cluster with both machines and ran the same test again (the
> test driver is still local to one of the Cassandra nodes). Surprisingly, I
> did not see any improvement in insert performance: I got the same 9,000
> inserts/sec as when running with a single node. I know that I shouldn't
> expect linear scaling to 18,000 operations/sec, but shouldn't I see at least
> some significant improvement? The CPU isn't fully loaded on either of the
> machines, and network utilization is low too (1000 Mbit network). Later
> on I also tested adding a third node, but that didn't improve anything
> either.
>
>
>
> I suspect I'm doing something wrong with setting up the cluster. The only
> changes I made on the second machine were:
>
> - AutoBootstrap=true
> - Setting 'Seed' to the IP of the other node
>
> Did I miss anything? Or am I simply wrong in expecting the throughput to
> scale when using multiple nodes?

What's your replication factor? Which consistency level are you using?
Is the ring evenly balanced? Did you double the number of client
threads when you added the second server?

-ryan
