cassandra-user mailing list archives

From Michael Kjellman <>
Subject Re: counters + replication = awful performance?
Date Tue, 27 Nov 2012 18:02:42 GMT
Are you writing with QUORUM consistency or ONE?
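
For context, here is a minimal sketch of why the consistency level matters at replication_factor=2. It is plain arithmetic, not Cassandra client code, and the function name required_acks is hypothetical. QUORUM means floor(RF/2)+1 replicas, so with RF=2 a QUORUM write has to wait for both nodes, while ONE waits for a single replica.

```python
def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Replica acknowledgements a write waits for (illustrative helper only)."""
    level = consistency_level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        # Majority of replicas: floor(RF / 2) + 1
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"RF={rf}: ONE={required_acks('ONE', rf)}, "
              f"QUORUM={required_acks('QUORUM', rf)}")
```

With two replicas, QUORUM and ALL are the same thing, so every increment would be gated on the slower of the two nodes.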

On 11/27/12 9:52 AM, "Sergey Olefir" <> wrote:

>Hi Juan,
>thanks for your input!
>In my case, however, I doubt this is the problem -- clients are able to push
>many more updates than I need to saturate the replication_factor=2 case (e.g.
>I'm doing as many as 6x more increments when testing a 2-node cluster with
>replication_factor=1), so bandwidth between clients and server should be
>sufficient. Bandwidth between nodes in the cluster should also be quite
>sufficient, as they are both in the same DC. But it is something to check,
>thanks!
>Best regards,
>Juan Valencia wrote
>> Hi Sergey,
>> I know I've had similar issues with counters which were bottlenecked by
>> network throughput.  You might be seeing a problem with throughput between
>> the clients and Cass or between the two Cass nodes.  It might not be your
>> case, but that was what happened to me :-)
>> Juan
>> On Tue, Nov 27, 2012 at 8:48 AM, Sergey Olefir <solf.lists@> wrote:
>>> Hi,
>>> I have a serious problem with counter performance and I can't seem to
>>> figure it out.
>>> Basically I'm building a system for accumulating some statistics "on the
>>> fly" via Cassandra distributed counters. For this I need counter increments
>>> to work "really fast" and herein lies my problem -- as soon as I enable
>>> replication_factor = 2, the performance goes down the drain. This happens
>>> in my tests using both 1.0.x and 1.1.6.
>>> Let me elaborate:
>>> I have two boxes (virtual servers on top of physical servers rented
>>> specifically for this purpose, i.e. it's not a cloud, nor is it shared;
>>> virtual servers are managed by our admins as a way to limit damage, I
>>> suppose :)). Cassandra partitioner is set to ByteOrderedPartitioner
>>> because I want to be able to do some range queries.
>>> First, I set up Cassandra individually on each box (not in a cluster) to
>>> test counter increment performance (exclusively increments, no reads).
>>> For tests I use code that is intended to somewhat resemble the expected
>>> pattern -- in particular, the majority of increments create new counters,
>>> with some updating (adding to) already existing counters. In this test each
>>> single node exhibits respectable performance -- something on the order of
>>> 70k (seventy thousand) increments per second.
>>> I then join both of these nodes into a single cluster (using SimpleSnitch
>>> and SimpleStrategy, nothing fancy yet). I then run the same test using
>>> replication_factor=1. The performance is on the order of 120k increments
>>> per second -- which seems to be a reasonable increase over the single-node
>>> performance.
>>> HOWEVER I then rerun the same test on the two-node cluster using
>>> replication_factor=2 -- which is the least I'll need for actual
>>> production for redundancy purposes. And the performance I get is absolutely
>>> awful -- much, MUCH worse than even single-node performance -- something on
>>> the order of less than 25k increments per second. In addition to clients
>>> not being able to push updates fast enough, I also see a lot of 'messages
>>> dropped' messages in the Cassandra log under this load.
>>> Could anyone advise what could be causing such a drastic performance drop
>>> under replication_factor=2? I was expecting something on the order of
>>> single-node performance, not approximately 3x less.
>>> When testing replication_factor=2 on 1.1.6 I can see that CPU usage goes
>>> through the roof. On 1.0.x I think it looked more like disk overload, but
>>> I'm not sure (being on a virtual server I apparently can't see true
>>> iostats).
>>> I do have Cassandra data on a separate disk; commit log and cache are
>>> currently on the same disk as the system. I experimented with commit log
>>> flush modes and even with disabling the commit log altogether -- but it
>>> doesn't seem to have a noticeable impact on the performance under
>>> replication_factor=2.
>>> Any suggestions and hints will be much appreciated :) And please let me
>>> know if I need to share additional information about the configuration I'm
>>> running on.
>>> Best regards,
>>> Sergey
