incubator-cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: (better info)any way to get the #writes/second, reads per second
Date Tue, 14 May 2013 19:01:12 GMT
Yes, we finally got to the bottom of it.  There was some code in our web tier, so although our
client load had not changed, the web tier was reading a lot more than usual.  It was still a
good experience to debug something like this and get to the bottom of it.  We always seem to
be learning a new corner of the system.

It just happened that someone started using a new feature that, due to a bug, slammed our
servers right when we added our first node to the cluster.  We will be adding the second node
tomorrow, as things look great on version 1.2.2.


From: aaron morton <<>>
Reply-To: "<>" <<>>
Date: Tuesday, May 14, 2013 12:44 PM
To: "<>" <<>>
Subject: Re: (better info)any way to get the #writes/second, reads per second

Any reason why cassandra might be reading a lot more than usual from the data disks (not
the commit log disk)?
On the new node or all nodes ?

Maybe cold Key Cache or cold memmapped files due to a change in the data distribution ?

Did it settle down ?


Aaron Morton
Freelance Cassandra Consultant
New Zealand


On 14/05/2013, at 5:06 AM, "Hiller, Dean" <<>> wrote:

Ah, okay, iostat -x NEEDS an interval number; "iostat -x 5" works better (the first
report always shows 4% util while the second shows 100%).  iotop seems a bit
better here.
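A rough sketch of why the interval matters: the first iostat report averages over the entire uptime, while interval reports average only over the last window. The numbers below are made-up illustrations, assuming a cumulative device-busy-time counter like the one iostat derives %util from:

```python
# Why "iostat -x" alone misleads: %util is busy time divided by the
# measurement window, and the first report's window is the whole uptime.

def pct_util(busy_ms_start, busy_ms_end, window_ms):
    """Fraction of the window the device spent busy, as a percentage."""
    return 100.0 * (busy_ms_end - busy_ms_start) / window_ms

# Since boot: 1 hour uptime, disk busy ~2.4 minutes total -> looks idle.
since_boot = pct_util(0, 144_000, 3_600_000)         # ~4% util

# Last 5 seconds: disk busy for the entire window -> saturated.
last_window = pct_util(3_595_000, 3_600_000, 5_000)  # 100% util
```

So a disk that is currently pegged can still report single-digit %util in the first, since-boot report.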

So we know that since we added our new node, we are slammed with reads, and
no one is running compactions according to "clush -g datanodes nodetool

Any reason why cassandra might be reading a lot more than usual from the data disks (not
the commit log disk)?


On 5/13/13 10:46 AM, "Hiller, Dean" <<>> wrote:

We are running a pretty consistent load on our cluster and added a new node
to a 6-node cluster on Friday (QA worked great, but production not so much).
One mistake that was made was starting up the new node and then disabling
the firewall :( which allowed the other nodes to discover it BEFORE the node
had bootstrapped itself.  We shut the node down and brought it back up, and it
bootstrapped itself, streaming all the data in.

After that, though, all the nodes have really, really high load numbers
now.  We are still trying to figure out what is going on.

Is there any way to get the number of reads/second and writes/second
through JMX or something?  The only way I can see of doing this is
calculating it manually: sampling the read count and dividing the delta
by the elapsed time on my manual stopwatch.
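That manual sample-and-divide approach can be sketched as below. Cassandra exposes cumulative ReadCount/WriteCount counters per column family over JMX (also surfaced by "nodetool cfstats"); the exact MBean names vary by version, and the sample numbers here are hypothetical:

```python
# Rate from a cumulative counter: sample it twice and divide the delta
# in the counter by the delta in wall-clock time.

def ops_per_second(sample1, sample2):
    """Each sample is (cumulative_op_count, unix_timestamp_seconds)."""
    count1, t1 = sample1
    count2, t2 = sample2
    return (count2 - count1) / (t2 - t1)

# e.g. two ReadCount samples taken 10 seconds apart:
reads_per_sec = ops_per_second((120_000, 1000.0), (121_500, 1010.0))  # 150.0
```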

Also, while my load average is 20.31, 19.10, 19.72, what does a
normal iostat look like?  My iostat await time is 13.66 ms, which I think
is kind of bad, but not bad enough to cause a load of 20.31?

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.02     0.07   11.70    1.96  1353.67   702.88   150.58     0.19   13.66   3.61   4.93
sdb               0.00     0.02    0.11    0.46    20.72    97.54   206.70     0.00    1.33   0.67   0.04
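For watching these numbers over time, a report like the one above can be parsed programmatically. This is a sketch that assumes sysstat's extended-format column layout, using the sample values from the report:

```python
# Pull per-device metrics out of captured "iostat -x" output:
# first line is the header, remaining lines are one device each.

SAMPLE = """\
Device:  rrqm/s  wrqm/s    r/s   w/s   rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.02    0.07  11.70  1.96  1353.67  702.88   150.58     0.19  13.66   3.61   4.93
sdb        0.00    0.02   0.11  0.46    20.72   97.54   206.70     0.00   1.33   0.67   0.04
"""

def parse_iostat(text):
    lines = text.strip().splitlines()
    header = lines[0].split()[1:]          # drop the "Device:" label
    stats = {}
    for line in lines[1:]:
        fields = line.split()
        # map each column name to its value for this device
        stats[fields[0]] = dict(zip(header, map(float, fields[1:])))
    return stats

stats = parse_iostat(SAMPLE)
# stats["sda"]["await"] -> 13.66, stats["sda"]["%util"] -> 4.93
```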

