On Mon, Aug 2, 2010 at 2:39 PM, Aaron Morton <aaron@thelastpickle.com> wrote:
You may need to provide some more information on how many reads your sending to the cluster. Also...

How many nodes do you have in the cluster ?

We have a cluster of 4 nodes.
When you are seeing high response times on one node, what's the load like on the others ?

They are low. Until recently, removing that node would improve performance, but as of today the problem appeared to just move to another node once the original faulty node was removed.
Is the data load evenly distributed around the cluster ?

No, it is not. It looks like this:

Address       Status     Load          Range                                      Ring
                                       153186065170709351569621555259205461067
              Up         60.6 GB       23543694856340775179323589033850348191     |<--|
              Up         58.67 GB      64044280785277646901574566535858757214     |   |
              Down       76.27 GB      145455238521487150744455174232451506694    |   |
              Up         67.45 GB      153186065170709351569621555259205461067    |-->|

The down node is the original culprit; once it was taken down, the problem moved to another node. Our setup uses RackAware with the 2 nodes in each switch. We tried to use 2 NICs, one for Thrift and one for gossip, but couldn't get that working, so now we just use one NIC for all traffic.
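For what it's worth, those tokens are not evenly spaced: assuming RandomPartitioner (token space 0 to 2^127), the node at 145455... owns by far the largest slice of the ring, and it is also the one carrying 76 GB and misbehaving. A minimal sketch of what evenly spaced initial tokens for 4 nodes would look like (the function name is mine, not a Cassandra API):

```python
# Sketch: evenly spaced initial tokens for RandomPartitioner.
# Token space is assumed to be 0 .. 2**127; only the node count varies.

def balanced_tokens(num_nodes, ring_size=2**127):
    """Return one evenly spaced token per node."""
    return [i * ring_size // num_nodes for i in range(num_nodes)]

for i, token in enumerate(balanced_tokens(4)):
    print("node %d: %d" % (i, token))
```

Comparing that output against the ring above shows how far the current token assignments drift from an even split, which would explain the uneven Load column independent of any hardware problem.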

Are your clients connecting to different nodes in the cluster ?

Yes, we use Pelops and have all 4 nodes in the pool.
Perhaps that node is somehow out of sync with the others...
I don't understand what you mean?
Anything odd happened in the cluster recently, such as one node going down ?
Yes, the node is a test server, so it has gone down to update JVM and storage-conf settings, but only for a short amount of time.
When was the last time you ran repair?
We just ran it today and it didn't make any difference. Almost immediately the ROW-READ-STAGE pending count goes over 4000.
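That pending count can be watched mechanically. A minimal sketch of flagging overloaded pools from `nodetool tpstats` output — the sample text below is made up for illustration, only the 4000 threshold and the pool names come from this thread, and the column layout ("Pool Name / Active / Pending / Completed") is an assumption about the tpstats format:

```python
# Sketch: flag thread pools whose Pending count exceeds a threshold,
# given captured `nodetool tpstats` text. Sample numbers are invented.

SAMPLE = """\
Pool Name                    Active   Pending      Completed
ROW-READ-STAGE                    8      4212         981723
MESSAGE-DESERIALIZER-POOL         1      5030        2317744
RESPONSE-STAGE                    0         0        1442781
"""

def overloaded_pools(tpstats_text, threshold=4000):
    """Return (pool, pending) pairs whose Pending exceeds threshold."""
    flagged = []
    for line in tpstats_text.splitlines()[1:]:   # skip the header row
        parts = line.split()
        pool, pending = parts[0], int(parts[-2])  # Pending is second-to-last
        if pending > threshold:
            flagged.append((pool, pending))
    return flagged

print(overloaded_pools(SAMPLE))
```

Run periodically, something like this would tell you whether the pending spike follows a particular node or a particular pool.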
Here's what iostat looks like:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.34    0.00    1.00   33.93    0.00   61.73

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb             335.50     0.00 70.50  0.00  3248.00     0.00    46.07     0.78   11.01   7.40  52.20
sdc             330.00     0.00 70.00  0.50  3180.00     4.00    45.16     2.10   29.65  13.09  92.30
sdd             310.00     0.00 93.50  0.50  3040.00     8.00    32.43    33.78  350.13  10.64 100.05
dm-0              0.00     0.00 97.00  0.50  9712.00     4.00    99.65    38.28  384.30  10.26 100.05

The data drives are 3 x 1 TB (sdb–sdd) striped using LVM RAID 0, with a 1 TB drive (sda) for the commit log.
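Reading the iostat sample: sdd and dm-0 are at 100% util with await (total latency) around 30x svctm (service time), meaning requests are mostly sitting in queue rather than being serviced — classic disk saturation, and one slow member drags the whole LVM stripe down. A small sketch of that reading, using the numbers pasted above (the ratio and util thresholds are my own rough rules of thumb, not anything official):

```python
# Sketch: spot queue-dominated devices in the iostat sample above.
# A device whose await greatly exceeds svctm while %util is pegged is
# spending most of its latency queued, not servicing requests.

devices = {
    # name: (await_ms, svctm_ms, util_pct) -- values from the iostat paste
    "sdb":  (11.01,   7.40,  52.20),
    "sdc":  (29.65,  13.09,  92.30),
    "sdd":  (350.13, 10.64, 100.05),
    "dm-0": (384.30, 10.26, 100.05),
}

def queue_dominated(devs, ratio=3.0, util_floor=90.0):
    """Devices where queueing dominates service time and util is pegged."""
    return [name for name, (aw, sv, ut) in devs.items()
            if ut >= util_floor and aw / sv >= ratio]

print(sorted(queue_dominated(devices)))  # -> ['dm-0', 'sdd']
```

Note sdc is busy (92%) but its await/svctm ratio is still modest; sdd is the outlier, which is worth checking for hardware trouble before blaming Cassandra.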

Is there anything else I can provide that might help diagnose this?



On 03 Aug, 2010, at 06:47 AM, Artie Copeland <yeslinux.now@gmail.com> wrote:

I have a question on what the signs from Cassandra are that new nodes should be added to the cluster. We are currently seeing long read times from the one node that has about 70 GB of data, with 60 GB in one column family. We are using a replication factor of 3. I have tracked the slowness down to times when either ROW-READ-STAGE or MESSAGE-DESERIALIZER-POOL is high, at least 4000. My systems are 16-core, 3 TB, 48 GB memory servers. We would like to be able to use more of the server than just 70 GB.

The system is a realtime system that needs to scale quite large. Our current heap size is 25 GB and we are getting at least 50% row cache hit rates. Does it seem strange that Cassandra is not able to handle the workload? We perform multislice gets when reading, similar to what twissandra does; this is to cut down on network ops. Looking at iostat, it doesn't appear to have a lot of queued reads.
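The point of the multislice/multiget pattern is just round-trip arithmetic: fetching N rows one call at a time costs N network round trips, while batching them costs ceil(N / batch). A trivial sketch of that saving (the function is illustrative, not a client API):

```python
# Sketch: round trips for single gets vs. batched multiget-style reads.
import math

def round_trips(num_rows, batch_size):
    """Network round trips needed to fetch num_rows in batches."""
    return math.ceil(num_rows / batch_size)

print(round_trips(100, 1))   # one get per row  -> 100
print(round_trips(100, 20))  # multiget batches -> 5
```

The trade-off is that each batched call does more work on a single coordinator node, which matters when individual nodes are already running hot, as in the thread above.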

What are others seeing when they have to add new nodes? What data sizes are they seeing? We need this so we can plan our growth and server-purchase strategy.