We had issues where the flushes of several column families would line up and block writes for a very brief period. If a burst of writes came in at that moment, we'd see a spike in dropped mutations.
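One way to see whether blocked flushes coincide with dropped mutations is to sample both counters over time. Below is a rough Python sketch, not a definitive tool. It assumes the `nodetool tpstats` output has a thread-pool table with six whitespace-separated columns (pool name, Active, Pending, Completed, Blocked, All time blocked) followed by a "Message type / Dropped" table; the function names `parse_tpstats` and `read_tpstats` are made up for illustration.

```python
import subprocess


def parse_tpstats(text):
    """Extract two counters from `nodetool tpstats` output.

    Returns (flushwriter_all_time_blocked, mutation_dropped); either value
    is None if the corresponding row is not found. The column layout is an
    assumption and may differ between Cassandra versions.
    """
    flush_blocked = None
    mutation_dropped = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Thread-pool row: name, active, pending, completed, blocked, all time blocked
        if fields[0] == "FlushWriter" and len(fields) == 6:
            flush_blocked = int(fields[5])
        # Dropped-message row: message type, dropped count
        elif fields[0] == "MUTATION" and len(fields) == 2:
            mutation_dropped = int(fields[1])
    return flush_blocked, mutation_dropped


def read_tpstats():
    """Run `nodetool tpstats` on the local node and return its output as text."""
    return subprocess.check_output(["nodetool", "tpstats"]).decode()
```

Calling `parse_tpstats(read_tpstats())` from a cron job or monitoring agent and graphing the two numbers should show whether the counters jump at the same time as the client timeouts.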
Check nodetool tpstats for FlushWriter all time blocked.

On Thu, Dec 19, 2013 at 7:12 AM, Alexander Shutyaev <email@example.com> wrote:
Hi all!

We've had a problem with Cassandra recently. We had two one-minute periods when we got a lot of timeouts on the client side (the only timeouts during the 9 days we have been using Cassandra in production). In the logs we found corresponding messages saying something about MUTATION messages dropped.

Now, the official FAQ says that this is an indicator that the load is too high. We checked our monitoring and found that the 1-minute average CPU load had a local peak at the time of the problem, but it was about 0.8 against the usual 0.2, which I guess is nothing for a 2-core virtual machine. We also checked Java threads: there was no peak there and their count was a reasonable ~240-250.

Can anyone give us a hint: what should we monitor to see this "high load", and what should we tune to make it acceptable?

Thanks in advance,
Alexander
Ken Hancock | System Architect, Advanced Advertising
50 Nagog Park
Acton, Massachusetts 01720
firstname.lastname@example.org | www.schange.com | NASDAQ:SEAC
Office: +1 (978) 889-3329 | email@example.com