cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Why is mutation stage increasing ??
Date Wed, 05 Oct 2011 20:00:44 GMT
Sounds like there is a lot going on. 

I'm going to assume the order you showed the HH stats in is the order of the nodes, and I'm guessing node 180 is node 2, but it would be easier if you could identify the nodes and the stats that belong to each.

In no particular order:

* Having a heavily used MySQL box on one node is going to make your life *much* harder than
it needs to be. The HH stats look really suspicious: nodes 1 and 3 are running lots of HH and
node 2 is not. It looks like node 2 is having a hard time keeping up, and I would guess MySQL
may have something to do with that. Does the cassandra service on node 2 have the same memory
allocation as the other nodes? Were the TP stats below, the ones that showed the flush stage blocking,
from node 2? What's the %util and avgqu-sz from iostat on node 2 compared to the other nodes? (See the iostat example after these points.)


* Ignore all the metrics about what is happening on node 2 and get MySQL off there.

* You are running a 16GB heap, which is at the upper range (some would say above the upper
range) of what is effective for Java / Cassandra. 8GB seems to be a popular sweet spot. Work
out why you are using so much memory and then try to get the heap size down to 8GB (a cassandra-env.sh sketch is below).

* Try to understand what's happening with memory usage. Do you have lots of CFs? Lots of
row cache? Some very big rows? There are several features in 0.8.6 (see the yaml, and the snippet below) that
will kick in when memory usage is high, but you need to understand why it's high.
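
To compare disk load across the nodes, something like this on each box will do (iostat comes from the sysstat package); watch the avgqu-sz and %util columns:

    # extended per-device stats, refreshed every 5 seconds
    iostat -x 5

If %util sits near 100% or avgqu-sz stays high on node 2 only, the disk there is the bottleneck.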
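
For the heap, the debian packages read their JVM settings from /etc/cassandra/cassandra-env.sh. A minimal sketch, assuming the stock 0.8.x env file (which otherwise auto-calculates both values; they must be set or unset as a pair):

    # cap the heap at 8GB rather than the current 16GB
    MAX_HEAP_SIZE="8G"
    # young generation; the env file suggests roughly 100MB per core (6 cores here)
    HEAP_NEWSIZE="600M"

The node needs a restart to pick these up.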
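
The 0.8.6 emergency valves mentioned above live in cassandra.yaml; these should be the stock defaults, but verify against your own file:

    # flush the two largest memtables once the heap is this full
    flush_largest_memtables_at: 0.75
    # shrink the key/row caches once the heap is this full...
    reduce_cache_sizes_at: 0.85
    # ...down to this fraction of their current size
    reduce_cache_capacity_to: 0.6

These are safety valves rather than tuning knobs; if they fire regularly, the heap is undersized for the workload.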

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 5/10/2011, at 11:36 PM, Philippe wrote:

> Follow-up,
> I was mistaken in saying there weren't writes to the cluster. There's a process that's doing a couple of mutations per second.
> 
> I just restarted node #3 and found this message on node #1:
>  INFO [HintedHandoff:1] 2011-10-05 12:25:08,173 HintedHandOffManager.java (line 314) Endpoint /xx.xx.xx.180 died before hint delivery, aborting
> 
> Could HH have stuck the nodes on the receiving end? Is there any way to throttle this?
> If it can't be throttled and you confirm HH is a suspect, I may simply disable it; I'm running a repair 3 times a week (once per node), so I guess my cluster won't be too out of sync.
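
For what it's worth, both knobs exist in the 0.8.x cassandra.yaml (a sketch assuming the stock keys; check your own yaml before relying on them):

    # disable hinted handoff entirely...
    hinted_handoff_enabled: false
    # ...or keep it enabled and widen the pause between hint deliveries
    hinted_handoff_throttle_delay_in_ms: 100

Disabling it is reasonable when regular repairs already keep the replicas in sync, at the cost of relying on read repair and repair after an outage.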
> 
> Thanks
> 
> 2011/10/5 Philippe <watcherfr@gmail.com>
> Thanks for the quick responses.
> 
> @Yi
> Using Hector 0.8.0-1
> Hardware is:
> AMD Opteron 4174, 6x 2.30+ GHz
> 32 GB DDR3
> 1 Gbps lossless
> 
> @aaron
> I'm running 0.8.6 on all nodes, straight from the debian packages.
> I get hinted handoffs from time to time because of flapping; I've been planning to increase the phi threshold as per another thread but haven't yet.
> Here are the HH stats per node:
> HintedHandoff                     0         0            437         0                 0
> HintedHandoff                     0         0              2         0                 0
> HintedHandoff                     0         0           1798         0                 0
> Not seeing any iowait on my munin CPU graph, at least not more than over the past couple of weeks. There is a little more on the 2nd node because it's also holding a mysql database that gets hit hard.
> The munin iostat graph shows an average of 10-20 kiloblocks/s read/write.
> 
> Nothing special was happening besides a weekly repair on node two starting yesterday at 4am. That one failed with:
> ERROR [AntiEntropySessions:5] 2011-10-04 04:03:56,676 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
> java.lang.RuntimeException: java.io.IOException: Problem during repair session manual-repair-0d4125f0-aab7-403b-a083-7c19ef6579b1, endpoint /xx.xx.xx.180 died
> Then the next planned repairs failed before starting:
> INFO [AntiEntropySessions:8] 2011-10-04 04:03:57,137 AntiEntropyService.java (line 658) Could not proceed on repair because a neighbor (/xx.xx.xx.180) is dead: manual-repair-5dc4c221-3a15-4031-9aa8-0931e41816cd failed
> 
> Looking at the logs on that node shows no exceptions. And I was about to say "nothing special happening at that time", except that it looks like at 4am the GC started working hard: it got the heap down to 9GB, then the heap shot straight up to almost 16GB, so I guess ParNew couldn't keep up and ConcurrentMarkSweep had to step in and basically hang the server? It took another 2 minutes until I got the "Heap is 0.75 full" message, and I got a lot of StatusLogger messages before that.
> So it looks like computing the Merkle tree was very expensive this time... I wonder why? Anything I can do to handle this?
> 
> INFO [ScheduledTasks:1] 2011-10-04 04:03:12,874 GCInspector.java (line 122) GC for ParNew: 427 ms for 2 collections, 16227154488 used; max is 16838033408
>  INFO [GossipTasks:1] 2011-10-04 04:04:24,092 Gossiper.java (line 697) InetAddress /xx.xx.xx.97 is now dead.
>  INFO [ScheduledTasks:1] 2011-10-04 04:04:24,093 GCInspector.java (line 122) GC for ParNew: 26376 ms for 2 collections, 8832125416 used; max is 16838033408
> (no GC logs until)
>  INFO [ScheduledTasks:1] 2011-10-04 04:04:24,251 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 16250 ms for 3 collections, 9076209720 used; max is 16838033408
>  WARN [ScheduledTasks:1] 2011-10-04 04:06:52,777 GCInspector.java (line 143) Heap is 0.752707974197173 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
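
To see exactly what the collector was doing at 4am, beyond the GCInspector summaries above, full GC logging can be enabled through the JVM options in cassandra-env.sh. A sketch using standard HotSpot flags (the log path is illustrative):

    # log every collection with timestamps and pause details
    JVM_OPTS="$JVM_OPTS -verbose:gc"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"

A 26-second ParNew pause like the one above is far outside normal; it often points at swapping or a badly starved box, so swap activity on that node is worth checking too.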
> 
> 
> 
> 2011/10/5 aaron morton <aaron@thelastpickle.com>
> Lots of hinted handoff can give you mutations…
> 
>> HintedHandoff                     0         0           1798         0                 0
> 
> 
> 1798 is somewhat high. This is the HH tasks on this node though; can you see HH running on other nodes in the cluster? What has been happening on this node?
> 
> HH is throttled to avoid this sort of thing. What version are you on?
> 
> Also looks like the disk IO could not keep up with the flushing….
> 
> FlushWriter                       0         0           5714         0               499
>  
> You need to provide some more info on what was happening to the nodes beforehand. And check the logs on all machines for errors etc.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 5/10/2011, at 9:52 PM, Yi Yang wrote:
> 
>> Well, what client are you using? And can you give a hint about your node hardware?
>> Sent from my BlackBerry® wireless device
>> 
>> From: Philippe <watcherfr@gmail.com>
>> Date: Wed, 5 Oct 2011 10:33:21 +0200
>> To: user<user@cassandra.apache.org>
>> ReplyTo: user@cassandra.apache.org
>> Subject: Why is mutation stage increasing ??
>> 
>> Hello,
>> I have my 3-node, RF=3 cluster acting strangely. Can someone shed some light on what is going on?
>> It was stuck for a couple of hours (all clients timed out). nodetool tpstats showed a huge, increasing MutationStage pending count (in the hundreds of thousands).
>> I restarted one node and it took a while to replay GBs of commitlog. I've shut down all clients that write to the cluster and it's just weird:
>> 
>> All nodes are still showing huge MutationStage backlogs, including the restarted one, and the count is either increasing or stable. The active count is stuck at 32.
>> Compactionstats shows no compaction on 2 nodes and dozens of Scrub compactions (all at 100%) on the 3rd one. This is a scrub I did last week when I encountered assertion errors.
>> Netstats shows no streams being exchanged at any node but each one is expecting a few Responses.
>> 
>> Any ideas ?
>> Thanks
>> 
>> For example (the pending count increased to 567062 while I was writing this email):
>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>> ReadStage                         0         0    18372664517         0                 0
>> RequestResponseStage              0         0    10731370183         0                 0
>> MutationStage                    32    565879      295492216         0                 0
>> ReadRepairStage                   0         0          23654         0                 0
>> ReplicateOnWriteStage             0         0        7733659         0                 0
>> GossipStage                       0         0        3502922         0                 0
>> AntiEntropyStage                  0         0           1631         0                 0
>> MigrationStage                    0         0              0         0                 0
>> MemtablePostFlusher               0         0           5716         0                 0
>> StreamStage                       0         0             10         0                 0
>> FlushWriter                       0         0           5714         0               499
>> FILEUTILS-DELETE-POOL             0         0            773         0                 0
>> MiscStage                         0         0           1266         0                 0
>> FlushSorter                       0         0              0         0                 0
>> AntiEntropySessions               0         0             18         0                 0
>> InternalResponseStage             0         0              0         0                 0
>> HintedHandoff                     0         0           1798         0                 0
>> 
>> 
>> Mode: Normal
>> Not sending any streams.
>> Not receiving any streams.
>> Pool Name                    Active   Pending      Completed
>> Commands                        n/a         0     1223769753
>> Responses                       n/a         4     1627481305
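
One note on the tpstats output above: MutationStage showing exactly 32 active threads matches the default size of the mutation thread pool in 0.8.x, so the node is saturating its writers rather than hitting an arbitrary limit. The pool is sized by concurrent_writes in cassandra.yaml (32 should be the stock default):

    # threads servicing the mutation stage; 0.8.x ships with 32
    concurrent_writes: 32

Raising it rarely helps when disk or GC is the real bottleneck, as appears to be the case here.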
>> 
> 
> 
> 

