incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: how to debug slowdowns from these log snippets-more info
Date Wed, 20 Feb 2013 14:16:30 GMT
Here is the printout before that log which is probably important as wellŠ..

 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,375 GCInspector.java (line
122) GC for ConcurrentMarkSweep: 3618 ms for 2 collections, 7038159096
used; max is 8243904512
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,375 StatusLogger.java (line
57) Pool Name                    Active   Pending   Blocked
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,375 StatusLogger.java (line
72) ReadStage                        11       264         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,376 StatusLogger.java (line
72) RequestResponseStage              0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,376 StatusLogger.java (line
72) ReadRepairStage                   0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,376 StatusLogger.java (line
72) MutationStage                    12        88         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,376 StatusLogger.java (line
72) ReplicateOnWriteStage             0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,376 StatusLogger.java (line
72) GossipStage                       1         7         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,376 StatusLogger.java (line
72) AntiEntropyStage                  0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,377 StatusLogger.java (line
72) MigrationStage                    0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,377 StatusLogger.java (line
72) StreamStage                       0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,377 StatusLogger.java (line
72) MemtablePostFlusher               0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,377 StatusLogger.java (line
72) FlushWriter                       0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,377 StatusLogger.java (line
72) MiscStage                         0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,377 StatusLogger.java (line
72) commitlog_archiver                0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
72) InternalResponseStage             0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
72) HintedHandoff                     0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
77) CompactionManager                 4         5
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
89) MessagingService                n/a    10,127
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
99) Cache Type                     Size                 Capacity
    KeysToSave     
Provider
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
100) KeyCache                    1310719                  1310719
            all
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
106) RowCache                          0                        0
            all    
org.apache.cassandra.cache.SerializingCacheProvider
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,378 StatusLogger.java (line
113) ColumnFamily                Memtable ops,data
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.ServersThatLog                   0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.DecimalIndice             971,885311
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.EntityGroupXref                  0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.IntegerIndice            3256,620947
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.KeyToTableName                   0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.SdiTable                         0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.DboTableMeta               339,41537
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,379 StatusLogger.java (line
116) databus5.nreldata                4528,3261763
 INFO [GossipTasks:1] 2013-02-20 07:14:00,375 Gossiper.java (line 830)
InetAddress /10.10.43.5 is now dead.
 INFO [GossipTasks:1] 2013-02-20 07:14:00,380 Gossiper.java (line 830)
InetAddress /10.10.43.8 is now dead.
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,380 StatusLogger.java (line
116) databus5.DboColumnMeta             809,112672
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,380 StatusLogger.java (line
116) databus5.DboDatabaseMeta                  0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,381 StatusLogger.java (line
116) databus5.WebNodeDbo                 2310,4470
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,381 StatusLogger.java (line
116) databus5.User                             0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,381 StatusLogger.java (line
116) databus5.RoleMapping                      0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,381 StatusLogger.java (line
116) databus5.LogEvent                         0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,381 StatusLogger.java (line
116) databus5.SecureResourceGroupXref                 0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,382 StatusLogger.java (line
116) databus5.SecurityGroup                    0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,382 StatusLogger.java (line
116) databus5.StreamAggregation                 0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,382 StatusLogger.java (line
116) databus5.AppProperty                      0,0
 INFO [GossipTasks:1] 2013-02-20 07:14:00,382 Gossiper.java (line 830)
InetAddress /10.10.43.3 is now dead.
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,382 StatusLogger.java (line
116) databus5.StringIndice               290,27231
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,382 StatusLogger.java (line
116) databus5.MonitorDbo                   75,2983
 INFO [GossipTasks:1] 2013-02-20 07:14:00,382 Gossiper.java (line 830)
InetAddress /10.10.43.7 is now dead.
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,382 StatusLogger.java (line
116) databus5.SdiColumn                        0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.NodeIdInfo                         0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.IndexInfo                          0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.LocationInfo                       0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.Versions                           0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.schema_keyspaces                   0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.Migrations                         0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.schema_columnfamilies                 0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,383 StatusLogger.java (line
116) system.schema_columns                     0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,384 StatusLogger.java (line
116) system.HintsColumnFamily                  0,0
 INFO [ScheduledTasks:1] 2013-02-20 07:14:00,384 StatusLogger.java (line
116) system.Schema                             0,0
 WARN [ScheduledTasks:1] 2013-02-20 07:14:00,384 GCInspector.java (line
145) Heap is 0.8537409774403752 full.  You may need to reduce memtable
and/or cache sizes.  Cassandra will now flush up to the two largest
memtables to free up memory.  Adjust flush_largest_memtables_at threshold
in cassandra.yaml if you don't want Cassandra to do this automatically
 WARN [ScheduledTasks:1] 2013-02-20 07:14:00,384 StorageService.java (line
2855) Flushing CFS(Keyspace='databus5', ColumnFamily='nreldata') to
relieve memory pressure



On 2/20/13 7:10 AM, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:

>Cassandra version 1.1.4
>
>I captured all the logs of node causing timeouts (in a 6 node cluster).
>We seem to get these slowdowns every once in a while and it causes our
>whole website to be 10 times slower.  Since PlayOrm actually logs the
>rows being accessed we know exactly which row the timeout occurred on and
>asytanax logged the node which was nice.
>
>Each time we get this website being 10 times slower, this is in the logs
>
> WARN [ScheduledTasks:1] 2013-02-20 06:52:15,929 GCInspector.java (line
>145) Heap is 0.8532751700072093 full.  You may need to reduce memtable
>and/or cache sizes.  Cassandra will now flush up to the two largest
>memtables to free up memory.  Adjust flush_largest_memtables_at threshold
>in cassandra.yaml if you don't want Cassandra to do this automatically
> WARN [ScheduledTasks:1] 2013-02-20 06:52:15,929 StorageService.java
>(line 2855) Flushing CFS(Keyspace='databus5',
>ColumnFamily='IntegerIndice') to relieve memory pressure
>
>I seem to remember reading about we should turn off swap which I have not
>gotten around to at this point :(Šwill that help in that the node is
>taken out of the cluster or?  As you can see from top, this node has been
>running a 5,000,000 hours below(What's up with thatŠ.maybe the h does not
>actually mean hours) though I know it has been running for months.
>
>Here is a top commandŠ.
>
>top - 07:02:37 up 148 days, 21:11,  1 user,  load average: 1.10, 1.09,
>1.09
>Tasks: 379 total,   2 running, 377 sleeping,   0 stopped,   0 zombie
>Cpu(s): 15.9%us,  1.0%sy,  2.9%ni, 79.9%id,  0.1%wa,  0.0%hi,  0.2%si,
>0.0%st
>Mem:  32854680k total, 32582688k used,   271992k free,       44k buffers
>Swap: 33554424k total,    52180k used, 33502244k free, 22653528k cached
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  701 cassandr  20   0 63.3g  10g 1.9g S 838.4 32.2  5125794h java
>
>What can I dump to see why we need to "relieve memory pressure"?
>
>I ran jmap like so but it can't attach for some reason
>
>[cassandra@a4 ~]$ jmap 701 -heap:format=b
>Attaching to core -heap:format=b from executable 701, please wait...
>Error attaching to core file: Can't attach to the core file
>
>Jstack is not helping much either Š.
>
>[cassandra@a4 ~]$ jstack -l 701 > threads.txt
>701: Unable to open socket file: target process not responding or HotSpot
>VM not loaded
>The -F option can be used when the target process is not responding
>[cassandra@a4 ~]$ jstack -F -l 701 > threads.txt
>Attaching to process ID 701, please wait...
>Debugger attached successfully.
>Server compiler detected.
>JVM version is 20.7-b02
>java.lang.NullPointerException
>at 
>sun.jvm.hotspot.oops.InstanceKlass.computeSubtypeOf(InstanceKlass.java:426
>)
>at sun.jvm.hotspot.oops.Klass.isSubtypeOf(Klass.java:137)
>at sun.jvm.hotspot.oops.Oop.isA(Oop.java:100)
>at 
>sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:93)
>at 
>sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:39)
>at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:52)
>at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:45)
>at sun.jvm.hotspot.tools.JStack.run(JStack.java:60)
>at sun.jvm.hotspot.tools.Tool.start(Tool.java:221)
>at sun.jvm.hotspot.tools.JStack.main(JStack.java:86)
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>at 
>sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:
>39)
>at 
>sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
>pl.java:25)
>at java.lang.reflect.Method.invoke(Method.java:597)
>at sun.tools.jstack.JStack.runJStackTool(JStack.java:118)
>at sun.tools.jstack.JStack.main(JStack.java:84)
>
>Thanks,
>Dean


Mime
View raw message