cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: GC Exceptions and cluster nodes are dying
Date Wed, 01 Dec 2010 20:27:26 GMT
Running nodes with different JVM heap sizes is not recommended practice, for many reasons.
Nor would I recommend giving the JVM all the memory the machine has; it will just lead
to the OS swapping the JVM out to disk and considerably slow things down.

I would suggest a heap size of 1.5 or 2.0 GB for each node, and have a read of the JVM Heap
Size section here http://wiki.apache.org/cassandra/MemtableThresholds . AFAIK the logs are
showing your cluster was under heavy GC pressure. 
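
As a rough illustration of what "heavy GC pressure" looks like in the quoted log, here is a minimal Python sketch that pulls the pause time and heap occupancy out of the GCInspector lines. The regular expression, the function name summarize_gc and the sample variable are my own assumptions, based only on the line format shown in the log below; this is not a Cassandra tool.

```python
import re

# Matches GCInspector entries as they appear in the quoted log.
# \s+ tolerates the line wrapping introduced by the mail client.
# (Pattern is an assumption derived from the log lines in this thread.)
GC_LINE = re.compile(
    r"GC\s+for\s+(?P<collector>\w+):\s+(?P<pause_ms>\d+)\s+ms,\s+"
    r"(?P<reclaimed>\d+)\s+reclaimed\s+leaving\s+(?P<used>\d+)\s+used;\s+"
    r"max\s+is\s+(?P<max>\d+)"
)


def summarize_gc(log_text):
    """Return one dict per GC entry with pause time and heap occupancy."""
    entries = []
    for m in GC_LINE.finditer(log_text):
        used = int(m.group("used"))
        max_heap = int(m.group("max"))
        entries.append({
            "collector": m.group("collector"),
            "pause_ms": int(m.group("pause_ms")),
            "reclaimed_mb": int(m.group("reclaimed")) / 2**20,
            "heap_used_pct": 100.0 * used / max_heap,
        })
    return entries


# Two entries copied from the log in this thread, wrapping preserved.
sample = """INFO [GC inspection] 2010-12-01 10:40:29,141 GCInspector.java (line 129) GC
for ParNew: 750 ms, 14693208 reclaimed leaving 2140055192 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:47,843 GCInspector.java (line 129) GC
for ParNew: 801 ms, 26014296 reclaimed leaving 2474278880 used; max is
3355312128
"""

for e in summarize_gc(sample):
    print("%s pause=%d ms reclaimed=%.1f MB heap=%.1f%%" % (
        e["collector"], e["pause_ms"], e["reclaimed_mb"], e["heap_used_pct"]))
```

Run over the whole log, the thing to look for is young-generation (ParNew) pauses of several hundred milliseconds that each reclaim only a few tens of MB while heap occupancy keeps climbing.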

Finally, the ActiveCount error message was a known issue in beta 2. Treat yourself and try
RC1 :)
http://www.mail-archive.com/user@cassandra.apache.org/msg06298.html

Aaron



On 02 Dec, 2010, at 12:33 AM, asakka <amin.sakka@novapost.fr> wrote:


Hello,

I'm running some tests on a data model with 3 CFs and 1 SCF. I want to start
by inserting 1 million rows (my target is 1 billion rows).
I have a three-node cluster (identical machines with 3 GB of RAM
each, Intel Core 2 Duo 1.6 GHz), RF = 2, CL = 1. The heap size of the seed is 3 GB
(it was 1.5 GB; I doubled it to avoid the heap size exception I had), and the
other two nodes are at 1.5 GB.

I am using Cassandra (V0.7.0-beta2) and Hector (V0.7.0.18). I'm inserting
in batch mode using the Hector Mutator.
My disk_access_mode is standard.
I also reduced my memtable_throughput_in_mb to 64, but the problem persists
and I get the exception shown in the log below.
Is this a configuration problem or a hardware problem?

INFO [Timer-0] 2010-12-01 10:34:42,124 Gossiper.java (line 196) InetAddress
/10.0.100.215 is now dead.
INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,188 Gossiper.java (line 594) Node
/10.0.100.215 has restarted, now UP again
INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,189 StorageService.java (line
643) Node /10.0.100.215 state jump to normal
INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,189 StorageService.java (line
650) Will not change my token ownership to /10.0.100.215
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:34:44,189
HintedHandOffManager.java (line 196) Started hinted handoff for endpoint
/10.0.100.215
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:34:44,189
HintedHandOffManager.java (line 252) Finished hinted handoff of 0 rows to
endpoint /10.0.100.215
INFO [GC inspection] 2010-12-01 10:40:29,141 GCInspector.java (line 129) GC
for ParNew: 750 ms, 14693208 reclaimed leaving 2140055192 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:30,280 GCInspector.java (line 129) GC
for ParNew: 445 ms, 17042288 reclaimed leaving 2178211008 used; max is
3355312128
INFO [WRITE-/10.0.100.214] 2010-12-01 10:40:31,552
OutboundTcpConnection.java (line 115) error writing to /10.0.100.214
INFO [GC inspection] 2010-12-01 10:40:32,280 GCInspector.java (line 129) GC
for ParNew: 211 ms, 25550568 reclaimed leaving 2235227312 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:34,320 GCInspector.java (line 129) GC
for ParNew: 290 ms, 26512896 reclaimed leaving 2277013184 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:35,950 GCInspector.java (line 129) GC
for ParNew: 506 ms, 24319976 reclaimed leaving 2303739672 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:37,202 GCInspector.java (line 129) GC
for ParNew: 462 ms, 31759008 reclaimed leaving 2306914712 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:42,629 GCInspector.java (line 129) GC
for ParNew: 445 ms, 14769312 reclaimed leaving 2327064920 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:43,969 GCInspector.java (line 129) GC
for ParNew: 720 ms, 14804208 reclaimed leaving 2366434112 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:45,372 GCInspector.java (line 129) GC
for ParNew: 325 ms, 23112128 reclaimed leaving 2421032952 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:47,843 GCInspector.java (line 129) GC
for ParNew: 801 ms, 26014296 reclaimed leaving 2474278880 used; max is
3355312128
INFO [Timer-0] 2010-12-01 10:41:18,451 Gossiper.java (line 196) InetAddress
/10.0.100.215 is now dead.
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:41:19,362
HintedHandOffManager.java (line 196) Started hinted handoff for endpoint
/10.0.100.215
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:41:19,975
HintedHandOffManager.java (line 252) Finished hinted handoff of 0 rows to
endpoint /10.0.100.215
INFO [GOSSIP_STAGE:1] 2010-12-01 10:41:19,506 Gossiper.java (line 580)
InetAddress /10.0.100.215 is now UP
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:28,873 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-20-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:28,952 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-148-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,053 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-7-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,163 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-12-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,274 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-13-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,407 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-13-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,513 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-17-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,545 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-7-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,577 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-146-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,776 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-2-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,784 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-145-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,882 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-19-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,883 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-147-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,884 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-14-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,886 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-5-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,887 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-7-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,887 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-8-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,888 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-16-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,917 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-3-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,918 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-6-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,918 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-15-<>
INFO [GC inspection] 2010-12-01 10:51:07,737 GCInspector.java (line 129) GC
for ParNew: 496 ms, 36105328 reclaimed leaving 274154728 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:51:43,742 GCInspector.java (line 129) GC
for ParNew: 3386 ms, 12099384 reclaimed leaving 297145056 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:51:45,487 GCInspector.java (line 150)
Pool Name Active Pending
INFO [GC inspection] 2010-12-01 10:51:45,706 GCInspector.java (line 156)
MIGRATION_STAGE 0 0
INFO [GC inspection] 2010-12-01 10:51:45,716 GCInspector.java (line 156)
GOSSIP_STAGE 0 0
ERROR [GC inspection] 2010-12-01 10:51:46,313 AbstractCassandraDaemon.java
(line 88) Fatal exception in thread Thread[GC inspection,5,main]
java.lang.reflect.UndeclaredThrowableException
at $Proxy1.getActiveCount(Unknown Source)
at
org.apache.cassandra.service.GCInspector.logThreadPoolStats(GCInspector.java:156)
at
org.apache.cassandra.service.GCInspector.logIntervalGCStats(GCInspector.java:136)
at org.apache.cassandra.service.GCInspector.access$000(GCInspector.java:39)
at org.apache.cassandra.service.GCInspector$1.run(GCInspector.java:93)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Caused by: javax.management.AttributeNotFoundException: No such attribute:
ActiveCount
at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:63)
at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
at
com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
at
javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:263)
... 7 more

-- 
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/GC-Exceptions-and-cluster-nodes-are-dying-tp5791496p5791496.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.
