Cut your memtable thresholds (throughput and ops) in half.  See "describe keyspaces" and "update keyspace" in the cli.
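
A rough sketch of why halving helps, assuming the rule of thumb I recall from the MemtableThresholds wiki page linked further down the thread: heap needed is roughly memtable_throughput_in_mb * 3 * (number of hot column families) + 1 GB, plus internal caches. The function name and defaults below are illustrative only, and caches are ignored; the 64 MB and 4-CF figures come from the quoted messages.

```python
def estimate_heap_mb(throughput_mb, hot_cfs, overhead_factor=3, base_mb=1024):
    """Rule-of-thumb heap estimate in MB (assumption: wiki formula, caches ignored)."""
    return throughput_mb * overhead_factor * hot_cfs + base_mb

# Settings reported in the thread: 64 MB throughput, 4 column families written at once.
current = estimate_heap_mb(64, 4)
# Same workload with the throughput threshold cut in half.
halved = estimate_heap_mb(32, 4)

print(f"current settings: ~{current} MB, halved: ~{halved} MB")
```

Under these assumptions the current settings leave very little headroom in a 2 GB heap, while the halved threshold frees a few hundred MB.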

On Wed, Dec 8, 2010 at 9:18 AM, Amin Sakka, Novapost <amin.sakka@novapost.fr> wrote:
Thanks for your answer Aaron,

I'm now on RC1 and no longer get the ActiveCount error; however, my nodes are still dying under bulk insertion.

I have modified my nodes' configuration (all of them now have a 2GB heap size). The nodes are still under heavy pressure and they die after a random amount of time (sometimes after 10 minutes of insertion
and sometimes after 50 minutes).
I should point out that I'm inserting rows into 4 different column families at the same time, and that the rows are small (a few kilobytes each).
I've attached here my cassandra.yaml configuration file.
Could you please help me solve this issue?

Thanks!

Here is some of my log output: 

DEBUG [MutationStage:27] 2010-12-08 15:31:19,214 RowMutationVerbHandler.java (line 78) RowMutation(keyspace='SAE', key='6163636f756e743936353a726566393230', modifications=[ColumnFamily(Document [6465736372697074696f6e:false:49@1291818633293000,646f63756d656e744964:false:36@1291818633293000,7265666572656e6365:false:17@1291818633293000,])]) applied.  Sending response to 714546@/10.0.100.94
DEBUG [MutationStage:7] 2010-12-08 15:31:19,214 RowMutationVerbHandler.java (line 54) Applying RowMutation(keyspace='SAE', key='6163636f756e743938373a726566313835', modifications=[ColumnFamily(Document [6465736372697074696f6e:false:49@1291818633524000,646f63756d656e744964:false:36@1291818633524000,7265666572656e6365:false:17@1291818633524000,])])
DEBUG [MutationStage:7] 2010-12-08 15:31:19,214 Table.java (line 378) applying mutation of row 6163636f756e743938373a726566313835
DEBUG [MutationStage:7] 2010-12-08 15:31:19,214 RowMutationVerbHandler.java (line 78) RowMutation(keyspace='SAE', key='6163636f756e743938373a726566313835', modifications=[ColumnFamily(Document [6465736372697074696f6e:false:49@1291818633524000,646f63756d656e744964:false:36@1291818633524000,7265666572656e6365:false:17@1291818633524000,])]) applied.  Sending response to 714547@/10.0.100.94
DEBUG [ScheduledTasks:1] 2010-12-08 15:31:19,373 GCInspector.java (line 135) GC for ParNew: 13 ms, 34456296 reclaimed leaving 1465917704 used; max is 2256404480
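
To follow heap pressure over time from GCInspector lines like the last one above, something like this Python sketch could pull out the used and max heap figures. The regex is written only against the message format visible in this log, and the names are illustrative; lines that were hard-wrapped by the mail client would need to be rejoined first.

```python
import re

# Matches the GCInspector message body as it appears in this thread's logs.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms, "
    r"(?P<reclaimed>\d+) reclaimed leaving (?P<used>\d+) used; "
    r"max is (?P<max>\d+)"
)

def heap_fraction(line):
    """Return used/max heap as a float, or None if the line is not a GC report."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    return int(m.group("used")) / int(m.group("max"))

sample = (
    "DEBUG [ScheduledTasks:1] 2010-12-08 15:31:19,373 GCInspector.java "
    "(line 135) GC for ParNew: 13 ms, 34456296 reclaimed leaving "
    "1465917704 used; max is 2256404480"
)

frac = heap_fraction(sample)
print(f"{frac:.0%} of max heap still in use after the ParNew collection")
```

A fraction that keeps climbing between collections, with little reclaimed each time, is the pattern to watch for.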





2010/12/1 Aaron Morton <aaron@thelastpickle.com>

Running nodes with different JVM heap sizes is not recommended practice, for many reasons. Nor would I recommend giving the JVM all the memory the machine has; that will just lead to the OS swapping the JVM out to disk and slow things down considerably.

I would suggest a heap size of 1.5 or 2.0 GB for each node, and have a read of the JVM Heap Size section here http://wiki.apache.org/cassandra/MemtableThresholds . AFAIK the logs are showing your cluster was under heavy GC pressure. 

Finally, the ActiveCount error message was a known issue in beta 2. Treat yourself and try RC1 :)

Aaron



On 02 Dec, 2010, at 12:33 AM, asakka <amin.sakka@novapost.fr> wrote:


Hello,

I'm running some tests on a data model with 3 CFs and 1 SCF. I want to start
by inserting 1 million rows (my target is to have 1 billion rows).
I have a three-node cluster (all the machines are identical, with 3 GB of RAM
each and an Intel Core 2 Duo at 1.6 GHz), RF = 2, CL = 1. The heap size of the
seed is 3 GB (it was 1.5 GB; I doubled it to avoid the heap size exception I
had), and the other two nodes have 1.5 GB.

I am using Cassandra (V0.7.0-beta2) and Hector (V0.7.0.18). I'm inserting
in batch mode using the Hector Mutator.
My disk_access_mode is standard.
I also reduced my memtable_throughput_in_mb to 64, but the problem persists.
I want to know whether this is a configuration or a hardware problem.
I get the following exception:

INFO [Timer-0] 2010-12-01 10:34:42,124 Gossiper.java (line 196) InetAddress
/10.0.100.215 is now dead.
INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,188 Gossiper.java (line 594) Node
/10.0.100.215 has restarted, now UP again
INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,189 StorageService.java (line
643) Node /10.0.100.215 state jump to normal
INFO [GOSSIP_STAGE:1] 2010-12-01 10:34:44,189 StorageService.java (line
650) Will not change my token ownership to /10.0.100.215
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:34:44,189
HintedHandOffManager.java (line 196) Started hinted handoff for endpoint
/10.0.100.215
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:34:44,189
HintedHandOffManager.java (line 252) Finished hinted handoff of 0 rows to
endpoint /10.0.100.215
INFO [GC inspection] 2010-12-01 10:40:29,141 GCInspector.java (line 129) GC
for ParNew: 750 ms, 14693208 reclaimed leaving 2140055192 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:30,280 GCInspector.java (line 129) GC
for ParNew: 445 ms, 17042288 reclaimed leaving 2178211008 used; max is
3355312128
INFO [WRITE-/10.0.100.214] 2010-12-01 10:40:31,552
OutboundTcpConnection.java (line 115) error writing to /10.0.100.214
INFO [GC inspection] 2010-12-01 10:40:32,280 GCInspector.java (line 129) GC
for ParNew: 211 ms, 25550568 reclaimed leaving 2235227312 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:34,320 GCInspector.java (line 129) GC
for ParNew: 290 ms, 26512896 reclaimed leaving 2277013184 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:35,950 GCInspector.java (line 129) GC
for ParNew: 506 ms, 24319976 reclaimed leaving 2303739672 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:37,202 GCInspector.java (line 129) GC
for ParNew: 462 ms, 31759008 reclaimed leaving 2306914712 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:42,629 GCInspector.java (line 129) GC
for ParNew: 445 ms, 14769312 reclaimed leaving 2327064920 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:43,969 GCInspector.java (line 129) GC
for ParNew: 720 ms, 14804208 reclaimed leaving 2366434112 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:45,372 GCInspector.java (line 129) GC
for ParNew: 325 ms, 23112128 reclaimed leaving 2421032952 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:40:47,843 GCInspector.java (line 129) GC
for ParNew: 801 ms, 26014296 reclaimed leaving 2474278880 used; max is
3355312128
INFO [Timer-0] 2010-12-01 10:41:18,451 Gossiper.java (line 196) InetAddress
/10.0.100.215 is now dead.
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:41:19,362
HintedHandOffManager.java (line 196) Started hinted handoff for endpoint
/10.0.100.215
INFO [HINTED-HANDOFF-POOL:1] 2010-12-01 10:41:19,975
HintedHandOffManager.java (line 252) Finished hinted handoff of 0 rows to
endpoint /10.0.100.215
INFO [GOSSIP_STAGE:1] 2010-12-01 10:41:19,506 Gossiper.java (line 580)
InetAddress /10.0.100.215 is now UP
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:28,873 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-20-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:28,952 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-148-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,053 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-7-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,163 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-12-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,274 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-13-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,407 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-13-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,513 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-17-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,545 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-7-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,577 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-146-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,776 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-2-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,784 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-145-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,882 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Document-e-19-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,883 SSTable.java (line
145) Deleted /var/lib/cassandra/data/system/LocationInfo-e-147-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,884 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-14-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,886 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-5-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,887 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-7-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,887 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-8-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,888 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-16-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,917 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-3-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,918 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/DocumentByFolder-e-6-<>
INFO [SSTABLE-CLEANUP-TIMER] 2010-12-01 10:41:29,918 SSTable.java (line
145) Deleted /var/lib/cassandra/data/SAE4/Account-e-15-<>
INFO [GC inspection] 2010-12-01 10:51:07,737 GCInspector.java (line 129) GC
for ParNew: 496 ms, 36105328 reclaimed leaving 274154728 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:51:43,742 GCInspector.java (line 129) GC
for ParNew: 3386 ms, 12099384 reclaimed leaving 297145056 used; max is
3355312128
INFO [GC inspection] 2010-12-01 10:51:45,487 GCInspector.java (line 150)
Pool Name Active Pending
INFO [GC inspection] 2010-12-01 10:51:45,706 GCInspector.java (line 156)
MIGRATION_STAGE 0 0
INFO [GC inspection] 2010-12-01 10:51:45,716 GCInspector.java (line 156)
GOSSIP_STAGE 0 0
ERROR [GC inspection] 2010-12-01 10:51:46,313 AbstractCassandraDaemon.java
(line 88) Fatal exception in thread Thread[GC inspection,5,main]
java.lang.reflect.UndeclaredThrowableException
at $Proxy1.getActiveCount(Unknown Source)
at
org.apache.cassandra.service.GCInspector.logThreadPoolStats(GCInspector.java:156)
at
org.apache.cassandra.service.GCInspector.logIntervalGCStats(GCInspector.java:136)
at org.apache.cassandra.service.GCInspector.access$000(GCInspector.java:39)
at org.apache.cassandra.service.GCInspector$1.run(GCInspector.java:93)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
Caused by: javax.management.AttributeNotFoundException: No such attribute:
ActiveCount
at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:63)
at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
at
com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
at
javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:263)



--

Amin SAKKA
Research and Development Engineer

32 rue de Paradis, 75010 Paris
Tel: +33 (0)6 34 14 19 25
Mail: amin.sakka@novapost.fr
Web: www.novapost.fr / www.novapost-rh.fr

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com