Return-Path: Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: (qmail 71034 invoked from network); 7 Feb 2011 08:52:43 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 7 Feb 2011 08:52:43 -0000 Received: (qmail 68805 invoked by uid 500); 7 Feb 2011 08:52:41 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 68592 invoked by uid 500); 7 Feb 2011 08:52:38 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 68584 invoked by uid 99); 7 Feb 2011 08:52:38 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 07 Feb 2011 08:52:38 +0000 X-ASF-Spam-Status: No, hits=-0.7 required=5.0 tests=FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of patrik.modesto@gmail.com designates 209.85.216.44 as permitted sender) Received: from [209.85.216.44] (HELO mail-qw0-f44.google.com) (209.85.216.44) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 07 Feb 2011 08:52:31 +0000 Received: by qwi2 with SMTP id 2so3552609qwi.31 for ; Mon, 07 Feb 2011 00:52:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:from:date:message-id:subject:to :content-type; bh=F+HIl9bm5KoP/38qZoQY1vBwv+Up8K26usX7GOhGbu8=; b=wLEO3f0uwjzbAP2RbbLUw7vo5rr7BpRof5lyuNn0+lqXXoqyKlSJ4ub4PF6mdtEuGY N4+OFA0uFnGIzTn1x+kOVFnkOVA0DLnxjdFi96cXuY3sNya0LWawcdcUP6gsenxjAYIl 0HlUcfjARZU6dIh2k/plPUCAkyMpZTpXpP1NI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:from:date:message-id:subject:to:content-type; b=QYgtEvGV2mBrFJrZl+bvdxmqW0mpPfeEpOYSAApcbc2xHPRdpTrcMmN7vW42jCevJt UuvNvX0+ubQji7feVgMes6hYPpuVBNRTlgTVfvc8635xu6hnizL2ycRCHIJHr+LJrN0W T0f+jj8mPE0At2vWZrWXnWmudNwAIsh0cti7o= Received: by 10.229.100.76 with SMTP id x12mr12570751qcn.252.1297068730802; Mon, 07 Feb 2011 00:52:10 -0800 (PST) MIME-Version: 1.0 Received: by 10.229.184.12 with HTTP; Mon, 7 Feb 2011 00:51:49 -0800 (PST) From: Patrik Modesto Date: Mon, 7 Feb 2011 09:51:49 +0100 Message-ID: Subject: OOM during batch_mutate To: user@cassandra.apache.org Content-Type: text/plain; charset=ISO-8859-1 Hi all! I'm running into OOM problem during batch_mutate. I've a test cluster of two servers, 32GB and 16GB RAM, real HW. I've one keyspace and one CF with 1,4mil rows, each 10 columns. A row is around 5k in size. I run Hadoop MR task that reads one column and generates Mutation that updates another column, it writes about 1k string there. Strange thing is I can run this MR three times without any error on a freshly created/filled CF but the 4th time Cassandra on the 16GB server hits OOM. After I run 'nodetool compact' on the 16GB server I can run the MR again without any OOM one or two times. I'm quite clueless on what to change in Cassandra/CF settings. Can anyone help me? Regards, Patrik output.log: [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor30] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor34] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor20] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor10] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor32] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor19] [Unloading class sun.reflect.GeneratedMethodAccessor9] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor33] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor16] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor15] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor23] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor43] WARN 09:28:46,172 Dropped 2 MUTATION messages in the last 5000ms WARN 09:30:10,555 Dropped 7 REQUEST_RESPONSE messages in the last 5000ms INFO 09:30:10,556 Pool Name Active Pending INFO 09:30:10,556 ReadStage 0 0 INFO 09:30:10,556 RequestResponseStage 0 50 INFO 09:30:10,557 ReadRepair 0 0 INFO 09:30:10,557 MutationStage 16 26 INFO 09:30:10,557 GossipStage 1 8 INFO 09:30:10,558 AntiEntropyStage 0 0 INFO 09:30:10,558 MigrationStage 0 0 INFO 09:30:10,558 StreamStage 0 0 ERROR 09:30:10,561 Internal error processing batch_mutate java.lang.NoClassDefFoundError: Could not initialize class org.apache.cassandra.thrift.TimedOutException at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:417) at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:385) at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3036) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) ERROR 09:30:10,675 Fatal exception in thread Thread[MutationStage:14,5,main] java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2786) at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94) at java.io.DataOutputStream.write(DataOutputStream.java:90) at org.apache.cassandra.utils.FBUtilities.writeByteArray(FBUtilities.java:266) at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:56) at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:35) at org.apache.cassandra.db.ColumnFamilySerializer.serializeForSSTable(ColumnFamilySerializer.java:87) at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73) at org.apache.cassandra.db.RowMutationSerializer.freezeTheMaps(RowMutation.java:364) at org.apache.cassandra.db.RowMutationSerializer.serialize(RowMutation.java:375) at org.apache.cassandra.db.RowMutationSerializer.serialize(RowMutation.java:353) at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:274) at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:194) at org.apache.cassandra.service.StorageProxy$1.runMayThrow(StorageProxy.java:202) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) ERROR 09:30:10,675 Internal error processing batch_mutate java.lang.NoClassDefFoundError: Could not initialize class org.apache.cassandra.thrift.TimedOutException at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:417) at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:385) at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3036) at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555) at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) # grep GC output.log INFO 09:23:17,304 GC for ParNew: 208 ms, 214815512 reclaimed leaving 2926374208 used; max is 4394123264 INFO 09:23:28,440 GC for ParNew: 231 ms, 130295480 reclaimed leaving 1892691240 used; max is 4394123264 INFO 09:24:20,697 GC for ParNew: 283 ms, 242198536 reclaimed leaving 3463283016 used; max is 4394123264 INFO 09:24:23,017 GC for ParNew: 349 ms, 236482304 reclaimed leaving 3570011520 used; max is 4394123264 INFO 09:24:29,023 GC for ParNew: 210 ms, 237546632 reclaimed leaving 3706943376 used; max is 4394123264 INFO 09:24:30,025 GC for ParNew: 244 ms, 227230144 reclaimed leaving 3774754840 used; max is 4394123264 INFO 09:24:33,150 GC for ParNew: 308 ms, 224139024 reclaimed leaving 3864451360 used; max is 4394123264 INFO 09:24:39,344 GC for ConcurrentMarkSweep: 4367 ms, 346142816 reclaimed leaving 3823028040 used; max is 4394123264 INFO 09:24:51,853 GC for ConcurrentMarkSweep: 3545 ms, 371305240 reclaimed leaving 3899939616 used; max is 4394123264 INFO 09:24:55,550 GC for ConcurrentMarkSweep: 3370 ms, 225070424 reclaimed leaving 3993201552 used; max is 4394123264 INFO 09:24:58,735 GC for ConcurrentMarkSweep: 3058 ms, 202047008 reclaimed leaving 4078458520 used; max is 4394123264 INFO 09:25:02,120 GC for ConcurrentMarkSweep: 3227 ms, 165408344 reclaimed leaving 4115124104 used; max is 4394123264 INFO 09:25:05,238 GC for ConcurrentMarkSweep: 2970 ms, 123361296 reclaimed leaving 4157073152 used; max is 4394123264 INFO 09:25:08,364 GC for ConcurrentMarkSweep: 3040 ms, 98989656 reclaimed leaving 4181504664 used; max is 4394123264 INFO 09:25:11,329 GC for ConcurrentMarkSweep: 2878 ms, 86209304 reclaimed leaving 4194290064 used; max is 4394123264 INFO 09:25:14,502 GC for ConcurrentMarkSweep: 3049 ms, 74116080 reclaimed leaving 4206392488 used; max is 4394123264 INFO 09:25:17,115 GC for ConcurrentMarkSweep: 2486 ms, 69710048 reclaimed leaving 4210788272 used; max is 4394123264 INFO 09:25:19,302 GC for ConcurrentMarkSweep: 2084 ms, 63294216 reclaimed leaving 4217173136 used; max is 4394123264 INFO 09:25:21,894 GC for ConcurrentMarkSweep: 2506 ms, 59269376 reclaimed leaving 4221213776 used; max is 4394123264 INFO 09:25:25,138 GC for ConcurrentMarkSweep: 3112 ms, 54145144 reclaimed leaving 4226376024 used; max is 4394123264 INFO 09:25:28,128 GC for ConcurrentMarkSweep: 2922 ms, 44996224 reclaimed leaving 4235493792 used; max is 4394123264 INFO 09:25:32,351 GC for ConcurrentMarkSweep: 1900 ms, 40476296 reclaimed leaving 4240051776 used; max is 4394123264 INFO 09:25:38,424 GC for ConcurrentMarkSweep: 2893 ms, 40763344 reclaimed leaving 4239775792 used; max is 4394123264 INFO 09:25:41,470 GC for ConcurrentMarkSweep: 2897 ms, 36781832 reclaimed leaving 4243728632 used; max is 4394123264 INFO 09:25:44,532 GC for ConcurrentMarkSweep: 2881 ms, 32509720 reclaimed leaving 4248018432 used; max is 4394123264 INFO 09:25:50,557 GC for ConcurrentMarkSweep: 2805 ms, 20772088 reclaimed leaving 4259756616 used; max is 4394123264 INFO 09:26:03,058 GC for ConcurrentMarkSweep: 3014 ms, 16184824 reclaimed leaving 4264332256 used; max is 4394123264 INFO 09:26:09,304 GC for ConcurrentMarkSweep: 2893 ms, 14120928 reclaimed leaving 4266227984 used; max is 4394123264 INFO 09:26:20,919 GC for ConcurrentMarkSweep: 1971 ms, 13156344 reclaimed leaving 4267375248 used; max is 4394123264 INFO 09:30:14,111 GC for ConcurrentMarkSweep: 3187 ms, 85032848 reclaimed leaving 4195458368 used; max is 4394123264