Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D1AF7C8D3 for ; Wed, 30 May 2012 23:44:52 +0000 (UTC) Received: (qmail 62435 invoked by uid 500); 30 May 2012 23:44:50 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 62412 invoked by uid 500); 30 May 2012 23:44:50 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 62404 invoked by uid 99); 30 May 2012 23:44:50 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 30 May 2012 23:44:50 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy) Received: from [208.113.200.5] (HELO homiemail-a43.g.dreamhost.com) (208.113.200.5) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 30 May 2012 23:44:41 +0000 Received: from homiemail-a43.g.dreamhost.com (localhost [127.0.0.1]) by homiemail-a43.g.dreamhost.com (Postfix) with ESMTP id 778778C05F for ; Wed, 30 May 2012 16:44:19 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=thelastpickle.com; h=from :mime-version:content-type:subject:date:in-reply-to:to :references:message-id; q=dns; s=thelastpickle.com; b=OUwt0qOExS hsTsU6nEOjbyTT1MbbSwa+3MEOxfjIaKQBrIYl2TM6Wrs4ueY+6NKB9UVl3ViuJo ek6Cdm5ULaumD2QUD9m28ZF1Svj5XlgoWg92QoASR44PYu08h8THq3qZ4vIfRB6G dvM3NGg5yyNYBGb/3zwUA65FtnWXWIrvk= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=thelastpickle.com; h=from :mime-version:content-type:subject:date:in-reply-to:to :references:message-id; s=thelastpickle.com; bh=saJQm1BLIf3zKPhq f8WW5fAim8Y=; b=ktofwRQubHfXPpzhVjGTOrzinIFmHuVdaYHA3x1eNvPRpXIP OShebWmSNgSVRB+i+s/JBGSSklTR8kFD6r+es/cEmTbECr5haYIY2kbMbnJws+f8 VEfheSNLzexw0d4rbVvCOCmWCytD/VTbTA4aK2Q0H5HYjsHIlDAn4uiHrh4= Received: from [172.16.1.4] (unknown [203.86.207.101]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: aaron@thelastpickle.com) by homiemail-a43.g.dreamhost.com (Postfix) with ESMTPSA id DC8A28C05D for ; Wed, 30 May 2012 16:44:18 -0700 (PDT) From: aaron morton Mime-Version: 1.0 (Apple Message framework v1278) Content-Type: multipart/alternative; boundary="Apple-Mail=_9932D003-D2EE-401E-8004-B9F07AB43A78" Subject: Re: TimedOutException caused by "Stop the world" activity Date: Thu, 31 May 2012 11:44:16 +1200 In-Reply-To: To: user@cassandra.apache.org References: Message-Id: <67E0178B-E58C-4887-B84A-6BFC82C5A641@thelastpickle.com> X-Mailer: Apple Mail (2.1278) --Apple-Mail=_9932D003-D2EE-401E-8004-B9F07AB43A78 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=iso-8859-1 The cluster is running into GC problems and this is slowing it down = under the stress test. When it slows down one or more of the nodes is = failing to perform the write within rpc_timeout . This causes the = coordinator of the write to raise the TimedOutException.=20 You options are: * allocate more memory * ease back on the stress test.=20 * work as a CL QUORUM so that one node failing does result in the error.=20= see also = http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts Cheers =20 ----------------- Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 28/05/2012, at 12:59 PM, Jason Tang wrote: > Hi >=20 > My system is 4 nodes 64 bit cassandra cluster, 6G big per node,default = configuration (which means 1/3 heap for memtable), replicate number 3, = write all, read one. > When I run stress load testing, I got this TimedOutException, and some = operation failed, and all traffic hang for a while.=20 >=20 > And when I have 1G memory 32 bit cassandra on standalone model, I = didn't find so frequently "Stop the world" behavior. >=20 > So I wonder what kind of operation will hang the cassandra system.=20 >=20 > How to collect information for tuning. >=20 > =46rom the system log and document, I guess there are three type = operations: > 1) Flush memtable when meet max size > 2) Compact SSTable (why?) > 3) Java GC >=20 > system.log: > INFO [main] 2012-05-25 16:12:17,054 ColumnFamilyStore.java (line 688) = Enqueuing flush of Memtable-LocationInfo@1229893321(53/66 = serialized/live bytes, 2 ops) > INFO [FlushWriter:1] 2012-05-25 16:12:17,054 Memtable.java (line 239) = Writing Memtable-LocationInfo@1229893321(53/66 serialized/live bytes, 2 = ops) > INFO [FlushWriter:1] 2012-05-25 16:12:17,166 Memtable.java (line 275) = Completed flushing = /var/proclog/raw/cassandra/data/system/LocationInfo-hb-2-Data.db (163 = bytes) > ... >=20 > INFO [CompactionExecutor:441] 2012-05-28 08:02:55,345 = CompactionTask.java (line 112) Compacting = [SSTableReader(path=3D'/var/proclog/raw/cassandra/data/myks/queue-hb-41-Da= ta.db'), SSTableReader(path=3D'/var/proclog/raw/cassandra/data/ myks = /queue-hb-32-Data.db'), = SSTableReader(path=3D'/var/proclog/raw/cassandra/data/ myks = /queue-hb-37-Data.db'), = SSTableReader(path=3D'/var/proclog/raw/cassandra/data/ myks = /queue-hb-53-Data.db')] > ... >=20 > WARN [ScheduledTasks:1] 2012-05-28 08:02:26,619 GCInspector.java = (line 146) Heap is 0.7993011015621736 full. You may need to reduce = memtable and/or cache sizes. Cassandra will now flush up to the two = largest memtables to free up memory. Adjust flush_largest_memtables_at = threshold in cassandra.yaml if you don't want Cassandra to do this = automatically > INFO [ScheduledTasks:1] 2012-05-28 08:02:54,980 GCInspector.java = (line 123) GC for ConcurrentMarkSweep: 728 ms for 2 collections, = 3594946600 used; max is 6274678784 > INFO [ScheduledTasks:1] 2012-05-28 08:41:34,030 GCInspector.java = (line 123) GC for ParNew: 1668 ms for 1 collections, 4171503448 used; = max is 6274678784 > INFO [ScheduledTasks:1] 2012-05-28 08:41:48,978 GCInspector.java = (line 123) GC for ParNew: 1087 ms for 1 collections, 2623067496 used; = max is 6274678784 > INFO [ScheduledTasks:1] 2012-05-28 08:41:48,987 GCInspector.java = (line 123) GC for ConcurrentMarkSweep: 3198 ms for 3 collections, = 2623361280 used; max is 6274678784 >=20 >=20 > Timeout Exception: > Caused by: org.apache.cassandra.thrift.TimedOutException: null > at = org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.j= ava:19495) ~[na:na] > at = org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.j= ava:1035) ~[na:na] > at = org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1= 009) ~[na:na] > at = me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceSer= viceImpl.java:95) ~[na:na] > ... 64 common frames omitted >=20 > BRs > //Tang Weiqiang >=20 >=20 --Apple-Mail=_9932D003-D2EE-401E-8004-B9F07AB43A78 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=iso-8859-1 The = cluster is running into GC problems and this is slowing it down under = the stress test. When it slows down one or more of the nodes is failing = to perform the write within rpc_timeout . This causes the coordinator of = the write to raise the TimedOutException. 

You = options are:

* allocate more memory
* = ease back on the stress test. 
* work as a CL QUORUM so = that one node failing does result in the = error. 


Cheers
 

http://www.thelastpickle.com

On 28/05/2012, at 12:59 PM, Jason Tang wrote:

Hi

My system is 4 nodes 64 bit = cassandra cluster, 6G big per node,default configuration (which means = 1/3 heap for memtable), replicate number 3, write all, read = one.
When I run stress load testing, I got this = TimedOutException, and some operation failed, and all traffic hang for a = while. 

And when I have 1G memory 32 bit cassandra on = standalone model, I didn't find so frequently "Stop the world" = behavior.

So I wonder what kind of operation = will hang the cassandra system. 

How to collect information for = tuning.

=46rom the system log and document, = I guess there are three type operations:
1) Flush = memtable when meet max size
2) Compact SSTable (why?)
3) Java = GC

system.log:
 INFO [main] = 2012-05-25 16:12:17,054 ColumnFamilyStore.java (line 688) Enqueuing = flush of Memtable-LocationInfo@1229893321(53/66 serialized/live bytes, 2 = ops)
 INFO [FlushWriter:1] 2012-05-25 16:12:17,054 Memtable.java = (line 239) Writing Memtable-LocationInfo@1229893321(53/66 = serialized/live bytes, 2 ops)
 INFO [FlushWriter:1] = 2012-05-25 16:12:17,166 Memtable.java (line 275) Completed flushing = /var/proclog/raw/cassandra/data/system/LocationInfo-hb-2-Data.db (163 = bytes)
...

 INFO = [CompactionExecutor:441] 2012-05-28 08:02:55,345 CompactionTask.java = (line 112) Compacting = [SSTableReader(path=3D'/var/proclog/raw/cassandra/data/myks/queue-hb-41-Da= ta.db'), SSTableReader(path=3D'/var/proclog/raw/cassandra/data/ myks /queue-hb-32-Data.db'), = SSTableReader(path=3D'/var/proclog/raw/cassandra/data/ myks /queue-hb-37-Data.db'), = SSTableReader(path=3D'/var/proclog/raw/cassandra/data/ = myks /queue-hb-53-Data.db')]
...

=
 WARN [ScheduledTasks:1] 2012-05-28 08:02:26,619 = GCInspector.java (line 146) Heap is 0.7993011015621736 full.  You = may need to reduce memtable and/or cache sizes.  Cassandra will now = flush up to the two largest memtables to free up memory.  Adjust = flush_largest_memtables_at threshold in cassandra.yaml if you don't want = Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2012-05-28 08:02:54,980 = GCInspector.java (line 123) GC for ConcurrentMarkSweep: 728 ms for 2 = collections, 3594946600 used; max is 6274678784
 INFO = [ScheduledTasks:1] 2012-05-28 08:41:34,030 GCInspector.java (line 123) = GC for ParNew: 1668 ms for 1 collections, 4171503448 used; max is = 6274678784
 INFO [ScheduledTasks:1] 2012-05-28 08:41:48,978 = GCInspector.java (line 123) GC for ParNew: 1087 ms for 1 collections, = 2623067496 used; max is 6274678784
 INFO = [ScheduledTasks:1] 2012-05-28 08:41:48,987 GCInspector.java (line 123) = GC for ConcurrentMarkSweep: 3198 ms for 3 collections, 2623361280 used; = max is 6274678784


Timeout = Exception:
Caused by: = org.apache.cassandra.thrift.TimedOutException: null
  =       at = org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.j= ava:19495) ~[na:na]
        at = org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.j= ava:1035) ~[na:na]
        at = org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1= 009) ~[na:na]
        at = me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceSer= viceImpl.java:95) ~[na:na]
        ... 64 = common frames = omitted

BRs
//Tang = Weiqiang



= --Apple-Mail=_9932D003-D2EE-401E-8004-B9F07AB43A78--