Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id D514AC859 for ; Tue, 24 Jul 2012 08:04:44 +0000 (UTC) Received: (qmail 85038 invoked by uid 500); 24 Jul 2012 08:04:42 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 84643 invoked by uid 500); 24 Jul 2012 08:04:39 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 84344 invoked by uid 99); 24 Jul 2012 08:04:37 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Jul 2012 08:04:37 +0000 X-ASF-Spam-Status: No, hits=-0.7 required=5.0 tests=FSL_RCVD_USER,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of nkovshov@yandex.ru designates 77.88.60.125 as permitted sender) Received: from [77.88.60.125] (HELO forward6.mail.yandex.net) (77.88.60.125) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 24 Jul 2012 08:04:31 +0000 Received: from web5e.yandex.ru (web5e.yandex.ru [77.88.60.74]) by forward6.mail.yandex.net (Yandex) with ESMTP id E0D2911229C8 for ; Tue, 24 Jul 2012 12:04:08 +0400 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1343117048; bh=8T3kHjMHQf/gM8622PxrTyUj/2QnCAWUpWTRhUyUe6Y=; h=From:To:In-Reply-To:References:Subject:MIME-Version:Message-Id: Date:Content-Transfer-Encoding:Content-Type; b=vQ6UN8932t2SFiSTVqESD17KAFj+QtEztVJBjLzXaSZyNO0q1RfAlAfkFIj3Gt7OV 7zJH+jaOr7BLqeym6LnjFETi+ao7gqdEAVhbg3k5+DJpMaawbSArCI8z+fwixipABQ qx/ZpRvtfwo8iIKYFXbLJVBn8OFKyv58EH1pHVBA= Received: from 127.0.0.1 (localhost.localdomain [127.0.0.1]) by web5e.yandex.ru (Yandex) with ESMTP id 9EE8D1278042; Tue, 24 Jul 2012 12:04:08 +0400 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1343117048; bh=8T3kHjMHQf/gM8622PxrTyUj/2QnCAWUpWTRhUyUe6Y=; h=From:To:In-Reply-To:References:Subject:Date; b=QeeqshPSRTqDi+VVyxIcWClIWKgmbBAzFAjReBQKoi47Vy7E7GYn+oBBaanieRViJ ueP/jjS+T/GReaxK1ygaVBMiK2vtOasATxhhBOG0cSRaM9/qIrZkp5YJcSqa7XdkPv P9cRwJhvbYbv4MAo/34uS/+i9+pgFi36vDftPB1c= Received: from [46.46.146.98] ([46.46.146.98]) by web5e.yandex.ru with HTTP; Tue, 24 Jul 2012 12:04:07 +0400 From: =?koi8-r?B?Tmlrb2xheSBLz3ZzaG92?= To: "user@cassandra.apache.org" In-Reply-To: References: <580941342512842@web13h.yandex.ru> <2302731343030866@web4d.yandex.ru> Subject: Re: Cassandra 1.0 hangs during GC MIME-Version: 1.0 Message-Id: <2101731343117047@web5e.yandex.ru> Date: Tue, 24 Jul 2012 12:04:07 +0400 X-Mailer: Yamail [ http://yandex.ru ] 5.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=koi8-r X-Virus-Checked: Checked by ClamAV on apache.org 48 G of Ram on that machine, swap is not used. I will disable swap at all just in case I have 4 cassandra processes (parts of 4 different clusters), each allocated 8 GB and using 4 of them >java -version java version "1.7.0" Java(TM) SE Runtime Environment (build 1.7.0-b147) Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode) 23.07.2012, 20:12, "Joost van de Wijgerd" : > Howmuch memory do you have on the machine. Seems like you have 8G > reserved for the Cassandra java process, If this is all the memory on > the machine you might be swapping. Also which jvm do you use? > > kind regards > > Joost > > On Mon, Jul 23, 2012 at 10:07 AM, Nikolay K�vshov wrote: > >> ��21th I have mirgated to cassandra 1.1.2 but see no improvement >> >> �cat /var/log/cassandra/Earth1.log | grep "GC for" >> �INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472 >> �INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472 >> �INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472 >> �INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 8464105472 >> �INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 8464105472 >> �=== Migrated from 1.0.0 to 1.1.2 >> �INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472 >> �INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472 >> �INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 122) GC for ParNew: 253 ms for 1 collections, 389700768 used; max is 8464105472 >> �INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 122) GC for ParNew: 57391 ms for 1 collections, 400083984 used; max is 8464105472 >> >> �Memory and yaml memory-related settings are default >> �I do not do deletes >> �I have 2 CF's and no secondary indexes >> >> �LiveRatio's: >> ��INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 177) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). �calculation took 85ms for 6236 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). �calculation took 8ms for 1 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:51,012 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). �calculation took 99ms for 1094 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:51,331 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). �calculation took 80ms for 242 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:51,856 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). �calculation took 505ms for 2678 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:52,881 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). �calculation took 776ms for 5236 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:52,945 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). �calculation took 64ms for 389 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:55,162 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted was 1.0). �calculation took 1378ms for 8948 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:04:55,304 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 (just-counted was 1.0). �calculation took 140ms for 1082 columns >> ��INFO [MemoryMeter:1] 2012-07-21 09:05:08,439 Memtable.java (line 213) CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 2.5763038186160894 (just-counted was 2.5763038186160894). �calculation took 8796ms for 102193 columns >> >> �18.07.2012, 07:51, "aaron morton" : >>> �Assuming all the memory and yaml settings default that does not sound right. The first thought would be the memory meter not counting correctly... >>> �Do you do a lot of deletes ? >>> �Do you have a lot of CF's and/or secondary indexes ? >>> �Can you see log lines about the "liveRatio" for your cf's ? >>> �I would upgrade to 1.0.10 before getting too carried away though. >>> �Cheers >>> >>> �----------------- >>> �Aaron Morton >>> �Freelance Developer >>> �@aaronmorton >>> �http://www.thelastpickle.com >>> >>> �On 17/07/2012, at 8:14 PM, Nikolay K�vshov wrote: >>>> �This is a cluster of 2 nodes, each having 8G of operating memory, replicationfactor=2 >>>> �Write/read pressure is quite low and almost never exceeds 10/second >>>> >>>> �From time to time (2-3 times in a month) I see GC activity in logs and for this time cassandra stops responding to requests which results in a timeout in upper-layer application. Total time of unavailability can be over 5 minues (like in the following case) >>>> >>>> �What can I do with that? Wiil it become much worse when my cluster grows up? >>>> >>>> �INFO [GossipTasks:1] 2012-07-16 13:10:50,055 Gossiper.java (line 736) InetAddress /10.220.50.9 is now dead. >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,056 GCInspector.java (line 123) GC for ParNew: 391383 ms for 1 collections, 2025808488 used; max is 8464105472 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,067 StatusLogger.java (line 50) Pool Name �������������������Active ��Pending ��Blocked >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,071 StatusLogger.java (line 65) ReadStage ������������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,071 StatusLogger.java (line 65) RequestResponseStage �������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,072 StatusLogger.java (line 65) ReadRepairStage ������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,072 StatusLogger.java (line 65) MutationStage ��������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,073 StatusLogger.java (line 65) ReplicateOnWriteStage ������������0 ��������0 ��������0 >>>> ��INFO [GossipStage:1] 2012-07-16 13:10:50,074 Gossiper.java (line 722) InetAddress /10.220.50.9 is now UP >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,075 StatusLogger.java (line 65) GossipStage ����������������������1 �������59 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,075 StatusLogger.java (line 65) AntiEntropyStage �����������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,076 StatusLogger.java (line 65) MigrationStage �������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,076 StatusLogger.java (line 65) StreamStage ����������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) MemtablePostFlusher ��������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) FlushWriter ����������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,077 StatusLogger.java (line 65) MiscStage ������������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,078 StatusLogger.java (line 65) InternalResponseStage ������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,078 StatusLogger.java (line 65) HintedHandoff ��������������������0 ��������0 ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,079 StatusLogger.java (line 69) CompactionManager ��������������n/a ��������0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,079 StatusLogger.java (line 81) MessagingService ���������������n/a ������0,0 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,080 StatusLogger.java (line 85) ColumnFamily ���������������Memtable ops,data �Row cache size/cap �Key cache size/cap >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,080 StatusLogger.java (line 88) Keyspace1.PSS ����������������36712,343842617 ����������������0/0 ������97995/1000000 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,085 StatusLogger.java (line 88) Keyspace1.Standard1 ���������128679,162567721 ����������������0/0 ����������0/1000000 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,085 StatusLogger.java (line 88) system.NodeIdInfo ������������������������0,0 ����������������0/0 ����������������0/1 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) system.IndexInfo �������������������������0,0 ����������������0/0 ����������������0/1 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) system.LocationInfo ����������������������0,0 ����������������0/0 ����������������1/2 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,086 StatusLogger.java (line 88) system.Migrations ������������������������0,0 ����������������0/0 ����������������3/3 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,087 StatusLogger.java (line 88) system.HintsColumnFamily ���������������1,528 ����������������0/0 ����������������0/1 >>>> ��INFO [ScheduledTasks:1] 2012-07-16 13:10:50,087 StatusLogger.java (line 88) system.Schema ����������������������������0,0 ����������������0/0 ����������������2/3 >>>> �-------- ���������� ������������� ��������� -------- > -- > Joost van de Wijgerd > Visseringstraat 21B > 1051KH Amsterdam > +31624111401 > joost.van.de.wijgerd@Skype > http://www.linkedin.com/in/jwijgerd