Return-Path: Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: (qmail 54950 invoked from network); 13 Mar 2011 20:49:52 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 13 Mar 2011 20:49:52 -0000 Received: (qmail 5840 invoked by uid 500); 13 Mar 2011 20:49:50 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 5812 invoked by uid 500); 13 Mar 2011 20:49:50 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 5804 invoked by uid 99); 13 Mar 2011 20:49:50 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 13 Mar 2011 20:49:50 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,NORMAL_HTTP_TO_IP,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: local policy) Received: from [208.113.200.5] (HELO homiemail-a42.g.dreamhost.com) (208.113.200.5) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 13 Mar 2011 20:49:42 +0000 Received: from homiemail-a42.g.dreamhost.com (localhost [127.0.0.1]) by homiemail-a42.g.dreamhost.com (Postfix) with ESMTP id 0A13768C05D for ; Sun, 13 Mar 2011 13:49:16 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=thelastpickle.com; h=from :mime-version:content-type:subject:date:in-reply-to:to :references:message-id; q=dns; s=thelastpickle.com; b=f/d3EUTwIC aJDgVLvTLTuMWY5Mox6LC684Zr9oRRvA5bF/7x/Wt6de5E1o6vJRwlPxmDaXomoP SQXrEdxQruRv6PezZRvYRyXOLoWe3IIk8iQCkjjZoCc5fhnQv/1rnFB89jg1Q/cf 3TRORRHxJEb8yU84d7CmcwTAVPG2vl/kQ= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=thelastpickle.com; h=from :mime-version:content-type:subject:date:in-reply-to:to :references:message-id; s=thelastpickle.com; bh=Nn2Or/bXtySnmM1N /z1Qypa4WEE=; b=J6cSqipUqWMllijZClvPPdGT+UqLDUoji5SCQX1m0DW4X0bq GLSLwuAlnjOnkFHPXchVo7u6yJr/xb5jGN6CqbBWg0xQ0OSGTb2bVyrlR16aRQ1+ NRjD8X6eGnDiiDg9oTba6Hem8CSXqCPWWui/d+ssiFhLtG9JmHiC/+hjKoE= Received: from [10.0.1.159] (121-73-157-230.cable.telstraclear.net [121.73.157.230]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: aaron@thelastpickle.com) by homiemail-a42.g.dreamhost.com (Postfix) with ESMTPSA id 4F52868C057 for ; Sun, 13 Mar 2011 13:49:15 -0700 (PDT) From: aaron morton Mime-Version: 1.0 (Apple Message framework v1082.1) Content-Type: multipart/alternative; boundary=Apple-Mail-3--380394547 Subject: Re: Strange behaivour Date: Mon, 14 Mar 2011 09:49:12 +1300 In-Reply-To: To: user@cassandra.apache.org References: Message-Id: <7C88E5B0-8CF7-4164-89E8-EC9DCA6541AB@thelastpickle.com> X-Mailer: Apple Mail (2.1082.1) X-Virus-Checked: Checked by ClamAV on apache.org --Apple-Mail-3--380394547 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii It's difficult to say what's causing the freeze.=20 Was the node rejecting client connections during this time ?=20 Did any of the other nodes log that the node that was freezing was down = ?=20 Is there anything else running on the box? Aaron On 14 Mar 2011, at 07:02, ruslan usifov wrote: > Hello >=20 > I some times see follow behaviour: java on node hung, with follow cpu = usage: >=20 > >=20 >=20 > With hole in logs where it hung: >=20 >=20 > Here is part of my gc.log (between 19:07 and 19:41 is hole) >=20 > 2011-03-13T19:07:16.339+0300: 64970.498: [GC 64970.498: [ParNew: = 1249324K->96851K(1386112K), 0.0305170 secs] 2867161K->1714688K(600844 > 8K), 0.0307610 secs] [Times: user=3D0.39 sys=3D0.00, real=3D0.03 secs] > 2011-03-13T19:07:20.460+0300: 64974.619: [GC 64974.620: [ParNew: = 1328979K->12323K(1386112K), 0.0453270 secs] 2946816K->1722867K(600844 > 8K), 0.0455710 secs] [Times: user=3D0.53 sys=3D0.00, real=3D0.05 secs] > 2011-03-13T19:07:26.629+0300: 64980.788: [GC 64980.788: [ParNew: = 1244451K->19922K(1386112K), 0.0213010 secs] 2954995K->1730467K(600844 > 8K), 0.0215640 secs] [Times: user=3D0.26 sys=3D0.00, real=3D0.02 secs] > 2011-03-13T19:07:41.081+0300: 64995.241: [GC 64995.241: [ParNew: = 1252050K->29988K(1386112K), 0.0320390 secs] 2962595K->1747546K(600844 > 8K), 0.0323010 secs] [Times: user=3D0.38 sys=3D0.01, real=3D0.03 secs] > 2011-03-13T19:07:56.376+0300: 65010.536: [GC 65010.536: [ParNew: = 1262116K->34260K(1386112K), 0.0401930 secs] 2979674K->1758982K(600844 > 8K), 0.0404350 secs] [Times: user=3D0.40 sys=3D0.02, real=3D0.04 secs] > 2011-03-13T19:41:16.921+0300: 67011.081: [GC 67011.081: [ParNew: = 1266388K->38705K(1386112K), 0.0480950 secs] 2991110K->1775874K(600844 > 8K), 0.0483710 secs] [Times: user=3D0.37 sys=3D0.00, real=3D0.05 secs] > 2011-03-13T19:41:32.116+0300: 67026.276: [GC 67026.276: [ParNew: = 1270833K->44743K(1386112K), 0.0541030 secs] 3008002K->1792109K(600844 > 8K), 0.0543190 secs] [Times: user=3D0.44 sys=3D0.02, real=3D0.05 secs] > 2011-03-13T19:41:39.694+0300: 67033.854: [GC 67033.854: [ParNew: = 1276871K->48249K(1386112K), 0.0496830 secs] 3024237K->1810949K(600844 > 8K), 0.0499810 secs] [Times: user=3D0.47 sys=3D0.01, real=3D0.06 secs] > 2011-03-13T19:41:45.433+0300: 67039.592: [GC 67039.592: [ParNew: = 1280377K->53961K(1386112K), 0.0518250 secs] 3043077K->1829344K(600844 > 8K), 0.0520640 secs] [Times: user=3D0.61 sys=3D0.00, real=3D0.06 secs] >=20 >=20 >=20 > Here is part of my system.log: >=20 > andra/commitlog/CommitLog-1300028248189.log) > INFO [COMMIT-LOG-WRITER] 2011-03-13 19:07:50,067 CommitLog.java (line = 437) Discarding obsolete commit log:CommitLogSegment(/home/cass > andra/commitlog/CommitLog-1300029444680.log) > INFO [ScheduledTasks:1] 2011-03-13 19:41:18,007 Gossiper.java (line = 224) InetAddress /10.24.84.4 is now dead. > INFO [ScheduledTasks:1] 2011-03-13 19:41:18,007 Gossiper.java (line = 224) InetAddress /10.32.59.194 is now dead. >=20 >=20 > root@slv004:~# java -version > java version "1.6.0_24" > Java(TM) SE Runtime Environment (build 1.6.0_24-b07) > Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode) >=20 >=20 > root@slv004:~# uname -a > Linux slv004 2.6.32-24-generic #39-Ubuntu SMP Wed Jul 28 05:14:15 UTC = 2010 x86_64 GNU/Linux >=20 >=20 >=20 --Apple-Mail-3--380394547 Content-Transfer-Encoding: 7bit Content-Type: text/html; charset=us-ascii It's difficult to say what's causing the freeze. 

Was the node rejecting client connections during this time ? 
Did any of the other nodes log that the node that was freezing was down ? 
Is there anything else running on the box?

Aaron

On 14 Mar 2011, at 07:02, ruslan usifov wrote:

Hello

I some times see follow behaviour: java on node hung, with follow cpu usage:

<strange.jpg>


With hole in logs where it hung:


Here is part of my gc.log (between 19:07 and 19:41 is hole)

2011-03-13T19:07:16.339+0300: 64970.498: [GC 64970.498: [ParNew: 1249324K->96851K(1386112K), 0.0305170 secs] 2867161K->1714688K(600844
8K), 0.0307610 secs] [Times: user=0.39 sys=0.00, real=0.03 secs]
2011-03-13T19:07:20.460+0300: 64974.619: [GC 64974.620: [ParNew: 1328979K->12323K(1386112K), 0.0453270 secs] 2946816K->1722867K(600844
8K), 0.0455710 secs] [Times: user=0.53 sys=0.00, real=0.05 secs]
2011-03-13T19:07:26.629+0300: 64980.788: [GC 64980.788: [ParNew: 1244451K->19922K(1386112K), 0.0213010 secs] 2954995K->1730467K(600844
8K), 0.0215640 secs] [Times: user=0.26 sys=0.00, real=0.02 secs]
2011-03-13T19:07:41.081+0300: 64995.241: [GC 64995.241: [ParNew: 1252050K->29988K(1386112K), 0.0320390 secs] 2962595K->1747546K(600844
8K), 0.0323010 secs] [Times: user=0.38 sys=0.01, real=0.03 secs]
2011-03-13T19:07:56.376+0300: 65010.536: [GC 65010.536: [ParNew: 1262116K->34260K(1386112K), 0.0401930 secs] 2979674K->1758982K(600844
8K), 0.0404350 secs] [Times: user=0.40 sys=0.02, real=0.04 secs]
2011-03-13T19:41:16.921+0300: 67011.081: [GC 67011.081: [ParNew: 1266388K->38705K(1386112K), 0.0480950 secs] 2991110K->1775874K(600844
8K), 0.0483710 secs] [Times: user=0.37 sys=0.00, real=0.05 secs]
2011-03-13T19:41:32.116+0300: 67026.276: [GC 67026.276: [ParNew: 1270833K->44743K(1386112K), 0.0541030 secs] 3008002K->1792109K(600844
8K), 0.0543190 secs] [Times: user=0.44 sys=0.02, real=0.05 secs]
2011-03-13T19:41:39.694+0300: 67033.854: [GC 67033.854: [ParNew: 1276871K->48249K(1386112K), 0.0496830 secs] 3024237K->1810949K(600844
8K), 0.0499810 secs] [Times: user=0.47 sys=0.01, real=0.06 secs]
2011-03-13T19:41:45.433+0300: 67039.592: [GC 67039.592: [ParNew: 1280377K->53961K(1386112K), 0.0518250 secs] 3043077K->1829344K(600844
8K), 0.0520640 secs] [Times: user=0.61 sys=0.00, real=0.06 secs]



Here is part of my system.log:

andra/commitlog/CommitLog-1300028248189.log)
 INFO [COMMIT-LOG-WRITER] 2011-03-13 19:07:50,067 CommitLog.java (line 437) Discarding obsolete commit log:CommitLogSegment(/home/cass
andra/commitlog/CommitLog-1300029444680.log)
 INFO [ScheduledTasks:1] 2011-03-13 19:41:18,007 Gossiper.java (line 224) InetAddress /10.24.84.4 is now dead.
 INFO [ScheduledTasks:1] 2011-03-13 19:41:18,007 Gossiper.java (line 224) InetAddress /10.32.59.194 is now dead.


root@slv004:~# java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)


root@slv004:~# uname -a
Linux slv004 2.6.32-24-generic #39-Ubuntu SMP Wed Jul 28 05:14:15 UTC 2010 x86_64 GNU/Linux




--Apple-Mail-3--380394547--