From: Ali Ahsan
Date: Wed, 18 May 2011 09:34:15 +0500
To: user@cassandra.apache.org
Subject: Re: Gossiper question

On 05/18/2011 09:28 AM, Cassa L wrote:
> Hi,
> I have a 9-node cluster with RF=3, using Cassandra0.70/Hector26. Recently we have been seeing a lot of "UnavailableException" errors on the client side. Whenever this happens, I find the following pattern in the Cassandra node's log file around that time:
>
> INFO [ScheduledTasks:1] 2011-05-13 02:59:55,365 Gossiper.java (line 195) InetAddress /**.**.***.54 is now dead.
> INFO [ScheduledTasks:1] 2011-05-13 02:59:57,369 Gossiper.java (line 195) InetAddress /**.**.***.59 is now dead.
> INFO [HintedHandoff:1] 2011-05-13 03:00:04,706 HintedHandOffManager.java (line 192) Started hinted handoff for endpoint /**.**.***.54
> INFO [GossipStage:1] 2011-05-13 03:00:04,706 Gossiper.java (line 569) InetAddress /**.**.***.54 is now UP
> INFO [HintedHandoff:1] 2011-05-13 03:00:04,706 HintedHandOffManager.java (line 248) Finished hinted handoff of 0 rows to endpoint /**.**.***.54
> INFO [HintedHandoff:1] 2011-05-13 03:00:20,601 HintedHandOffManager.java (line 192) Started hinted handoff for endpoint /**.**.***.59
> INFO [GossipStage:1] 2011-05-13 03:00:20,601 Gossiper.java (line 569) InetAddress /**.**.***.59 is now UP
>
> The exception occurred at "2011-05-13 03:00:00,664". I am wondering why this dead/up pattern is occurring in Gossip.
>
> Thanks in advance,
> Cassa L.
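
One way to line the exception up with the gossip log is to replay the dead/UP events and see which endpoints were marked dead at that instant. Below is a minimal sketch in Python, assuming the log line layout quoted above and that the lines are in chronological order. The names `GOSSIP_RE` and `down_at` are made up for this illustration and are not part of Cassandra.

```python
import re
from datetime import datetime

# Gossiper lines copied from the quoted log (addresses are masked in the original).
LOG = """\
INFO [ScheduledTasks:1] 2011-05-13 02:59:55,365 Gossiper.java (line 195) InetAddress /**.**.***.54 is now dead.
INFO [ScheduledTasks:1] 2011-05-13 02:59:57,369 Gossiper.java (line 195) InetAddress /**.**.***.59 is now dead.
INFO [GossipStage:1] 2011-05-13 03:00:04,706 Gossiper.java (line 569) InetAddress /**.**.***.54 is now UP
INFO [GossipStage:1] 2011-05-13 03:00:20,601 Gossiper.java (line 569) InetAddress /**.**.***.59 is now UP
"""

TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# Captures: timestamp, endpoint, new state ("dead" or "UP").
GOSSIP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) Gossiper\.java "
    r"\(line \d+\) InetAddress (\S+) is now (dead|UP)"
)


def down_at(log_text, when):
    """Return the endpoints whose most recent gossip state at `when` is dead."""
    state = {}
    for match in GOSSIP_RE.finditer(log_text):
        ts = datetime.strptime(match.group(1), TS_FORMAT)
        if ts <= when:
            # Later lines overwrite earlier ones, so the last state wins.
            state[match.group(2)] = match.group(3)
    return sorted(ep for ep, s in state.items() if s == "dead")


if __name__ == "__main__":
    exception_time = datetime.strptime("2011-05-13 03:00:00,664", TS_FORMAT)
    print(down_at(LOG, exception_time))
```

For the log above, both .54 and .59 are still marked dead at 03:00:00,664. If those two nodes happen to be replicas for the same key and the client uses QUORUM (an assumption, since the consistency level is not stated in the question), only one of the three replicas would be considered alive, which is not enough for a quorum of two.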

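I am not sure, but do check the system load when you see this happening. If the load is too high, the CPU may not have been able to give enough time to network I/O, which would delay gossip messages and could make a healthy node look dead for a few seconds.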
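
A quick way to check this on each node is to compare the load average with the number of cores. The sketch below uses only the Python standard library. The helper names and the threshold of 1.0 per core are assumptions chosen for illustration, not a Cassandra recommendation.

```python
import os


def load_per_core(loadavg, cores):
    """Divide the 1-, 5- and 15-minute load averages by the core count."""
    return tuple(value / cores for value in loadavg)


def looks_overloaded(per_core, threshold=1.0):
    """True if the 1-minute load per core exceeds the (hypothetical) threshold."""
    return per_core[0] > threshold


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    # os.getloadavg() is available on Unix-like systems only.
    per_core = load_per_core(os.getloadavg(), cores)
    print("load per core (1m, 5m, 15m):", per_core)
    print("overloaded:", looks_overloaded(per_core))
```

Running this from cron around the time the "is now dead" lines appear would show whether the load spikes match the gossip flapping.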
-- 
S.Ali Ahsan

