From: Deno Vichas
Date: Mon, 02 Jul 2012 09:00:43 -0700
To: user@cassandra.apache.org
Subject: Re: faillout from AWS outage - HELP

the node that doesn't want to start just spat out:

EC2 is experiencing some issues and has not allocated all of the resources in under 10 minutes.
Aborting the clustering of this reservation. Please try again.

Please visit http://datastax.com/ami for this AMI's feature set.



On 7/2/2012 8:43 AM, Deno Vichas wrote:
> all,
>
> my 4 node cluster seems pretty screwed up after the AWS outage.  we found all our machines with their cpu stuck at 100%, so i went to restart each cassandra node one by one.  i did the node with token id 0 first.  it came back but doesn't look like it's doing anything.  once i thought it was up i went and restarted the next.  this one got stuck on the AMI init startup.  i cancelled it, rebooted again, and now it's stuck with "[INFO] 07/02/12-15:42:29 Received 0 of 1 responses from: "
>
> suggestions on how to fix this?
>
> thanks,
> deno

