Message-ID: <4FF1C199.3080509@syncopated.net>
Date: Mon, 02 Jul 2012 08:43:21 -0700
From: Deno Vichas
To: user@cassandra.apache.org
Subject: fallout from AWS outage - HELP

all,

my 4 node cluster seems pretty screwed up after the AWS outage. we found all our machines with their cpu stuck at 100%, so i went to restart each cassandra node one by one.

i did the node with token id 0 first. it came back up, but it doesn't look like it's doing anything. once i thought it was up i went and restarted the next one. this one got stuck on the AMI init startup. i cancelled it and rebooted again, and now it's stuck with

"[INFO] 07/02/12-15:42:29 Received 0 of 1 responses from: "

suggestions on how to fix this?

thanks,
deno