From: Sasha Dolgy
Date: Wed, 22 Jun 2011 14:53:37 +0200
Subject: Re: OOM (or, what settings to use on AWS large?)
To: user@cassandra.apache.org

Yes ... this is because it was the OS that killed the process, and wasn't related to Cassandra "crashing". Reviewing our monitoring, we saw that memory utilization was pegged at 100% for days and days before it was finally killed because 'apt' was fighting for resource. At least, that's as far as I got in my investigation before giving up, moving to 0.8.0 and implementing 24hr nodetool repair on each node via cronjob....so far ... no problems.

On Wed, Jun 22, 2011 at 2:49 PM, William Oberman wrote:
> Well, I managed to run 50 days before an OOM, so any changes I make will
> take a while to test ;-) I've seen the GCInspector log lines appear
> periodically in my logs, but I didn't see a correlation with the crash.
> I'll read the instructions on how to properly do a rolling upgrade today,
> practice on test, and try that on production first.
> will