Date: Wed, 30 Mar 2011 13:25:35 -0400
Subject: Re: How to determine if repair need to be run
From: Edward Capriolo
To: user@cassandra.apache.org
Cc: Peter Schuller, Karl Hiramoto

On Wed, Mar 30, 2011 at 12:54 PM, Peter Schuller wrote:
>> Note this script doesn't work if your repair takes hours, and in the
>> middle of the repair cassandra was restarted, nodetool will exit and the
>> flagfile will be updated. Another case, if repair hangs, and day later
>> cassandra is restarted.
>
> This is why "set -e" is at the top and commented as "important" :) But
> it relies on 'nodetool repair' reliably exiting with non-zero exit
> status on failures.
>
>> if nodetool returns an error this might work:
>>
>>  nodetool -h localhost repair && touch /path/to/flagfile.tmp
>
> That's the equivalent, due to 'set -e'.
>
>
> --
> / Peter Schuller
>

I just wanted to chime in here and say some people NEVER run repair. In our particular case we remove inactive data older than a specific date.
If we lost a tombstone and that data were to reappear, that would really not be the end of the world for us. Repair is really intensive since it involves a compaction, and in 0.6.X it was not optimal, as it noticeably increased the amount of data on disk. I have followed some threads, and there are some conditions that read repair can't handle. The questions you have to ask yourself are how likely they are to occur and what they might mean in your use case. These are not easy questions to answer.
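
For anyone who does want to track repairs, the flag-file idea quoted above could look roughly like the sketch below. This is not the script from earlier in the thread; repair_and_flag is a hypothetical helper name, and the touch-then-rename of a .tmp file is my assumption about how the flag gets published. It uses the explicit && form rather than 'set -e', and like the original it only helps if the repair command really exits non-zero on failure.

```shell
# Hypothetical helper (not part of nodetool): run a repair command and
# update a flag file only when that command exits with status 0.
# Usage: repair_and_flag FLAGFILE COMMAND [ARGS...]
repair_and_flag() {
    flagfile=$1
    shift
    # "$@" is the repair command. && short-circuits on a non-zero exit
    # status, so a failed command leaves the flag file untouched.
    # Assumption: the flag is written as FLAGFILE.tmp and then renamed.
    "$@" && touch "$flagfile.tmp" && mv "$flagfile.tmp" "$flagfile"
}
```

A cron job might then call: repair_and_flag /path/to/flagfile nodetool -h localhost repair. The age of the flag file tells you when repair last reported success; it cannot tell you about a repair that hung or one that exited 0 after a restart.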