Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Delivered-To: mailing list user@cassandra.apache.org
MIME-Version: 1.0
Date: Sun, 29 Apr 2012 06:45:55 -0400
Subject: Re: nodetool repair cassandra 0.8.4 HELP!!!
From: Raj N
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=0023544477cdcfb67f04becf0a25

--0023544477cdcfb67f04becf0a25
Content-Type: text/plain; charset=ISO-8859-1

I tried running repair on a single column family. I believe there is a bug
in 0.8.x where repair ignores the column family argument. I tried this
multiple times on different nodes. Each time, disk utilization climbed to
80% on a 500 GB disk, and I eventually killed the repair. I only have 60 GB
of data.

I found this JIRA:
https://issues.apache.org/jira/browse/CASSANDRA-2324

But it says the issue was fixed in 0.8 beta. Is this still broken in 0.8.4?

I also don't understand why the data was inconsistent in the first place.
I read and write at LOCAL_QUORUM.

Thanks
-Raj

On Sun, Apr 29, 2012 at 2:06 AM, Watanabe Maki wrote:

> You should run repair. If disk space is the problem, try running cleanup
> and a major compaction before the repair.
> You can limit the amount of streamed data by running repair for each
> column family separately.
>
> maki
>
> On 2012/04/28, at 23:47, Raj N wrote:
>
> > I have a 6-node Cassandra cluster (DC1=3, DC2=3) with 60 GB of data on
> > each node. I was bulk loading data over the weekend, but we forgot to
> > turn off the weekly nodetool repair job. As a result, repair was
> > interfering while we were bulk loading data. I canceled the repair by
> > restarting the nodes. Unfortunately, after the restart it looks like I
> > don't have any data on those nodes when I use list in cassandra-cli. I
> > ran repair on one of the affected nodes, but repair seems to be taking
> > forever. Disk usage has almost tripled. I stopped the repair again for
> > fear of running out of disk space.
> > After the restart, disk usage is at 50%, whereas the good nodes are at
> > 25%. How should I proceed from here? When I run list in cassandra-cli I
> > do see data on the affected node, but how can I be sure I have all the
> > data? Should I run repair again? Should I free up disk space by clearing
> > snapshots? Or should I just drop the column families and bulk load the
> > data again?
> >
> > Thanks
> > -Raj

--0023544477cdcfb67f04becf0a25--
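
Maki's suggestion in the thread (cleanup, then a major compaction, then repair one column family at a time) could be sketched as the command plan below. This is only an illustration under stated assumptions: the host, keyspace, and column family names are hypothetical, the helper build_repair_plan is not part of Cassandra, and the script only builds nodetool command lines without running them. Note also that the thread reports repair on 0.8.4 may ignore the column family argument, in which case per-column-family repair might not limit streaming on that version.

```python
from typing import List


def build_repair_plan(host: str, keyspace: str,
                      column_families: List[str]) -> List[List[str]]:
    """Build (but do not run) nodetool commands for a staged repair.

    Hypothetical helper: the order follows the advice in the thread,
    i.e. cleanup, major compaction, then one repair per column family.
    """
    base = ["nodetool", "-h", host]
    plan = [
        base + ["cleanup", keyspace],   # drop data the node no longer owns
        base + ["compact", keyspace],   # major compaction to reclaim space
    ]
    for cf in column_families:
        # One repair per column family to keep each streaming session small.
        plan.append(base + ["repair", keyspace, cf])
    return plan


if __name__ == "__main__":
    # Hypothetical names; substitute the real host, keyspace, and CFs.
    for argv in build_repair_plan("localhost", "MyKeyspace", ["Users", "Events"]):
        print(" ".join(argv))
```

Printing the plan first makes it easy to review the commands and check free disk space between steps before running anything against a node.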