From: Nuno Jordao
To: "user@cassandra.apache.org"
Date: Tue, 3 Apr 2012 12:55:34 +0100
Subject: RE: Repair in loop?

Ok, thank you! :)

One last question then: is "nodetool repair -pr" enough to recover a failed node?

Nuno

-----Original Message-----
From: Sylvain Lebresne [mailto:sylvain@datastax.com]
Sent: Tuesday, 3 April 2012 12:38
To: user@cassandra.apache.org
Subject: Re: Repair in loop?
Importance: Low

On Tue, Apr 3, 2012 at 12:52 PM, Nuno Jordao wrote:
> Thank you for your response.
> My question is that it is repeating the same column family:
>
> INFO 19:12:24,656 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)
> [...]
> INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)
>
> What I was showing in my previous email is the point where it restarted:

Ok, then it's likely because those correspond to different ranges of
the ring. Unless you've started the repair with "nodetool repair -pr",
the repair will try to repair every range of the node, and each repair
will be a different repair session. I'll admit, though, that printing
which range is being repaired would have avoided that confusion.
--
Sylvain

>
> INFO 09:54:51,112 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_e8 is fully synced (1 remaining column family to sync for this session)
> INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)
>
> Notice that the "1 remaining column family to sync for this session" indication changes to "255 remaining column family to sync for this session".
>
> Regards,
>
> Nuno Jordão
>
> -----Original Message-----
> From: Sylvain Lebresne [mailto:sylvain@datastax.com]
> Sent: Tuesday, 3 April 2012 11:36
> To: user@cassandra.apache.org
> Subject: Re: Repair in loop?
> Importance: Low
>
> It just means that you have lots of column families and repair does one
> column family at a time. Each line is just saying it's done with one
> of the column families. There is nothing wrong, but it does mean the
> repair is *not* done yet.
>
> --
> Sylvain
>
> On Tue, Apr 3, 2012 at 12:28 PM, Nuno Jordao wrote:
>> Hello,
>>
>> I'm doing some tests with Cassandra 1.0.8 using multiple data directories
>> with individual disks in a three-node cluster (replica=3).
>>
>> One of the tests was to replace a couple of disks and start a repair
>> process.
>>
>> It started OK and refilled the disks, but I noticed that after the recovery
>> process finished, it started a new one again:
>>
>> INFO 09:34:42,481 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_6f is fully synced (6 remaining column family to sync for this session)
>> INFO 09:41:55,288 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_0d is fully synced (5 remaining column family to sync for this session)
>> INFO 09:42:50,169 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_07 is fully synced (4 remaining column family to sync for this session)
>> INFO 09:45:02,743 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_5a is fully synced (3 remaining column family to sync for this session)
>> INFO 09:48:03,010 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_da is fully synced (2 remaining column family to sync for this session)
>> INFO 09:54:51,112 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] BlockData_e8 is fully synced (1 remaining column family to sync for this session)
>> INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_b6 is fully synced (255 remaining column family to sync for this session)
>> INFO 10:05:42,803 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_13 is fully synced (254 remaining column family to sync for this session)
>> INFO 10:08:43,354 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_8b is fully synced (253 remaining column family to sync for this session)
>> INFO 10:12:09,599 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_31 is fully synced (252 remaining column family to sync for this session)
>> INFO 10:15:43,426 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_0c is fully synced (251 remaining column family to sync for this session)
>> INFO 10:21:47,156 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] BlockData_1b is fully synced (250 remaining column family to sync for this session)
>>
>> Is this normal? To me it doesn't make much sense.
>>
>> Regards,
>>
>> Nuno
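
For anyone reading this thread later: the jump from "1 remaining" to "255 remaining" coincides with a change in the repair session ID (#69c95b50-... to #a66c8240-...), which fits the explanation that each range gets its own session. One way to see this in a log is to group the "fully synced" lines by session ID. The following is only a rough sketch in Python, assuming each log entry sits on a single line in the format quoted in this thread; the function name `summarize_sessions` and the regular expression are illustrative and not part of Cassandra.

```python
import re
from collections import OrderedDict

# Assumed line format, taken from the log excerpts quoted in this thread:
#   INFO 09:54:51,112 [repair #<session-id>] <column family> is fully synced
#   (<n> remaining column family to sync for this session)
LINE_RE = re.compile(
    r"\[repair #(?P<session>[0-9a-f-]+)\]\s+(?P<cf>\S+) is fully synced "
    r"\((?P<remaining>\d+) remaining column family to sync for this session\)"
)


def summarize_sessions(lines):
    """Group 'fully synced' log lines by repair session ID (hypothetical helper).

    Returns an ordered mapping: session ID -> column families seen so far
    and the most recent 'remaining' counter for that session.
    """
    sessions = OrderedDict()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a 'fully synced' line, skip it
        info = sessions.setdefault(
            match.group("session"), {"synced": [], "remaining": None}
        )
        info["synced"].append(match.group("cf"))
        info["remaining"] = int(match.group("remaining"))
    return sessions


# Three of the lines quoted above, used as sample input.
sample = [
    "INFO 09:54:51,112 [repair #69c95b50-7cee-11e1-0000-6b5cbd036faf] "
    "BlockData_e8 is fully synced (1 remaining column family to sync for this session)",
    "INFO 10:03:50,269 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] "
    "BlockData_b6 is fully synced (255 remaining column family to sync for this session)",
    "INFO 10:05:42,803 [repair #a66c8240-7d6a-11e1-0000-6b5cbd036faf] "
    "BlockData_13 is fully synced (254 remaining column family to sync for this session)",
]

for session_id, info in summarize_sessions(sample).items():
    print(session_id, "synced:", info["synced"], "remaining:", info["remaining"])
```

If the counter resets only when the session ID changes, the log is showing consecutive sessions rather than one session looping.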