From: Jonathan Ellis
Date: Tue, 5 Apr 2011 08:49:20 -0500
Subject: Re: AW: Strange nodetool repair behaviour
To: user@cassandra.apache.org
Cc: aaron morton

Sounds like https://issues.apache.org/jira/browse/CASSANDRA-2324

On Mon, Apr 4, 2011 at 7:46 AM, aaron morton wrote:
> Jonas, AFAIK if repair completed successfully there should be no streaming the next time round. This sounds odd; please look into it if you can.
>
> Can you run at DEBUG logging? There will be some messages about receiving streams from files and which ranges are being requested.
>
> I would be interested to know if the repair is completing successfully. You should see messages such as "Repair session blah completed successfully" if it is. It is possible for repair to hang if one of the neighbours goes away or fails to send the data. In this case the repair session will time out after 48 hours.
>
> Aaron
>
> On 4 Apr 2011, at 20:39, Roland Gude wrote:
>
>> I am experiencing the same behavior but had it on previous versions of 0.7 as well.
>>
>>
>> -----Original Message-----
>> From: Jonas Borgström [mailto:jonas.borgstrom@trioptima.com]
>> Sent: Monday, 4 April 2011 12:26
>> To: user@cassandra.apache.org
>> Subject: Strange nodetool repair behaviour
>>
>> Hi,
>>
>> I have a 6 node 0.7.4 cluster with replication_factor=3 where "nodetool
>> repair keyspace" behaves really strangely.
>>
>> The keyspace contains three column families and about 60GB data in total
>> (i.e. 30GB on each node).
>>
>> Even though no data has been added or deleted since the last repair, a
>> repair takes hours and the repairing node seems to receive 100+GB worth
>> of sstable data from its neighbouring nodes, i.e. several times the
>> actual data size.
>>
>> The log says things like:
>>
>> "Performing streaming repair of 27 ranges"
>>
>> And a bunch of:
>>
>> "Compacted to 22,208,983,964 to 4,816,514,033 (~21% of original)"
>>
>> In the end the repair finishes without any error after a few hours, but
>> even then the active sstables seem to contain lots of redundant data,
>> since the disk usage can be cut in half by triggering a major compaction.
>>
>> All this leads me to believe that something stops the AES from correctly
>> figuring out what data is already on the repairing node and what needs
>> to be streamed from the neighbours.
>>
>> The only thing I can think of right now is that one of the column
>> families contains a lot of large rows that are larger than
>> memtable_throughput, and that's perhaps what's confusing the merkle tree.
>>
>> Anyway, is this a known problem or perhaps expected behaviour?
>> Otherwise I'll try to create a more reproducible test case.
>>
>> Regards,
>> Jonas
>>
>>
>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com