From: Philippe
To: user@cassandra.apache.org
Date: Sun, 14 Aug 2011 12:29:59 +0200
Subject: Unable to repair a node

Hello,

I've been fighting with my cluster for a couple of days now. I'm running 0.8.1.3, using Hector, and load-balancing requests across all nodes. My question is: how do I get my node back under control so that it runs like the other two nodes?

It's a 3-node, RF=3 cluster with reads and writes at CL=QUORUM; I only have counter columns inside super columns. There are 6 keyspaces, each with about 10 column families, and I'm using the ByteOrderedPartitioner (BOP). Before the sequence of events described below, I was writing at CL=ALL and reading at CL=ONE. I've launched repairs multiple times and they have failed for various reasons, one of them being hitting the limit on the number of open files; I've raised that limit to 32768 now. I've probably also launched repairs while a repair was already running on the node. At some point compactions were throttled to 16 MB/s; I've removed that limit.
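For reference, this is roughly what I changed for those two settings (reproduced from memory, so exact paths and values may be slightly off):

# /etc/security/limits.conf -- raise the max open files for the cassandra user
cassandra  -  nofile  32768

# cassandra.yaml -- 0 disables compaction throttling (the default is 16 MB/s)
compaction_throughput_mb_per_sec: 0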
The problem is that one of my nodes is now impossible to repair (there's no such problem with the two others). Its load is about 90 GB; it should be a balanced ring, but the other nodes are at around 60 GB. Each repair generates thousands of pending compactions of various types (SSTable build, minor, major and validation): the pending count spikes up to about 4000, levels off, then spikes up to 8000. Previously I hit Linux limits and had to restart the node, but it doesn't look like the repairs have improved anything time after time.

At the same time:
  - the number of SSTables for some keyspaces goes up dramatically (from 3 or 4 to several dozen);
  - the commit log keeps growing: it's at 4.3 GB now, and it went up to 40 GB when compaction was throttled at 16 MB/s. On the other nodes it's around 1 GB at most;
  - the data directory is bigger than on the other nodes; I've seen it go up to 480 GB when compaction was throttled at 16 MB/s.

Compaction stats:
pending tasks: 5954
  compaction type              keyspace      column family  bytes compacted  bytes total  progress
       Validation  ROLLUP_WIFI_COVERAGE  PUBLIC_MONTHLY_17        569432689    596621002    95.44%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_20       2751906910   5806164726    47.40%
       Validation  ROLLUP_WIFI_COVERAGE  PUBLIC_MONTHLY_20       2570106876   2776508919    92.57%
       Validation  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_19       3010471905   6517183774    46.19%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_15             4132    303015882     0.00%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_18         36302803    595278385     6.10%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_17         24671866     70959088    34.77%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_20         15515781    692029872     2.24%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_20       1607953684   6606774662    24.34%
       Validation  ROLLUP_WIFI_COVERAGE  PUBLIC_MONTHLY_20        895043380   2776306015    32.24%
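For completeness (in case it matters), everything is driven through plain nodetool, roughly:

nodetool -h localhost repair            # the repairs mentioned above
nodetool -h localhost compactionstats   # source of the stats above
nodetool -h localhost ring              # where the 90 GB vs 60 GB load figures come from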
My current lsof count for the cassandra user is:
root@xxx:/logs/cassandra# lsof -u cassandra | wc -l
13191
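If a breakdown of that count helps, I can split it roughly like this (quick and dirty, the patterns are approximate):

lsof -u cassandra | grep -c '\.db$'   # SSTable component files
lsof -u cassandra | grep -c TCP       # network sockets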

What's even weirder is that I currently have 9 compactions running, yet CPU usage is stuck at roughly 1/number-of-cores half the time (while it's above 80% the rest of the time). Could this be because other repairs are happening elsewhere in the ring?

Example (vmstat 2):
 r  b   swpd   free   buff    cache  si  so    bi    bo    in    cs us sy id wa
 7  2      0 177632   1596 13868416   0   0  9060    61  5963  5968 40  7 53  0
 7  0      0 165376   1600 13880012   0   0 41422    28 14027  4608 81 17  1  0
 8  0      0 159820   1592 13880036   0   0 26830    22 10161 10398 76 19  4  1
 6  0      0 161792   1592 13882312   0   0 20046    42  7272  4599 81 17  2  0
 2  0      0 164960   1564 13879108   0   0 17404 26559  6172  3638 79 18  2  0
 2  0      0 162344   1564 13867888   0   0     6     0  2014  2150 40  2 58  0
 1  1      0 159864   1572 13867952   0   0     0 41668   958   581 27  0 72  1
 1  0      0 161972   1572 13867952   0   0     0    89   661   443 17  0 82  1
 1  0      0 162128   1572 13867952   0   0     0    20   482   398 17  0 83  0
 2  0      0 162276   1572 13867952   0   0     0   788   485   395 18  0 82  0
 1  0      0 173896   1572 13867952   0   0     0    29   547   461 17  0 83  0
 1  0      0 163052   1572 13867920   0   0     0     0   741   620 18  1 81  0
 1  0      0 162588   1580 13867948   0   0     0    32   523   387 17  0 82  0
13  0      0 168272   1580 13877140   0   0 12872   269  8056  6725 56  9 34  0
44  1      0 202536   1612 13835956   0   0 26606   530  7946  3887 79 19  2  0
48  1      0 406640   1612 13631740   0   0 22006   310  8605  3705 80 18  2  0
 9  1      0 340300   1620 13697560   0   0 19530   103  8101  3984 84 14  1  0
 2  0      0 297768   1620 13738036   0   0 12438    10  4115  2628 57  9 34  0

Thanks