From: Philippe
To: user@cassandra.apache.org
Date: Sun, 14 Aug 2011 12:29:59 +0200
Subject: Unable to repair a node

Hello,

I've been fighting with my cluster for a couple of days now. I'm running 0.8.1.3, using Hector, and load-balancing requests across all nodes. My question is: how do I get my node back under control so that it runs like the other two nodes?

It's a 3-node, RF=3 cluster with reads and writes at CL=QUORUM; I only have counter columns inside super columns. There are 6 keyspaces, each with about 10 column families, and I'm using the ByteOrderedPartitioner (BOP). Before the sequence of events described below, I was writing at CL=ALL and reading at CL=ONE. I've launched repairs multiple times and they have failed for various reasons, one of them being hitting the limit on the number of open files; I've raised that limit to 32768 now. I've probably also launched repairs while a repair was already running on the node. At some point compactions were throttled to 16 MB/s; I've removed that limit.
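For reference, this is roughly what I changed for those two settings (reproduced from memory, so exact paths and values may be slightly off):

# /etc/security/limits.conf -- raise the max open files for the cassandra user
cassandra  -  nofile  32768

# cassandra.yaml -- 0 disables compaction throttling (the default is 16 MB/s)
compaction_throughput_mb_per_sec: 0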
The problem is that one of my nodes is now impossible to repair (there's no such problem with the two others). Its load is about 90 GB; it should be a balanced ring, but the other nodes are at around 60 GB. Each repair generates thousands of pending compactions of various types (SSTable build, minor, major and validation): the pending count spikes up to about 4000, levels off, then spikes up to 8000. Previously I hit Linux limits and had to restart the node, but it doesn't look like the repairs have improved anything time after time.

At the same time:
  - the number of SSTables for some keyspaces goes up dramatically (from 3 or 4 to several dozen);
  - the commit log keeps growing: it's at 4.3 GB now, and it went up to 40 GB when compaction was throttled at 16 MB/s. On the other nodes it's around 1 GB at most;
  - the data directory is bigger than on the other nodes; I've seen it go up to 480 GB when compaction was throttled at 16 MB/s.

Compaction stats:
pending tasks: 5954
  compaction type              keyspace      column family  bytes compacted  bytes total  progress
       Validation  ROLLUP_WIFI_COVERAGE  PUBLIC_MONTHLY_17        569432689    596621002    95.44%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_20       2751906910   5806164726    47.40%
       Validation  ROLLUP_WIFI_COVERAGE  PUBLIC_MONTHLY_20       2570106876   2776508919    92.57%
       Validation  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_19       3010471905   6517183774    46.19%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_15             4132    303015882     0.00%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_18         36302803    595278385     6.10%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_17         24671866     70959088    34.77%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_20         15515781    692029872     2.24%
            Minor  ROLLUP_CDMA_COVERAGE  PUBLIC_MONTHLY_20       1607953684   6606774662    24.34%
       Validation  ROLLUP_WIFI_COVERAGE  PUBLIC_MONTHLY_20        895043380   2776306015    32.24%
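For completeness (in case it matters), everything is driven through plain nodetool, roughly:

nodetool -h localhost repair            # the repairs mentioned above
nodetool -h localhost compactionstats   # source of the stats above
nodetool -h localhost ring              # where the 90 GB vs 60 GB load figures come from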
My current lsof count for the cassandra user is:
root@xxx:/logs/cassandra# lsof -u cassandra | wc -l
13191
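If a breakdown of that count helps, I can split it roughly like this (quick and dirty, the patterns are approximate):

lsof -u cassandra | grep -c '\.db$'   # SSTable component files
lsof -u cassandra | grep -c TCP       # network sockets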

What's even weirder is that I currently have 9 compactions running, yet CPU usage is stuck at roughly 1/number-of-cores half the time (while it's above 80% the rest of the time). Could this be because other repairs are happening elsewhere in the ring?

Example (vmstat 2):
 r  b   swpd   free   buff    cache  si  so    bi    bo    in    cs us sy id wa
 7  2      0 177632   1596 13868416   0   0  9060    61  5963  5968 40  7 53  0
 7  0      0 165376   1600 13880012   0   0 41422    28 14027  4608 81 17  1  0
 8  0      0 159820   1592 13880036   0   0 26830    22 10161 10398 76 19  4  1
 6  0      0 161792   1592 13882312   0   0 20046    42  7272  4599 81 17  2  0
 2  0      0 164960   1564 13879108   0   0 17404 26559  6172  3638 79 18  2  0
 2  0      0 162344   1564 13867888   0   0     6     0  2014  2150 40  2 58  0
 1  1      0 159864   1572 13867952   0   0     0 41668   958   581 27  0 72  1
 1  0      0 161972   1572 13867952   0   0     0    89   661   443 17  0 82  1
 1  0      0 162128   1572 13867952   0   0     0    20   482   398 17  0 83  0
 2  0      0 162276   1572 13867952   0   0     0   788   485   395 18  0 82  0
 1  0      0 173896   1572 13867952   0   0     0    29   547   461 17  0 83  0
 1  0      0 163052   1572 13867920   0   0     0     0   741   620 18  1 81  0
 1  0      0 162588   1580 13867948   0   0     0    32   523   387 17  0 82  0
13  0      0 168272   1580 13877140   0   0 12872   269  8056  6725 56  9 34  0
44  1      0 202536   1612 13835956   0   0 26606   530  7946  3887 79 19  2  0
48  1      0 406640   1612 13631740   0   0 22006   310  8605  3705 80 18  2  0
 9  1      0 340300   1620 13697560   0   0 19530   103  8101  3984 84 14  1  0
 2  0      0 297768   1620 13738036   0   0 12438    10  4115  2628 57  9 34  0

Thanks