Subject: Re: Unable to repair a node
From: Philippe <watcherfr@gmail.com>
To: user@cassandra.apache.org
Date: Wed, 17 Aug 2011 23:48:26 +0200

I have a smallish keyspace on my 3-node, RF=3 cluster. The cluster has no
read/write traffic while I am testing repairs. I am running 0.8.4 from the
Debian packages on Ubuntu.

I've now run 7 repairs in a row on this keyspace, and every single one has
finished successfully but performed streams between all nodes. This
keyspace was written to over the course of several weeks, sometimes with
CL.write=ALL, CL.read=ONE, but lately at QUORUM.

So either I have faulty hardware, a faulty network, or something is wrong.
But because repairs on a freshly created 40GB keyspace come up with
consistent ranges, I'm guessing it's neither the hardware nor the network.

I could provide the data directories privately to a committer if that
helps... I assume an eighth repair would also stream stuff around. The data
directories are 8.3 GB, 3.3 GB and 3.1 GB.

Thanks
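For reference, each repair pass above boils down to a plain nodetool
invocation per node, and the directory sizes come from du. This is only a
sketch: node1..node3 and MyKeyspace are placeholders, and the data path
assumes the Debian package defaults.

    # Repair the keyspace on every node, one node at a time
    for host in node1 node2 node3; do
        nodetool -h "$host" repair MyKeyspace
    done

    # On-disk size of the keyspace's data directory on the local node
    du -sh /var/lib/cassandra/data/MyKeyspace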
2011/8/17 Philippe <watcherfr@gmail.com>

>> ctrl-c will not stop the repair.
>
> Ok, so that's why I've been seeing logs of repairs on other CFs
>
>> That's probably the 2280 issue. Data from all CF's is streamed over.
>
> Ah, I get it now.
>
> Thanks
>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 17/08/2011, at 10:09 AM, Philippe wrote:
>>
>> One last thought: what happens when you ctrl-c a nodetool repair? Does
>> it stop the repair on the server? If not, then I think I have multiple
>> repairs still running. Is there any way to check this?
>>
>> Thanks
>>
>> 2011/8/16 Philippe
>>
>>> Even more interesting behavior: a repair on a CF has consequences on
>>> other CFs. I didn't expect that.
>>>
>>> There are no writes being issued to the cluster, yet the logs indicate
>>> that:
>>>
>>> - SSTableReader has opened dozens and dozens of files, most of them
>>>   unrelated to the CF being repaired
>>> - compactions are taking place continuously on CFs other than the one
>>>   being repaired, even CFs in other keyspaces
>>> - I see "Sending AEService tree" messages for CFs not being repaired.
>>>
>>> After a very long time, I got some AES messages indicating that
>>> streaming from node C had finished, and then, many minutes after that,
>>> node B. And yet the pending stream count on node B hasn't changed.
>>>
>>> The *-data.db files for the CF being repaired are about 70MB on disk.
>>>
>>> Maybe when a stream is fully received on node B, netstats indicates
>>> that no streams are pending, but since they are not acknowledged,
>>> node A doesn't?
>>>
>>> 2011/8/16 Philippe
>>>
>>>> I'm still trying different stuff. Here are my latest findings, maybe
>>>> someone will find them useful:
>>>>
>>>> - I have been able to repair some small column families by issuing a
>>>>   repair [KS] [CF]. When testing on the ring with no writes at all,
>>>>   it still takes about 2 repairs to get "consistent" logs for all AES
>>>>   requests.
>>>> - Launching a repair on the smallest CF of the biggest KS has
>>>>   triggered a flurry of compactions and streams. Some of those
>>>>   streams are for other CFs in that keyspace!?
>>>> - During repairs (one at a time cluster-wide), I get 25-50% I/O waits
>>>>   and 35-50% CPU usage on a 6-core, SATA-disk setup.
>>>>
>>>> What is surprising to me (bug?) is that netstats shows me streams
>>>> going from node A to node B at 0% progress, but netstats on node B
>>>> doesn't show me any streams coming in. I'm thinking that repairs may
>>>> be never-ending and that may be messing up my compactions, hence the
>>>> huge pile-up of compactions until the disk fills.
>>>> I know there's an issue related to failed streams & repairs, could I
>>>> be hitting it?
>>>>
>>>> Thanks
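(A quick way to watch the stream and compaction activity described above,
assuming node2 stands in for whichever node is being observed:

    nodetool -h node2 netstats          # pending and active streams
    nodetool -h node2 compactionstats   # compaction backlog
    nodetool -h node2 tpstats           # thread pool backlogs

None of these commands interrupt a running repair.)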
>>>> 2011/8/14 Philippe
>>>>
>>>>> @Teijo: thanks for the procedure, I hope I won't have to do that.
>>>>>
>>>>> Peter, I'll answer inline. Thanks for the detailed answer.
>>>>>
>>>>>> > the number of SSTables for some keyspaces goes dramatically up
>>>>>> > (from 3 or 4 to several dozen).
>>>>>>
>>>>>> Typically with a long-running compaction, such as that triggered by
>>>>>> repair, that's what happens as flushed memtables accumulate, in
>>>>>> particular for memtables with frequent flushes.
>>>>>>
>>>>>> Are you running with concurrent compaction enabled?
>>>>>
>>>>> Yes, it is enabled. On my 0.8 cluster, cassandra.yaml has this (it's
>>>>> commented out). BTW, I have 6 cores on each server.
>>>>> #concurrent_compactors: 1
>>>>>
>>>>>> > the commit log keeps increasing in size, I'm at 4.3G now, it went
>>>>>> > up to 40G when the compaction was throttled at 16MB/s. On the
>>>>>> > other nodes it's around 1GB at most.
>>>>>>
>>>>>> Hmmmm. The commit log should not be retained longer than what is
>>>>>> required for memtables to be flushed. Is it possible you have had an
>>>>>> out-of-disk condition and flushing has stalled? Are you seeing
>>>>>> flushes happening in the log?
>>>>>
>>>>> No, I don't believe there was ever an out-of-disk condition. Yes, it
>>>>> is flushing for the first couple of hours.
>>>>> Then, when repair seems locked up, my log is mostly filled with lines
>>>>> such as this:
>>>>> INFO [ScheduledTasks:1] 2011-08-14 23:15:47,267 StatusLogger.java
>>>>> (line 88) [My_Keyspace].[My_Columnfamily]  45,105541  50/50  20/20
>>>>> Why is that?
>>>>>
>>>>>> > the data directory is bigger than on the other nodes. I've seen it
>>>>>> > go up to 480GB when the compaction was throttled at 16MB/s.
>>>>>>
>>>>>> How much data are you writing? Is it at all plausible that the huge
>>>>>> spike is a reflection of lots of overwriting writes that aren't
>>>>>> being compacted?
>>>>>
>>>>> No, there's no bulk loading going on at the moment and I'm pretty
>>>>> sure there wasn't when it spiked up to that load.
>>>>> I've never measured the load because it's a mix of counter increments
>>>>> and new counters all the time. It's not that much, though.
>>>>>
>>>>>> Normally when disk space spikes with repair, it's due to other nodes
>>>>>> streaming huge amounts (maybe all of their data) to the node,
>>>>>> leading to a temporary spike. But if your "real" size is expected to
>>>>>> be 60, 480 sounds excessive. Are you sure other nodes aren't running
>>>>>> repairs at the same time and magnifying each other's data load
>>>>>> spikes?
>>>>>
>>>>> Yes, the two other nodes were running repairs. I had them scheduled
>>>>> at 8-hour intervals, but they must have overlapped.
>>>>> When data is streamed from one node to another, does that data go
>>>>> into the commit log as a regular write?
>>>>> How much of a negative impact can that have on the repair going on on
>>>>> this node?
>>>>>
>>>>>> > What's even weirder is that currently I have 9 compactions running
>>>>>> > but CPU is throttled at 1/number of cores half the time (while
>>>>>> > >80% the rest of the time). Could this be because other repairs
>>>>>> > are happening in the ring?
>>>>>>
>>>>>> You mean compaction is taking less CPU than it "should"?
>>>>>
>>>>> Yes
>>>>>
>>>>>> No, this should not be due to other nodes repairing. However, it
>>>>>> sounds to me like you are bottlenecking on I/O and the repairs and
>>>>>
>>>>> Yes, I/O is really high on the node right now. Around 50% I/O waits.
>>>>>
>>>>>> compactions are probably proceeding extremely slowly, probably being
>>>>>> completely drowned out by live traffic (which is probably having an
>>>>>> abnormally high performance impact due to the data size spike).
>>>>>
>>>>> Yes, live traffic is 3 to 10x slower during repair. Ouch... I hope I
>>>>> won't have to do this too often while in production!
>>>>>
>>>>>> What's your read concurrency configured on the node? What does
>>>>>> "iostat -x -k 1" show in the average queue size column?
>>>>>
>>>>> Average queue size on the disk (RAID-1 + separate LVM volumes for
>>>>> data, commit log, caches, logs) varies between 2 and 90. I'd say the
>>>>> average is around 30-40. Very high variation.
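(For anyone reproducing the numbers above: the queue size is the avgqu-sz
column of the extended iostat output, refreshed every second. This is just
the standard sysstat tooling, nothing Cassandra-specific.

    # Extended per-device statistics; watch avgqu-sz and %util for the
    # data and commit log volumes
    iostat -x -k 1

    # CPU-level view, including the %iowait figure
    iostat -c 1
)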
>>>>>> Is "nodetool -h localhost tpstats" showing that ReadStage is
>>>>>> usually "full" (@ your limit)?
>>>>>
>>>>> No backlog at all in tpstats.
>>>>>
>>>>> I've figured out how AES is logging its actions, and it looks like it
>>>>> really is going through every CF in every keyspace and doing a tree
>>>>> request for every token range.
>>>>> So it really looks like it's just taking forever to compact stuff as
>>>>> it's repairing.
>>>>> I saw in another email that repairing was taking 2-3 min/GB... it
>>>>> looks like a lot more for my ring. Anybody else have numbers?
>>>>>
>>>>> Thanks
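For anyone wanting to follow the same AES activity, the tree requests and
streaming sessions show up in the Cassandra system log. A rough way to
watch them, assuming the Debian package's default log location:

    # Tail anti-entropy (AES) and streaming messages during a repair
    tail -f /var/log/cassandra/system.log | grep -i -E "AEService|AntiEntropy|Stream"

Per-CF progress can then be cross-checked against nodetool netstats on the
nodes involved.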
