Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of philburresseme@gmail.com
 designates 209.85.216.44 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CA+z1Z6f6eZJ9Vrg4UGpsa1V0BNj1GJjoxi7UJ6UDSNrPLF6ixQ@mail.gmail.com>
References: 
 <CADT6p3kDR027EpLhb8SGk-qg6b32jghrVFKfWc6BPBxO-TzBAQ@mail.gmail.com>
	<CAGM0Up8Y7wUos4tVniUmOS-s_LV5Oxta3pE+_Kp6pNehg23nQw@mail.gmail.com>
	<CAEDUwd26dW6YEaaijGQHdJ4fKhSQaHDQp6k1obxrvqQrctxxvQ@mail.gmail.com>
	<CABzeAR4+qSNyLwiBvV0nW8V4cmQ7WwNB_wGzEJGGYq=KqTA8PQ@mail.gmail.com>
	<CADT6p3mFwF6kZaOWh=u7Sq1oziDbjmNrPQPWW1Rhit9QZARokg@mail.gmail.com>
	<CADT6p3nK7AemHrnAJPGmjKrzRXXQuhwxrFFNmeJqZdbvUUJe-w@mail.gmail.com>
	<CAM+WaZjH5vNdpTSir6AA7--Lw4Z_kyp-VhS4spb-Mkq0Ggi2gg@mail.gmail.com>
	<CADT6p3kSrSdqsh7JgyaCxXceLfKOU_nj=Ro+kvC7tVoaDTRA7Q@mail.gmail.com>
	<CA+z1Z6f6eZJ9Vrg4UGpsa1V0BNj1GJjoxi7UJ6UDSNrPLF6ixQ@mail.gmail.com>
Date: Tue, 1 Jul 2014 18:53:55 -0400
Message-ID: 
 <CADT6p3kRXBKEWihM5-+D-koXH=vZmdczj7jqYG4WT9jPGP24rQ@mail.gmail.com>
Subject: Re: nodetool repair -snapshot option?
From: Phil Burress <philburresseme@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a11c146e484437304fd29a6ce

--001a11c146e484437304fd29a6ce
Content-Type: text/plain; charset=UTF-8

Thanks! We retrieved all the ranges and started running repair on them. We
ran through all of them but found one single range which brought the ENTIRE
cluster down. All of the other ranges ran quickly and smoothly. This one
problematic range reliably brings it down every time we try to run repair
on it. Any thoughts on why one specific range would be a troublemaker?
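
One way to narrow down a range like this is to split it into smaller pieces and repair each piece on its own. The sketch below is only an illustration, not the script we actually ran: `split_range` and `repair_commands` are hypothetical helper names, the keyspace and column family are placeholders, and it assumes a non-wrapping range with integer tokens. It only builds `nodetool repair -st ... -et ...` command strings and does not execute anything. Depending on the Cassandra version, negative tokens may need escaping on the command line.

```python
def split_range(start, end, parts):
    """Split the token range (start, end] into up to `parts` contiguous subranges.

    Assumes start < end (no wrap-around); empty subranges are dropped.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if end <= start:
        raise ValueError("wrap-around ranges are not handled in this sketch")
    width = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1])
            for i in range(parts)
            if bounds[i] < bounds[i + 1]]


def repair_commands(start, end, parts, keyspace, column_family):
    """Build one subrange repair command string per piece of the range."""
    return ["nodetool repair -st {} -et {} {} {}".format(s, e, keyspace, column_family)
            for s, e in split_range(start, end, parts)]


if __name__ == "__main__":
    # Placeholder tokens and names; substitute the problematic range.
    for cmd in repair_commands(-100, 100, 4, "my_keyspace", "my_cf"):
        print(cmd)
```

Running the pieces one at a time and re-splitting whichever piece misbehaves would bisect the range down to a small set of tokens.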


On Tue, Jul 1, 2014 at 11:44 AM, Ken Hancock <ken.hancock@schange.com>
wrote:

> I also expanded on a script originally written by Matt Stump @ Datastax.
> The readme has the reasoning behind requiring sub-range repairs.
>
> https://github.com/hancockks/cassandra_range_repair
>
>
>
>
> On Mon, Jun 30, 2014 at 10:20 PM, Phil Burress <philburresseme@gmail.com>
> wrote:
>
>> @Paulo, this is very cool! Thanks very much for the link!
>>
>>
>> On Mon, Jun 30, 2014 at 9:37 PM, Paulo Ricardo Motta Gomes <
>> paulo.motta@chaordicsystems.com> wrote:
>>
>>> If you find it useful, I created a tool where you input the node IP,
>>> keyspace, column family, and optionally the number of partitions (default:
>>> 32K), and it outputs the list of subranges for that node, CF, partition
>>> size: https://github.com/pauloricardomg/cassandra-list-subranges
>>>
>>> So you can basically iterate over the output of that and do subrange
>>> repair for each node and cf, maybe in parallel. :)
>>>
>>>
>>> On Mon, Jun 30, 2014 at 10:26 PM, Phil Burress <philburresseme@gmail.com
>>> > wrote:
>>>
>>>> One last question. Any tips on scripting a subrange repair?
>>>>
>>>>
>>>> On Mon, Jun 30, 2014 at 7:12 PM, Phil Burress <philburresseme@gmail.com
>>>> > wrote:
>>>>
>>>>> We are running repair -pr. We've tried subrange manually and that
>>>>> seems to work ok. I guess we'll go with that going forward. Thanks for all
>>>>> the info!
>>>>>
>>>>>
>>>>> On Mon, Jun 30, 2014 at 6:52 PM, Jaydeep Chovatia <
>>>>> chovatia.jaydeep@gmail.com> wrote:
>>>>>
>>>>>> Are you running a full repair or a repair on a subset? If you are running
>>>>>> a full repair, try running it on a subset of ranges, which means less data
>>>>>> to handle during repair and should help the Java heap in general. You will
>>>>>> have to do multiple iterations to cover the entire range, but at least it
>>>>>> will work.
>>>>>>
>>>>>> -jaydeep
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 30, 2014 at 3:22 PM, Robert Coli <rcoli@eventbrite.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Mon, Jun 30, 2014 at 3:08 PM, Yuki Morishita <mor.yuki@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Repair uses snapshot option by default since 2.0.2 (see NEWS.txt).
>>>>>>>>
>>>>>>>
>>>>>>> As a general meta comment, the process by which operationally
>>>>>>> important defaults change in Cassandra seems ad-hoc and sub-optimal.
>>>>>>>
>>>>>>> For the record, my view was that this change, which makes repair even
>>>>>>> slower than it previously was, was probably overly optimistic.
>>>>>>>
>>>>>>> It's also weird in that it changes default behavior which has been
>>>>>>> unchanged since the start of Cassandra time and is therefore probably
>>>>>>> automated against. Why was it so critically important to switch to snapshot
>>>>>>> repair that it needed to be shotgunned as a new default in 2.0.2?
>>>>>>>
>>>>>>> =Rob
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Paulo Motta*
>>>
>>> Chaordic | *Platform*
>>> *www.chaordic.com.br <http://www.chaordic.com.br/>*
>>> +55 48 3232.3200
>>>
>>
>>
>
>
> --
> *Ken Hancock *| System Architect, Advanced Advertising
> SeaChange International
> 50 Nagog Park
> Acton, Massachusetts 01720
> ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
> <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
> Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com
>  | Skype: hancockks | Yahoo IM: hancockks
> LinkedIn: <http://www.linkedin.com/in/kenhancock>
>

--001a11c146e484437304fd29a6ce--