Subject: Re: Bootstrap Timing
From: Steven A Robenalt <srobenal@stanford.edu>
To: user@cassandra.apache.org
Date: Fri, 25 Apr 2014 08:38:02 -0700
Interesting. I did our 2.0.3 -> 2.0.5 upgrade by bootstrapping/joining each node into our cluster, one at a time, then retiring the old nodes one at a time. Maybe something specific to the 2.0.6 release?
Good to hear that you've gotten through it anyway.
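For anyone wanting to replicate the one-node-at-a-time approach, the sequence looks roughly like this (a sketch from memory, not an exact runbook; host names are invented):

    # On each new node, one at a time: make sure the node is NOT listed
    # in its own seeds entry, then start it and let it bootstrap fully
    # before touching the next one.
    sudo service cassandra start
    nodetool status          # wait for this node to go from UJ to UN

    # Once all new nodes have joined, retire the old nodes one at a time.
    # decommission streams the node's data to the rest of the ring.
    nodetool -h old-node-1 decommission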

Steve



On Fri, Apr 25, 2014 at 7:49 AM, Phil Burress <philburresseme@gmail.com> wrote:
Cassandra 2.0.6


On Fri, Apr 25, 2014 at 10:31 AM, James Rothering <jrothering@codojo.me> wrote:
What version of C* is this?


On Fri, Apr 25, 2014 at 6:55 AM, Phil Burress <philburresseme@gmail.com> wrote:
Just a follow-up on this for any interested parties. Ultimately we've determined that the bootstrap/join process is broken in Cassandra. We ended up creating an entirely new cluster and migrating the data.


On Mon, Apr 21, 2014 at 10:32 AM, Phil Burress <philburresseme@gmail.com> wrote:
The new node has managed to stay up without dying for about 24 hours now... but it is still in JOINING state. A new concern has popped up: disk usage is at 500GB on the new node, while the three original nodes have about 40GB each. Any ideas why this is happening?


On Sat, Apr 19, 2014 at 9:19 PM, Phil Burress <philburresseme@gmail.com> wrote:

Thank you all for your advice and good info. The node has died a couple of times with out-of-memory errors. I've restarted it each time, but it starts re-running compaction and then dies again.

Is there a better way to do this?

On Apr 18, 2014 6:06 PM, "Steven A Robenalt" <srobenal@stanford.edu> wrote:
That's what I'd be doing, but I wouldn't expect it to run for 3 days this time. My guess is that whatever was going wrong with the bootstrap when you had 3 nodes starting at once was interfering with the completion of the 1 remaining node of those 3. A clean bootstrap of a single node should complete eventually, and I would think it'll be a lot less than 3 days. Our database is much smaller than yours at the moment, so I can't really guide you on how long it should take, but I'd think that others on the list with similar database sizes might be able to give you a better idea.
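If it helps, these are the standard checks I'd watch while a node is joining (plain nodetool, nothing version-specific):

    nodetool status            # joining node shows UJ; done when it reads UN
    nodetool netstats          # per-file streaming progress
    nodetool compactionstats   # pending compactions and index builds afterward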

Steve

On Fri, Apr 18, 2014 at 1:43 PM, Phil Burress <philburresseme@gmail.com> wrote:
First, I just stopped 2 of the nodes and left one running. But this morning, I stopped that third node, cleared out the data, restarted it, and let it rejoin again. It appears streaming is done (according to netstats); right now it appears to be running compaction and building a secondary index (according to compactionstats). Just sit and wait, I guess?
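For the record, the clear-out step was approximately this (paths assume a stock package install, so adjust for your layout):

    sudo service cassandra stop
    # wipe the half-bootstrapped state so the node starts fresh
    sudo rm -rf /var/lib/cassandra/data/* \
                /var/lib/cassandra/commitlog/* \
                /var/lib/cassandra/saved_caches/*
    sudo service cassandra start

Then it's back to watching netstats/compactionstats.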


On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt <srobenal@stanford.edu> wrote:
Looking back through this email chain, it looks like Phil said he wasn't using vnodes.

For the record, we have been using vnodes since we brought up our first cluster, and have not seen any issues with bootstrapping new nodes, either to replace existing nodes or to grow/shrink the cluster. We did adhere to the caveats that new nodes should not be seed nodes, and that we should allow each node to join the cluster completely before making any other changes.
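The seed-node caveat matters because a node that appears in its own seeds list skips bootstrap entirely and joins without streaming any data. Roughly what we keep in cassandra.yaml on a joining node (addresses invented for the example):

    num_tokens: 256        # vnodes enabled; 256 is the 2.0 default
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              # existing cluster members only, never the joining node itself
              - seeds: "10.0.1.10,10.0.1.11"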

Phil, when you dropped to adding just the single node to your cluster, did you start over with the newly added node (blowing away the database created on the previous startup), or did you shut down the other 2 added nodes and leave the remaining one in progress to continue?

Steve
On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli <rcoli@eventbrite.com> wrote:
On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress <philburresseme@gmail.com> wrote:
nodetool netstats shows 84 files. They are all at 100%. Nothing showing in Pending or Active for Read Repair Stats.

I'm assuming this means it's done. But it still shows "JOINING". Is there an undocumented step I'm missing here? This whole process seems broken to me.

Lately it seems like a lot more people than usual are:

1) using vnodes
2) unable to bootstrap new nodes

If I were you, I would likely file a JIRA detailing your negative experience with this core functionality.

=Rob




--
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063
srobenal@stanford.edu
http://highwire.stanford.edu




--
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063
srobenal@stanford.edu
http://highwire.stanford.edu







--
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063
srobenal@stanford.edu
http://highwire.stanford.edu