Subject: Re: Bootstrap Timing
From: Phil Burress <philburresseme@gmail.com>
To: user@cassandra.apache.org
Date: Mon, 21 Apr 2014 10:32:20 -0400

The new node has managed to stay up without dying for about 24 hours now, but it is still in the JOINING state. A new concern has popped up: disk usage is at 500GB on the new node, while the three original nodes have about 40GB each. Any ideas why this is happening?

On Sat, Apr 19, 2014 at 9:19 PM, Phil Burress wrote:

> Thank you all for your advice and good info. The node has died a couple of
> times with out-of-memory errors. I've restarted it each time, but it starts
> re-running compaction and then dies again.
>
> Is there a better way to do this?
>
> On Apr 18, 2014 6:06 PM, "Steven A Robenalt" <srobenal@stanford.edu> wrote:
>
>> That's what I'd be doing, but I wouldn't expect it to run for 3 days this
>> time. My guess is that whatever was going wrong with the bootstrap when you
>> had 3 nodes starting at once was interfering with the completion of the 1
>> remaining node of those 3. A clean bootstrap of a single node should
>> complete eventually, and I would think it'll be a lot less than 3 days. Our
>> database is much smaller than yours at the moment, so I can't really guide
>> you on how long it should take, but others on the list with similar
>> database sizes might be able to give you a better idea.
>>
>> Steve
>>
>> On Fri, Apr 18, 2014 at 1:43 PM, Phil Burress wrote:
>>
>>> First, I just stopped 2 of the nodes and left one running. But this
>>> morning, I stopped that third node, cleared out the data, restarted it, and
>>> let it rejoin again. It appears streaming is done (according to netstats);
>>> right now it appears to be running compaction and building the secondary
>>> index (according to compactionstats). Just sit and wait, I guess?
>>>
>>> On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt <srobenal@stanford.edu> wrote:
>>>
>>>> Looking back through this email chain, it looks like Phil said he
>>>> wasn't using vnodes.
>>>>
>>>> For the record, we have been using vnodes since we brought up our first
>>>> cluster, and have not seen any issues with bootstrapping new nodes, either
>>>> to replace existing nodes or to grow/shrink the cluster. We did adhere to
>>>> the caveats that new nodes should not be seed nodes and that we should
>>>> allow each node to join the cluster completely before making any other
>>>> changes.
>>>>
>>>> Phil, when you dropped to adding just the single node to your cluster,
>>>> did you start over with the newly added node (blowing away the database
>>>> created on the previous startup), or did you shut down the other 2 added
>>>> nodes and leave the remaining one in progress to continue?
>>>>
>>>> Steve
>>>>
>>>> On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli <rcoli@eventbrite.com> wrote:
>>>>
>>>>> On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress <philburresseme@gmail.com> wrote:
>>>>>
>>>>>> nodetool netstats shows 84 files. They are all at 100%. Nothing is
>>>>>> showing in Pending or Active for Read Repair Stats.
>>>>>>
>>>>>> I'm assuming this means it's done, but it still shows "JOINING". Is
>>>>>> there an undocumented step I'm missing here? This whole process seems
>>>>>> broken to me.
>>>>>
>>>>> Lately it seems like a lot more people than usual are:
>>>>>
>>>>> 1) using vnodes
>>>>> 2) unable to bootstrap new nodes
>>>>>
>>>>> If I were you, I would likely file a JIRA detailing your negative
>>>>> experience with this core functionality.
>>>>>
>>>>> =Rob
>>>>
>>>> --
>>>> Steve Robenalt
>>>> Software Architect
>>>> HighWire | Stanford University
>>>> 425 Broadway St, Redwood City, CA 94063
>>>> srobenal@stanford.edu
>>>> http://highwire.stanford.edu
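[Editor's note] Since the thread relies on nodetool netstats and nodetool compactionstats output to judge when a bootstrapping node has actually finished joining, here is a minimal polling sketch along those lines. It is not from the original thread: the host, the 60-second interval, and the matched output strings ("Mode: NORMAL", "pending tasks: 0") are assumptions and can differ between Cassandra versions, so adapt them to what your nodetool actually prints.

    #!/usr/bin/env python3
    # Sketch (assumptions noted): poll a joining Cassandra node until
    # nodetool reports it has left JOINING and its compaction/index-build
    # backlog has drained. Assumes `nodetool` is on PATH, that we poll the
    # local node, and that the matched strings below match this Cassandra
    # version's output.
    import subprocess
    import time

    NODETOOL = "nodetool"   # assumption: nodetool is on PATH
    HOST = "127.0.0.1"      # assumption: run on (or pointed at) the joining node

    def nodetool(*args):
        """Run a nodetool subcommand and return its stdout as text."""
        return subprocess.check_output([NODETOOL, "-h", HOST] + list(args),
                                       universal_newlines=True)

    while True:
        netstats = nodetool("netstats")            # streaming state + node mode
        compactions = nodetool("compactionstats")  # pending compactions / index builds
        joined = "Mode: NORMAL" in netstats        # still "Mode: JOINING" while bootstrapping
        drained = "pending tasks: 0" in compactions
        print(time.strftime("%H:%M:%S"),
              "joined:", joined, "| compaction backlog drained:", drained)
        if joined and drained:
            break
        time.sleep(60)

If the node never leaves JOINING even though netstats shows every stream at 100%, a log of this kind of polling output would be useful evidence for the JIRA ticket Rob suggests above.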