From: Vladimir Yudovin <vladyu@winguzone.com>
To: user@cassandra.apache.org
Date: Mon, 17 Oct 2016 15:01:40 -0400
Subject: Re: Adding disk capacity to a running node

But after such a restart the node would have to rejoin the cluster and restore its data, right?

Best regards, Vladimir Yudovin,
Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.

---- On Mon, 17 Oct 2016 14:55:49 -0400 Jonathan Haddad <jon@jonhaddad.com> wrote ----

Vladimir,

*Most* people running Cassandra are doing so using ephemeral disks. Instances are not arbitrarily moved to different hosts.
Yes, instances can be shut down, but that's why you distribute across AZs.

On Mon, Oct 17, 2016 at 11:48 AM Vladimir Yudovin <vladyu@winguzone.com> wrote:

It's extremely unreliable to use ephemeral (local) disks. Even if you don't stop the instance yourself, it can be restarted on a different server after a hardware failure or an AWS-initiated update, in which case all node data will be lost.

Best regards, Vladimir Yudovin,
Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.

---- On Mon, 17 Oct 2016 14:45:00 -0400 Seth Edwards <seth@pubnub.com> wrote ----

These are i2.2xlarge instances, so the disks are currently configured as dedicated ephemeral disks.

On Mon, Oct 17, 2016 at 11:34 AM, Laing, Michael <michael.laing@nytimes.com> wrote:

You could just expand the size of your EBS volume and extend the file system. No data is lost - assuming you are running Linux.

On Monday, October 17, 2016, Seth Edwards <seth@pubnub.com> wrote:

We're running 2.0.16. We're migrating to a new data model, but we've had an unexpected increase in write traffic that has caused us some capacity issues when we encounter compactions. Our old data model is on STCS. We'd like to add another EBS volume (we're on AWS) to our JBOD config and hopefully avoid any situation where we run out of disk space during a large compaction. It appears that the behavior we are hoping for is actually considered undesirable and was removed in 3.2. It still might be an option for us until we can finish the migration.

I'm not familiar with LVM, so it may be a bit risky to try at this point.

On Mon, Oct 17, 2016 at 9:42 AM, Yabin Meng <yabinmeng@gmail.com> wrote:

I assume you're talking about a Cassandra JBOD (just a bunch of disks) setup, because you mention adding the disk to the list of data directories. If this is the case, you may run into issues, depending on your C* version. Check this out: http://www.datastax.com/dev/blog/improving-jbod.
Another approach is to use LVM to combine multiple devices under a single mount point. If you do that, all Cassandra sees is increased disk space, and there should be no problem.

Hope this helps,

Yabin

On Mon, Oct 17, 2016 at 11:54 AM, Vladimir Yudovin <vladyu@winguzone.com> wrote:

Yes, Cassandra should keep the percentage of disk usage equal across all disks. The compaction process and SSTable flushes will use the new disk to distribute both new and existing data.

Best regards, Vladimir Yudovin,
Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.

---- On Mon, 17 Oct 2016 11:43:27 -0400 Seth Edwards <seth@pubnub.com> wrote ----

We have a few nodes that are running out of disk capacity at the moment, and instead of adding more nodes to the cluster, we would like to add another disk to the server and add it to the list of data directories. My question is: will Cassandra use the new disk for compactions on SSTables that already exist in the primary directory?

Thanks!
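[Archive note: for readers finding this thread later, the "list of data directories" discussed above is the data_file_directories setting in cassandra.yaml. A minimal sketch of adding a second JBOD directory, assuming the new volume is mounted at the hypothetical path /mnt/data2:]

```yaml
# cassandra.yaml -- JBOD setup: list each mounted disk as a data directory.
# /mnt/data2 is a hypothetical mount point for the newly attached volume;
# Cassandra distributes flushes and compactions among these directories.
data_file_directories:
    - /var/lib/cassandra/data
    - /mnt/data2
```

The node must be restarted for the change to take effect, and the Cassandra process needs write permission on the new directory; how existing SSTables are rebalanced onto the new disk depends on the Cassandra version, as the messages above note.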