Date: Wed, 20 Feb 2013 23:31:30 +0100
From: Wojciech Meler <wojciech.meler@gmail.com>
To: user@cassandra.apache.org
Subject: Re: cassandra vs. mongodb quick question (good additional info)

You have 86,400 seconds in a day, so 42 TB could take less than 12 hours on a 10 Gb link.

On 19 Feb 2013 02:01, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:

> I thought about this more, and even with a 10 Gbit network it would take
> 40 days to bring up a replacement node if mongodb truly did have 42 TB per
> node, as I had heard. I wrote the email below to the person I heard this
> from, going back to basics, which really puts some perspective on it...
> (and a lot of people don't even have a 10 Gbit network like we do).
>
> Nodes are hooked up by a 10G network at most right now, where that is
> 10 gigabit. We have been talking about 10 terabytes on disk per node
> recently.
>
> Googling "10 gigabit in gigabytes" gives me 1.25 gigabytes/second (yes, I
> could have divided by 8 in my head, but eh... of course when I saw the
> number, I went "duh").
>
> So transferring 10 terabytes, or 10,000 gigabytes, to a node that we are
> bringing online to replace a dead node would take approximately 5 days???
>
> This also assumes no one else is using the bandwidth ;). 10,000 GB *
> 1 second/1.25 GB * 1 hr/60 secs * 1 day/24 hrs = 5.555555 days. This is
> more likely 11 days if we only use 50% of the network.
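A quick sketch to check the transfer-time arithmetic quoted above, using the figures from the thread (a 10 Gbit/s link moves 1.25 GB/s; 86,400 seconds per day). Note that the quoted 5.555-day figure comes from using 60 seconds per hour in the conversion; with 3,600 seconds per hour the answers come out in hours, consistent with the sub-12-hour estimate for 42 TB at the top of this reply:

```python
# Back-of-the-envelope transfer-time check (figures from the thread).
LINK_GBITS_PER_SEC = 10
GB_PER_SEC = LINK_GBITS_PER_SEC / 8  # = 1.25 gigabytes/second

def transfer_hours(gigabytes: float, utilization: float = 1.0) -> float:
    """Hours to stream `gigabytes` over the link at a given utilization."""
    seconds = gigabytes / (GB_PER_SEC * utilization)
    return seconds / 3600  # 3,600 seconds per hour, not 60

print(transfer_hours(10_000))        # 10 TB at full rate: ~2.2 hours
print(transfer_hours(42_000))        # 42 TB at full rate: ~9.3 hours, < 12 h
print(transfer_hours(10_000, 0.5))   # 10 TB at 50% utilization: ~4.4 hours
```

Even at 50% link utilization, the replacement-node transfer is a matter of hours, not days, as long as the disks can actually feed the network at that rate.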
> So bringing a new node up to speed is more like 11 days once it has
> crashed. I think this is the main reason the 1 terabyte limit exists to
> begin with, right?
>
> From an ops perspective, this could sound like a nightmare scenario of
> waiting 10 days... maybe it is livable though. Either way, I thought it
> would be good to share the numbers. ALSO, that is assuming the bus with
> its 10 disks can keep up with 10G???? Can it? What is the limit of
> throughput per second on the bus of the computers we have? Wikipedia
> shows a huge variance.
>
> What is the rate of the disks too (multiplied by 10, of course)? Will
> they keep up with a 10G rate for bringing a new node online?
>
> This all comes into play even more when you want to double the size of
> your cluster, of course, as all nodes have to transfer half of what they
> have to the new nodes that come online (cassandra actually has a very
> data center/rack aware topology to transfer data correctly and not use
> up bandwidth unnecessarily... I am not sure mongodb has that). Anyways,
> just food for thought.
>
> From: aaron morton <aaron@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Monday, February 18, 2013 1:39 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Vegard
> Berget <post@fantasista.no>
> Subject: Re: cassandra vs. mongodb quick question
>
> My experience is that repair of 300 GB of compressed data takes longer
> than 300 GB of uncompressed, but I cannot point to an exact number.
> Calculating the differences is mostly CPU bound and works on the
> uncompressed data.
>
> Streaming uses compression (after uncompressing the on-disk data).
>
> So if you have 300 GB of compressed data, take a look at how long repair
> takes and see if you are comfortable with that. You may also want to test
> replacing a node so you can get the procedure documented and understand
> how long it takes.
> The idea of the soft 300 GB to 500 GB limit came about because of a
> number of cases where people had 1 TB on a single node and were surprised
> that it took days to repair or replace. If you know how long things may
> take, and that fits your operations, then go with it.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/02/2013, at 10:08 PM, Vegard Berget <post@fantasista.no> wrote:
>
> Just out of curiosity:
>
> When using compression, does this affect things one way or another? Is
> 300 GB the (compressed) SSTable size, or the total size of the data?
>
> .vegard,
>
> ----- Original Message -----
> From: user@cassandra.apache.org
> To: <user@cassandra.apache.org>
> Sent: Mon, 18 Feb 2013 08:41:25 +1300
> Subject: Re: cassandra vs. mongodb quick question
>
> If you have spinning disks, 1G networking, and no virtual nodes, I would
> still say 300 GB to 500 GB is a soft limit.
>
> If you are using virtual nodes, SSDs, a JBOD disk configuration, or
> faster networking, you may go higher.
>
> The limiting factors are the time it takes to repair, the time it takes
> to replace a node, and the memory considerations for hundreds of millions
> of rows. If the performance of those operations is acceptable to you,
> then go crazy.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 16/02/2013, at 9:05 AM, "Hiller, Dean" <Dean.Hiller@nrel.gov> wrote:
>
> So I found out mongodb varies their node size from 1 TB to 42 TB per node
> depending on the profile. So if I was going to be writing a lot but
> rarely changing rows, could I also use cassandra with a per-node size of
> 20+ TB, or is that not advisable?
>
> Thanks,
> Dean
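Aaron's advice above boils down to: measure how long repair or node replacement takes at your current data size, and assume it scales roughly with on-disk size before going past the soft limit. A minimal sketch of that extrapolation (the 6-hour measurement for 300 GB below is a made-up placeholder, not a benchmark from the thread):

```python
# Hypothetical helper for the "measure, then extrapolate" advice.
# Assumes repair/replace time grows roughly linearly with on-disk size,
# which ignores compaction, memory pressure, and other nonlinear effects.
def extrapolate_hours(measured_gb: float, measured_hours: float,
                      target_gb: float) -> float:
    """Linearly extrapolate a measured repair/replace time to a new size."""
    return measured_hours * (target_gb / measured_gb)

# If a 300 GB repair were measured at 6 hours (placeholder figure):
print(extrapolate_hours(300, 6, 1_000))   # ~20 hours at 1 TB
print(extrapolate_hours(300, 6, 20_000))  # ~400 hours (~17 days) at 20 TB
```

Even under this optimistic linear assumption, a 20+ TB node turns a routine overnight repair into a multi-week operation, which is the crux of the soft-limit argument.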