From: Yan Chunlu <springrider@gmail.com>
Date: Sun, 10 Jul 2011 14:54:48 +0800
Subject: Re: how large cassandra could scale when it need to do manual operation?
To: user@cassandra.apache.org

I missed the consistency level part; thanks very much for the explanation.
That is clear enough.
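For reference, the consistency level arithmetic discussed below works out like
this; a minimal sketch in Python 2 (to match the snippet quoted later in the
thread), with the RF value chosen only for illustration:

    # Python 2, illustrative only.
    def quorum(rf):
        # replicas that must answer for a QUORUM read or write
        return rf // 2 + 1

    rf = 3                                                            # example value
    print "QUORUM with RF=%d needs %d replicas" % (rf, quorum(rf))    # -> 2
    print "replica losses survivable at QUORUM:", rf - quorum(rf)     # -> 1
    print "replica losses survivable at ONE:", rf - 1                 # -> 2

So with RF=3 a key can lose one replica and still be read and written at
QUORUM, and lose two and still be served at ONE, which is why availability
depends on the CL in use rather than on RF alone.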
On Sun, Jul 10, 2011 at 7:57 AM, aaron morton <aaron@thelastpickle.com> wrote:

> about the decommission problem, here is the link:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/how-to-decommission-two-slow-nodes-td5078455.html
>
> The key part of that post is "and since the second node was under heavy
> load, and not enough ram, it was busy GCing and worked horribly slow".
>
> maybe I was misunderstanding the replication factor; doesn't RF=3 mean
> I could lose two nodes and still have one replica available (with 100% of
> the keys), once Nodes >= 3?
>
> When you start losing replicas, the CL you use dictates whether the cluster
> is still up for 100% of the keys. See
> http://thelastpickle.com/2011/06/13/Down-For-Me/
>
> I have a strong urge to set RF to a very high value...
>
> As Chris said, 3 is about normal; it means the QUORUM CL is only 2 nodes.
>
> I am also trying to deploy cassandra across two datacenters (with 20ms
> latency).
>
> Look up LOCAL_QUORUM in the wiki.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
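For anyone curious what LOCAL_QUORUM looks like from client code, here is a
rough sketch using pycassa (the Python client of that era). It assumes a
keyspace defined with NetworkTopologyStrategy and replicas in both
datacenters; the host, keyspace and column family names are made up for the
example, and exact import paths may differ slightly by pycassa version:

    import pycassa

    # Connect to a node in the local datacenter (names are examples).
    pool = pycassa.ConnectionPool('demo', server_list=['dc1-node1:9160'])

    # LOCAL_QUORUM only waits for a quorum of replicas in the local DC,
    # keeping the 20ms cross-DC hop off the request path.
    cf = pycassa.ColumnFamily(
        pool, 'users',
        read_consistency_level=pycassa.ConsistencyLevel.LOCAL_QUORUM,
        write_consistency_level=pycassa.ConsistencyLevel.LOCAL_QUORUM)

    cf.insert('user1', {'name': 'example'})
    print cf.get('user1')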
> On 9 Jul 2011, at 02:01, Chris Goffinet wrote:
>
> As mentioned by Aaron, yes, we run hundreds of Cassandra nodes across
> multiple clusters. We run with RF of 2 and 3 (most common).
>
> We use commodity hardware and see failure all the time at this scale.
> We've never had 3 nodes that were in the same replica set fail all at
> once. We mitigate risk by being rack diverse, using different vendors for
> our hard drives, designing workflows to make sure machines get serviced
> in certain time windows, and running an extensive automated burn-in
> process (disk, memory, drives) so we don't roll out nodes/clusters that
> could fail right away.
>
> On Sat, Jul 9, 2011 at 12:17 AM, Yan Chunlu <springrider@gmail.com> wrote:
>
>> thank you very much for the reply, which gives me more confidence in
>> cassandra. I will try the automation tools; the examples you've listed
>> seem quite promising!
>>
>> about the decommission problem, here is the link:
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/how-to-decommission-two-slow-nodes-td5078455.html
>> I am also trying to deploy cassandra across two datacenters (with 20ms
>> latency), so I am worried the network latency will make it even worse.
>>
>> maybe I was misunderstanding the replication factor; doesn't RF=3 mean
>> I could lose two nodes and still have one replica available (with 100%
>> of the keys), once Nodes >= 3? besides, I am not sure what Twitter's RF
>> setting is, but it is possible to lose 3 nodes at the same time
>> (Facebook once lost photos because their RAID broke, though that rarely
>> happens). I have a strong urge to set RF to a very high value...
>>
>> Thanks!
>>
>> On Sat, Jul 9, 2011 at 5:22 AM, aaron morton <aaron@thelastpickle.com> wrote:
>>
>>> AFAIK Facebook Cassandra and Apache Cassandra diverged paths a long
>>> time ago. Twitter is a vocal supporter with a large Apache Cassandra
>>> install, e.g. "Twitter currently runs a couple hundred Cassandra nodes
>>> across a half dozen clusters."
>>> http://www.datastax.com/2011/06/chris-goffinet-of-twitter-to-speak-at-cassandra-sf-2011
>>>
>>> If you are working with a 3 node cluster, removing/rebuilding/whatever
>>> one node will affect 33% of your capacity. When you scale up, the
>>> contribution from each individual node goes down, and the impact of one
>>> node going down is less. Problems that happen with a few nodes will go
>>> away at scale, to be replaced by a whole set of new ones.
>>>
>>> 1): the load balancing needs to be performed manually on every node,
>>> according to:
>>>
>>> Yes
>>>
>>> 2): when adding new nodes, need to perform node repair and cleanup on
>>> every node
>>>
>>> You only need to run cleanup, see
>>> http://wiki.apache.org/cassandra/Operations#Bootstrap
>>>
>>> 3) when decommissioning a node, there is a chance that it slows down
>>> the entire cluster (not sure why, but I saw people asking about it),
>>> and the only fix is to shut down the entire cluster, rsync the data,
>>> and start all nodes without the decommissioned one.
>>>
>>> I cannot remember any specific cases where decommission requires a full
>>> cluster stop, do you have a link? With regard to slowing down, the
>>> decommission process will stream data from the node you are removing
>>> onto the other nodes; this can slow down the target nodes (I think it's
>>> more intelligent now about what is moved). This will be exaggerated in
>>> a 3 node cluster, as you are removing 33% of the processing and adding
>>> some (temporary) extra load to the remaining nodes.
>>>
>>> after all, I think there is a lot of human work needed to maintain the
>>> cluster, which makes it impossible to scale to thousands of nodes,
>>>
>>> Automation, Automation, Automation is the only way to go.
>>>
>>> Chef, Puppet, CF Engine for general config and deployment; Cloud Kick,
>>> munin, ganglia etc. for monitoring. And
>>> Ops Centre (http://www.datastax.com/products/opscenter) for Cassandra
>>> specific management.
>>>
>>> I hope I am totally wrong about all of this; currently I am serving
>>> 1 million pv every day with Cassandra and it makes me feel unsafe. I am
>>> afraid one day a node crash will corrupt the data and the whole cluster
>>> will go wrong....
>>>
>>> With RF 3 and a 3 node cluster you have room to lose one node and the
>>> cluster will be up for 100% of the keys. While better than having to
>>> worry about *the* database server, it's still entry level fault
>>> tolerance. With RF 3 in a 6 node cluster you can lose up to 2 nodes and
>>> still be up for 100% of the keys.
>>>
>>> Is there something you are specifically concerned about with your
>>> current installation?
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 8 Jul 2011, at 08:50, Yan Chunlu wrote:
>>>
>>> hi, all:
>>> I am curious how large Cassandra can scale.
>>>
>>> From the information I can find, the largest deployment is at Facebook,
>>> with about 150 nodes. Meanwhile they are using 2000+ nodes with Hadoop,
>>> and Yahoo is even using 4000 Hadoop nodes.
>>>
>>> I don't understand why that is; I have only a little knowledge of
>>> Cassandra and no knowledge of Hadoop.
>>>
>>> Currently I am using Cassandra with 3 nodes and having problems
>>> bringing one back after it went out of sync. The problems I encountered
>>> make me worry about how Cassandra could scale out:
>>>
>>> 1): the load balancing needs to be performed manually on every node,
>>> according to:
>>>
>>> # evenly spaced tokens for the RandomPartitioner (token space 0..2**127),
>>> # one per node; Python 2 syntax as posted
>>> def tokens(nodes):
>>>     for x in xrange(nodes):
>>>         print 2 ** 127 / nodes * x
>>>
>>> 2): when adding new nodes, need to perform node repair and cleanup on
>>> every node
>>>
>>> 3) when decommissioning a node, there is a chance that it slows down
>>> the entire cluster (not sure why, but I saw people asking about it),
>>> and the only fix is to shut down the entire cluster, rsync the data,
>>> and start all nodes without the decommissioned one.
>>>
>>> After all, I think there is a lot of human work needed to maintain the
>>> cluster, which would make it impossible to scale to thousands of nodes,
>>> but I hope I am totally wrong about all of this. Currently I am serving
>>> 1 million pv every day with Cassandra and it makes me feel unsafe; I am
>>> afraid one day a node crash will corrupt the data and the whole cluster
>>> will go wrong....
>>>
>>> On the contrary, a relational database makes me feel safe, but it does
>>> not scale well.
>>>
>>> thanks for any guidance here.
>>
>> --
>> Charles

--
Charles
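A worked example of the token helper quoted above, for a 3 node
RandomPartitioner ring (a sketch only; the cassandra.yaml / nodetool steps in
the comments are the usual way such tokens were applied on a 0.7/0.8 era
cluster and are stated here as an assumption about the reader's setup):

    # The helper from the original mail, wrapped so the values can be
    # printed and checked (Python 2, token space 0..2**127).
    def tokens(nodes):
        return [2 ** 127 / nodes * x for x in xrange(nodes)]

    for t in tokens(3):
        print t
    # 0
    # 56713727820156410577229101238628035242
    # 113427455640312821154458202477256070484
    #
    # Each value is normally set as initial_token in cassandra.yaml before a
    # node first starts, or applied to a running node with `nodetool move
    # <token>` followed by `nodetool cleanup` on the other nodes.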
