From: Colin Clark
To: user@cassandra.apache.org
Date: Sat, 7 Jun 2014 21:13:40 -0500
Subject: Re: Data model for streaming a large table in real time.

Not if you add another column to the partition key; source, for example.

I would really try to stay away from the ordered partitioner if at all
possible.

What ingestion rates are you expecting, in size and speed?

--
Colin
320-221-9531

On Jun 7, 2014, at 9:05 PM, Kevin Burton <burton@spinn3r.com> wrote:

Thanks for the feedback on this, btw; it's helpful. My notes below.

On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark <colin@clark.ws> wrote:

> No, you're not - the partition key will get distributed across the
> cluster if you're using random or murmur.

Yes… I'm aware. But in practice this is how it will work…

If we create bucket b0, that will get hashed to h0…

So say I have 50 machines performing writes; they are all on the same time
thanks to ntpd, so they all compute b0 for the current bucket based on the
time.

That gets hashed to h0…

If h0 is hosted on node0, then all writes go to node zero for that
one-second interval.

So all my writes are bottlenecking on one node. That node is *changing*
over time… but the writes are not being dispatched in parallel over N
nodes. At most, writes will only ever reach one node at a time.

> You could also ensure distribution by adding another column, like
> source. (Add the seconds to the partition key, not the clustering
> columns.)
>
> I can almost guarantee that if you put too much thought into working
> against what Cassandra offers out of the box, it will bite you later.

Sure… I'm trying to avoid the 'bite you later' issues, more so because
I'm sure there are Cassandra gotchas to worry about. Everything has them.
Just trying to avoid the land mines :-P

> In fact, the use case you're describing may best be served by a queuing
> mechanism, using Cassandra only for the underlying store.

Yes… that's what I'm doing. We're using Apollo to fan out the queue, but
the writes go back into Cassandra and need to be read out sequentially.

> I used this exact same approach in a use case that involved writing over
> a million events/second to a cluster with no problems. Initially, I
> thought the ordered partitioner was the way to go too. And I used
> separate processes to aggregate, conflate, and handle distribution to
> clients.

Yes. I think using 100 buckets will work for now. Plus I don't have to
change the partitioner on our existing cluster, and I'm lazy :)

> Just my two cents, but I also spend the majority of my days helping
> people utilize Cassandra correctly, and rescuing those that haven't.

Definitely appreciate the feedback! Thanks!

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.
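
A minimal sketch of the bucketed partition key discussed above. The table
name, column names, and the 100-bucket count are illustrative assumptions,
not details from the thread; the point is simply that putting a
source-derived bucket (together with the time bucket) in the partition key
lets all writes for the same second fan out across many partitions instead
of funnelling into one node.

    # Minimal sketch (assumptions: a hypothetical "events" table, 100
    # buckets, one-second time buckets; none of these names come from
    # the thread above).
    import hashlib
    import time

    NUM_BUCKETS = 100  # spreads each one-second slice over ~100 partitions

    # Hypothetical CQL schema: the bucket sits in the partition key, so the
    # murmur partitioner scatters concurrent writes across the cluster
    # instead of sending a whole second's worth of writes to a single node.
    CREATE_TABLE_CQL = """
    CREATE TABLE IF NOT EXISTS events (
        time_bucket bigint,    -- epoch seconds, truncated
        bucket      int,       -- 0 .. 99, derived from the source
        event_id    timeuuid,
        payload     blob,
        PRIMARY KEY ((time_bucket, bucket), event_id)
    );
    """

    def partition_key(source_id, now=None):
        """Compute the (time_bucket, bucket) partition key for one write.

        Every writer computes the same time_bucket for a given second, but
        the source-derived bucket fans those writes out over NUM_BUCKETS
        partitions, i.e. over many nodes rather than one.
        """
        ts = int(now if now is not None else time.time())
        digest = hashlib.md5(source_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % NUM_BUCKETS
        return ts, bucket

    if __name__ == "__main__":
        # 50 hypothetical writers hitting the same second land on many
        # distinct partitions instead of all mapping to the same token.
        second = time.time()
        keys = {partition_key("writer-%d" % i, second) for i in range(50)}
        print("%d distinct partition keys for one second" % len(keys))

The trade-off, as the thread notes, is on the read side: consuming the
data sequentially means querying all of the buckets for each time bucket
and merging the results in clustering order.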