From: Colin Clark
To: user@cassandra.apache.org
Date: Sat, 7 Jun 2014 21:13:40 -0500
Subject: Re: Data model for streaming a large table in real time.

Not if you add another column to the partition key; source, for example.

I would really try to stay away from the ordered partitioner if at all
possible.

What ingestion rates are you expecting, in size and speed?

--
Colin
320-221-9531

On Jun 7, 2014, at 9:05 PM, Kevin Burton <burton@spinn3r.com> wrote:

Thanks for the feedback on this, btw; it's helpful. My notes below.

On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark <colin@clark.ws> wrote:

> No, you're not - the partition key will get distributed across the
> cluster if you're using random or murmur.

Yes… I'm aware. But in practice this is how it will work…

If we create bucket b0, that will get hashed to h0…

So say I have 50 machines performing writes; they are all on the same time
thanks to ntpd, so they all compute b0 for the current bucket based on the
time.

That gets hashed to h0…

If h0 is hosted on node0, then all writes go to node zero for that
one-second interval.

So all my writes are bottlenecking on one node. That node is *changing*
over time… but the writes are not being dispatched in parallel over N
nodes. At most, writes will only ever reach one node at a time.

> You could also ensure distribution by adding another column, like
> source. (Add the seconds to the partition key, not the clustering
> columns.)
>
> I can almost guarantee that if you put too much thought into working
> against what Cassandra offers out of the box, it will bite you later.

Sure… I'm trying to avoid the 'bite you later' issues, more so because
I'm sure there are Cassandra gotchas to worry about. Everything has them.
Just trying to avoid the land mines :-P

> In fact, the use case you're describing may best be served by a queuing
> mechanism, using Cassandra only for the underlying store.

Yes… that's what I'm doing. We're using Apollo to fan out the queue, but
the writes go back into Cassandra and need to be read out sequentially.

> I used this exact same approach in a use case that involved writing over
> a million events/second to a cluster with no problems. Initially, I
> thought the ordered partitioner was the way to go too. And I used
> separate processes to aggregate, conflate, and handle distribution to
> clients.

Yes. I think using 100 buckets will work for now. Plus I don't have to
change the partitioner on our existing cluster, and I'm lazy :)

> Just my two cents, but I also spend the majority of my days helping
> people utilize Cassandra correctly, and rescuing those that haven't.

Definitely appreciate the feedback! Thanks!

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.
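
A minimal sketch of the bucketed partition key discussed above. The table
name, column names, and the 100-bucket count are illustrative assumptions,
not details from the thread; the point is simply that putting a
source-derived bucket (together with the time bucket) in the partition key
lets all writes for the same second fan out across many partitions instead
of funnelling into one node.

    # Minimal sketch (assumptions: a hypothetical "events" table, 100
    # buckets, one-second time buckets; none of these names come from
    # the thread above).
    import hashlib
    import time

    NUM_BUCKETS = 100  # spreads each one-second slice over ~100 partitions

    # Hypothetical CQL schema: the bucket sits in the partition key, so the
    # murmur partitioner scatters concurrent writes across the cluster
    # instead of sending a whole second's worth of writes to a single node.
    CREATE_TABLE_CQL = """
    CREATE TABLE IF NOT EXISTS events (
        time_bucket bigint,    -- epoch seconds, truncated
        bucket      int,       -- 0 .. 99, derived from the source
        event_id    timeuuid,
        payload     blob,
        PRIMARY KEY ((time_bucket, bucket), event_id)
    );
    """

    def partition_key(source_id, now=None):
        """Compute the (time_bucket, bucket) partition key for one write.

        Every writer computes the same time_bucket for a given second, but
        the source-derived bucket fans those writes out over NUM_BUCKETS
        partitions, i.e. over many nodes rather than one.
        """
        ts = int(now if now is not None else time.time())
        digest = hashlib.md5(source_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % NUM_BUCKETS
        return ts, bucket

    if __name__ == "__main__":
        # 50 hypothetical writers hitting the same second land on many
        # distinct partitions instead of all mapping to the same token.
        second = time.time()
        keys = {partition_key("writer-%d" % i, second) for i in range(50)}
        print("%d distinct partition keys for one second" % len(keys))

The trade-off, as the thread notes, is on the read side: consuming the
data sequentially means querying all of the buckets for each time bucket
and merging the results in clustering order.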