From: Colin Clark
Date: Sat, 7 Jun 2014 21:38:12 -0500
Subject: Re: Data model for streaming a large table in real time.
To: user@cassandra.apache.org

With 100 nodes, that ingestion rate is actually quite low, and I don't
think you'd need another column in the partition key.

You seem to be set in your current direction. Let us know how it works out.

--
Colin
320-221-9531

On Jun 7, 2014, at 9:18 PM, Kevin Burton wrote:

What's 'source'? You mean like the URL?

If source is too random, it's going to yield too many buckets.

Ingestion rates are fairly high, but not insane. About 4M inserts per
hour, from 5-10GB…

On Sat, Jun 7, 2014 at 7:13 PM, Colin Clark wrote:

> Not if you add another column to the partition key; source, for example.
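For concreteness, here is a minimal sketch of that composite-partition-key
idea, using the DataStax Python driver. The contact point, keyspace, table,
and column names are all hypothetical; the point is the composite key
(time_bucket, source), which spreads each one-second bucket over as many
partitions as there are distinct sources:

    from cassandra.cluster import Cluster  # DataStax Python driver
    import time

    # Hypothetical contact point and keyspace, purely for illustration.
    session = Cluster(['127.0.0.1']).connect()

    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS sketch
        WITH replication = {'class': 'SimpleStrategy',
                            'replication_factor': 1}
    """)

    session.execute("""
        CREATE TABLE IF NOT EXISTS sketch.events (
            time_bucket bigint,    -- e.g. epoch seconds
            source      text,      -- second partition-key column, for fan-out
            seq         timeuuid,  -- clustering column: ordered per partition
            payload     blob,
            PRIMARY KEY ((time_bucket, source), seq)
        )
    """)

    # Writers from any machine now scatter across the cluster even
    # within a single one-second bucket.
    ins = session.prepare(
        "INSERT INTO sketch.events (time_bucket, source, seq, payload) "
        "VALUES (?, ?, now(), ?)")
    session.execute(ins, (int(time.time()), 'feed-42', b'event-bytes'))

The trade-off is on the read side: a query for a given second now has to
fan in across every known source, which is the "too many buckets" concern
Kevin raises above.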
>
> I would really try to stay away from the ordered partitioner if at all
> possible.
>
> What ingestion rates are you expecting, in size and speed?
>
> --
> Colin
> 320-221-9531
>
> On Jun 7, 2014, at 9:05 PM, Kevin Burton wrote:
>
> Thanks for the feedback on this btw… it's helpful. My notes below.
>
> On Sat, Jun 7, 2014 at 5:14 PM, Colin Clark wrote:
>
>> No, you're not; the partition key will get distributed across the
>> cluster if you're using random or murmur.
>
> Yes… I'm aware. But in practice this is how it will work…
>
> If we create bucket b0, that will get hashed to h0…
>
> So say I have 50 machines performing writes; they are all on the same
> time thanks to ntpd, so they all compute b0 for the current bucket based
> on the time.
>
> That gets hashed to h0…
>
> If h0 is hosted on node0… then all writes go to node zero for that
> 1-second interval.
>
> So all my writes are bottlenecking on one node. That node is *changing*
> over time… but they're not being dispatched in parallel over N nodes.
> At most, writes will only ever reach one node at a time. (A short sketch
> at the end of this message illustrates this.)
>
>> You could also ensure distribution by adding another column, like
>> source. (Add the seconds to the partition key, not the clustering
>> columns.)
>>
>> I can almost guarantee that if you put too much thought into working
>> against what Cassandra offers out of the box, it will bite you later.
>
> Sure… I'm trying to avoid the 'bite you later' issues, more so because
> I'm sure there are Cassandra gotchas to worry about. Everything has
> them. Just trying to avoid the land mines :-P
>
>> In fact, the use case that you're describing may best be served by a
>> queuing mechanism, and using Cassandra only for the underlying store.
>
> Yes… that's what I'm doing. We're using Apollo to fan out the queue, but
> the writes go back into Cassandra and need to be read out sequentially.
>
>> I used this exact same approach in a use case that involved writing
>> over a million events/second to a cluster with no problems. Initially,
>> I thought the ordered partitioner was the way to go too. And I used
>> separate processes to aggregate, conflate, and handle distribution to
>> clients.
>
> Yes. I think using 100 buckets will work for now. Plus, I don't have to
> change the partitioner on our existing cluster, and I'm lazy :)
>
>> Just my two cents, but I also spend the majority of my days helping
>> people utilize Cassandra correctly, and rescuing those that haven't.
>
> Definitely appreciate the feedback! Thanks!
>
> --
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> Skype: burtonator
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
>
> War is peace. Freedom is slavery. Ignorance is strength. Corporations
> are people.

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
Skype: burtonator
blog: http://burtonator.wordpress.com
… or check out my Google+ profile

War is peace. Freedom is slavery. Ignorance is strength. Corporations are
people.
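The bucket-hashing bottleneck described in the thread is easy to
demonstrate. Below is a small, self-contained Python sketch; the md5-based
token function is only a stand-in for Cassandra's Murmur3 partitioner, and
the timestamp and source names are made up:

    import hashlib

    def token(partition_key):
        # Stand-in for the Murmur3 partitioner; any stable hash shows
        # the effect: identical keys always map to identical tokens,
        # and a token determines which replica set owns the write.
        return int(hashlib.md5(partition_key.encode()).hexdigest(), 16)

    now = 1402195097  # one fixed second; 50 writers share it via ntpd

    # Time-only partition key: every writer computes the same bucket,
    # so this second's writes all land on one token (one replica set).
    time_only = {token(str(now)) for _ in range(50)}
    print(len(time_only))   # 1

    # (time_bucket, source) composite key: the same second now maps to
    # as many tokens as there are distinct sources, spreading the load.
    composite = {token("%d:source-%d" % (now, i)) for i in range(50)}
    print(len(composite))   # 50, with overwhelming probability

The same logic explains why 100 buckets works at Kevin's rate: each second
fans out over 100 tokens rather than one, without changing the partitioner.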