From: Eric Stevens
Date: Sat, 7 Mar 2015 15:19:41 -0700
Subject: Re: best practices for time-series data with massive amounts of records
To: user@cassandra.apache.org

It's probably quite rare for queries against extremely large time series data to touch the whole set of data. Instead there's almost always a "between X and Y dates" aspect to nearly every real-time query you might run against a table like this (with the exception of "most recent N events").

Because of this, time bucketing can be an effective strategy, though until you understand your data better, it's hard to know how large (or small) to make your buckets. Because of *that*, I recommend using the timestamp data type for your bucketing strategy - this gives you the advantage of being able to reduce your bucket sizes later while keeping your at-rest data mostly still quite accessible.
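For concreteness, here's a minimal sketch of a table bucketed that way (table and column names are just illustrative, not from any schema you've described):

    CREATE TABLE user_events (
        user_id    text,
        bucket     timestamp,  -- event time floored to the bucket size (day, hour, ...)
        event_time timeuuid,
        payload    text,
        PRIMARY KEY ((user_id, bucket), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC);

    -- a "between X and Y dates" query then touches one partition per bucket
    -- in the range, e.g. for the 2015-03-07 day bucket:
    SELECT * FROM user_events
     WHERE user_id = 'u123'
       AND bucket = '2015-03-07'
       AND event_time > maxTimeuuid('2015-03-07 06:00+0000')
       AND event_time < minTimeuuid('2015-03-07 18:00+0000');

The client just iterates that query over each bucket covered by [X, Y).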
What I mean is that if you change your bucketing strategy from day to hour, then when you query across the changed time period you can iterate at the finer granularity (hour) buckets, and you'll pick up the coarser granularity (day) buckets automatically for all but the earliest bucket (which is easy to correct for when you're flooring your start bucket). In the coarser time period most reads are partition key misses, which are extremely inexpensive in Cassandra.

If you do need most-recent-N queries over broad ranges, and you expect some users whose click rate is dramatically less frequent than your bucket interval (making iterating over buckets inefficient), you can keep a separate counter table with a PK of ((user_id), bucket) in which you count new events. Now you can identify the exact set of buckets you need to read to satisfy the query no matter what the user's click volume is (so very low volume users have at most N partition keys queried, and higher volume users query fewer partition keys). A rough sketch of that counter table follows the quoted messages below.

On Fri, Mar 6, 2015 at 4:06 PM, graham sanderson <graham@vast.com> wrote:

> Note that using static column(s) for the "head" value, and trailing TTLed
> values behind, is something we're considering. Note this is especially nice
> if your head state includes say a map which is updated by small deltas
> (individual keys)
>
> We have not yet studied the effect of static columns on say DTCS
>
>
> On Mar 6, 2015, at 4:42 PM, Clint Kelly <clint.kelly@gmail.com> wrote:
>
> Hi all,
>
> Thanks for the responses, this was very helpful.
>
> I don't know yet what the distribution of clicks and users will be, but I
> expect to see a few users with an enormous amount of interactions and most
> users having very few. The idea of doing some additional manual
> partitioning, and then maintaining another table that contains the "head"
> partition for each user makes sense, although it would add additional
> latency when we want to get say the most recent 1000 interactions for a
> given user (which is something that we have to do sometimes for
> applications with tight SLAs).
>
> FWIW I doubt that any users will have so many interactions that they
> exceed what we could reasonably put in a row, but I wanted to have a
> strategy to deal with this.
>
> Having a nice design pattern in Cassandra for maintaining a row with the
> N-most-recent interactions would also solve this reasonably well, but I
> don't know of any way to implement that without running batch jobs that
> periodically clean out data (which might be okay).
>
> Best regards,
> Clint
>
>
> On Tue, Mar 3, 2015 at 8:10 AM, mck <mck@apache.org> wrote:
>
>> > Here "partition" is a random digit from 0 to (N*M)
>> > where N=nodes in cluster, and M=arbitrary number.
>>
>> Hopefully it was obvious, but here (unless you've got hot partitions),
>> you don't need N.
>> ~mck
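Here's the counter-table sketch I referred to above (again, names are illustrative; note that every non-primary-key column of a counter table has to be a counter):

    CREATE TABLE user_event_counts (
        user_id text,
        bucket  timestamp,
        events  counter,
        PRIMARY KEY ((user_id), bucket)
    );

    -- bump the matching bucket's counter whenever you write an event:
    UPDATE user_event_counts SET events = events + 1
     WHERE user_id = 'u123' AND bucket = '2015-03-07';

    -- for most-recent-N, read the newest buckets first and stop once the
    -- running total of counts reaches N; only those buckets then need to be
    -- queried in the events table:
    SELECT bucket, events FROM user_event_counts
     WHERE user_id = 'u123'
     ORDER BY bucket DESC
     LIMIT 20;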