From: Tyler Hobbs
To: user@cassandra.apache.org
Date: Wed, 9 Jan 2013 13:21:25 -0600
Subject: Re: Date Index?
Mailing-List: user@cassandra.apache.org

If you're going to be looking data up by date ranges frequently, I strongly
suggest you go with a typical time-series pattern (what Aaron described as
hand-rolled indexes):

http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra

If you're just running these date-based queries occasionally and the result
set won't be huge, then using secondary indexes as you described is a
convenient but not terribly efficient way to do that.

On Wed, Jan 9, 2013 at 10:04 AM, Michael Kjellman wrote:

> ElasticSearch is a nice option for ordered lists. In 2.0 triggers would
> fit updates to elastic search much easier as right now it's in your
> application logic to detect changes and update.
>
> On Jan 9, 2013, at 7:55 AM, "Stephen.M.Thompson@wellsfargo.com" <
> Stephen.M.Thompson@wellsfargo.com> wrote:
>
> Thanks Aaron, that helps. So is there anything approaching a “consensus”
> of how to do something like this?
>
> You mention a custom index … is there a good document on creating a custom
> index? Google doesn’t show me much.
>
> Steve
>
> From: aaron morton [mailto:aaron@thelastpickle.com]
> Sent: Tuesday, January 08, 2013 9:35 PM
> To: user@cassandra.apache.org
> Subject: Re: Date Index?
>
> There has to be one equality clause in there, and thats the thing to
> cassandra uses to select of disk. The others are in memory filters.
>
> So if you have one on the year+month you can have a simple select clause
> and it limits the amount of data that has to be read.
>
> If you have like many 10's to 100's millions of things in the same month
> you may want to do some performance testing. There can still be times when
> you want to support common read paths by using custom / hand rolled indexes.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9/01/2013, at 6:05 AM, Stephen.M.Thompson@wellsfargo.com wrote:
>
> Hi folks –
>
> Question about secondary indexes. How are people doing date indexes? I
> have a date column in my tables in RDBMS that we use frequently, such as
> look at all records recorded in the last month. What is the best practice
> for being able to do such a query? It seems like there could be an
> advantage to adding a couple of columns like this:
>
>     {timestamp=2013/01/08 12:32:01 -0500}
>     {month=201301}
>     {day=08}
>
> And then I could do secondary index on the month and day columns? Would
> that be the best way to do something like this? Is there any accepted
> “best practice” on this yet?
>
> Thanks!
> Steve

-- 
Tyler Hobbs
DataStax

