Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of tyler@datastax.com designates
 209.85.215.53 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <8982CA96-6FCE-49FB-9DE7-B3386D2EFB8C@barracuda.com>
References: 
 <333B362E7B77B344A2D0FD92840282611F7F28CA3E@MSGCMSIL1003.ent.wfb.bank.corp>
	<AB9B82B9-395A-4AB9-9B11-09CF651CDF3E@thelastpickle.com>
	<333B362E7B77B344A2D0FD92840282611F7F28D219@MSGCMSIL1003.ent.wfb.bank.corp>
	<8982CA96-6FCE-49FB-9DE7-B3386D2EFB8C@barracuda.com>
Date: Wed, 9 Jan 2013 13:21:25 -0600
Message-ID: 
 <CAAam9ssfUQNBHezfkGH-vXYLU74DzhkwtJ4JwKfrj=1q5UqyhA@mail.gmail.com>
Subject: Re: Date Index?
From: Tyler Hobbs <tyler@datastax.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=f46d04426ccce90b8604d2dff72a

--f46d04426ccce90b8604d2dff72a
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

If you're going to be looking data up by date ranges frequently, I strongly
suggest you go with a typical time-series pattern (what Aaron described as
hand-rolled indexes):

http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
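
To make the pattern concrete, here is a minimal Python sketch of the idea. It
is not code from either post. It assumes one index row per calendar month,
with the entries in each row kept sorted by timestamp, and it models the rows
in memory instead of talking to Cassandra. The names TimeSeriesIndex,
bucket_key and buckets_between are made up for illustration.

```python
from bisect import bisect_left, insort
from collections import defaultdict
from datetime import datetime


def bucket_key(ts):
    """Row key of the month bucket that holds ts, e.g. '201301'."""
    return ts.strftime("%Y%m")


def buckets_between(start, end):
    """Month bucket keys covering the inclusive range [start, end]."""
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append("%04d%02d" % (year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return keys


class TimeSeriesIndex:
    """In-memory stand-in for a hand-rolled index: one row per month,
    entries inside a row kept sorted by timestamp."""

    def __init__(self):
        # bucket key -> sorted list of (timestamp, record_id)
        self.rows = defaultdict(list)

    def add(self, ts, record_id):
        insort(self.rows[bucket_key(ts)], (ts, record_id))

    def query(self, start, end):
        """Record ids with start <= timestamp <= end, oldest first."""
        result = []
        for key in buckets_between(start, end):
            row = self.rows.get(key, [])
            # (start,) sorts before any (start, record_id) entry.
            first = bisect_left(row, (start,))
            for ts, record_id in row[first:]:
                if ts > end:
                    break
                result.append(record_id)
        return result


index = TimeSeriesIndex()
index.add(datetime(2012, 12, 31, 23, 0), "a")
index.add(datetime(2013, 1, 8, 12, 32, 1), "b")
index.add(datetime(2013, 1, 20), "c")
last_month = index.query(datetime(2012, 12, 9), datetime(2013, 1, 9))
```

A "last month" query only touches the one or two month rows that overlap the
range, and reads a contiguous slice from each. The bucket size (month here) is
a tuning choice that depends on how many entries land in each bucket.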

If you're just running these date-based queries occasionally and the result
set won't be huge, then using secondary indexes as you described is a
convenient but not terribly efficient way to do that.
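
For comparison, here is a rough Python sketch of the shape of the
secondary-index approach, following Aaron's description quoted below: one
equality clause (on month) selects the candidate rows, and any other clause
(here a day range) is applied as an in-memory filter. It is an in-memory
model only and says nothing about how Cassandra implements its indexes. The
names MonthIndexedTable and derived_columns are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def derived_columns(ts):
    """Columns stored with each row, including the derived month and day."""
    return {
        "timestamp": ts,
        "month": ts.strftime("%Y%m"),  # e.g. '201301'
        "day": ts.strftime("%d"),      # e.g. '08'
    }


class MonthIndexedTable:
    """Rows keyed by id, plus a lookup from month value to row keys that
    plays the role of a secondary index on the month column."""

    def __init__(self):
        self.rows = {}
        self.by_month = defaultdict(set)

    def insert(self, key, ts):
        # Sketch only: re-inserting an existing key under a new month is
        # not handled here.
        columns = derived_columns(ts)
        self.rows[key] = columns
        self.by_month[columns["month"]].add(key)

    def select(self, month, first_day="01", last_day="31"):
        """Equality on month picks candidates; the day range is then
        checked row by row in memory."""
        candidates = self.by_month.get(month, set())
        return sorted(
            key for key in candidates
            if first_day <= self.rows[key]["day"] <= last_day
        )


table = MonthIndexedTable()
table.insert("r1", datetime(2013, 1, 8, 12, 32, 1))
table.insert("r2", datetime(2013, 1, 20))
table.insert("r3", datetime(2012, 12, 31))
first_ten_days = table.select("201301", "01", "10")
```

Every row in the selected month is a candidate, so the cost of the filter
grows with the number of rows per month. That is why this works for
occasional queries with small result sets and gets expensive for large ones.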


On Wed, Jan 9, 2013 at 10:04 AM, Michael Kjellman
<mkjellman@barracuda.com> wrote:

> ElasticSearch is a nice option for ordered lists. In 2.0, triggers should
> make pushing updates to ElasticSearch much easier; right now it is up to
> your application logic to detect changes and update it.
>
> On Jan 9, 2013, at 7:55 AM, "Stephen.M.Thompson@wellsfargo.com" <
> Stephen.M.Thompson@wellsfargo.com> wrote:
>
> Thanks Aaron, that helps.  So is there anything approaching a "consensus"
> of how to do something like this?
>
> You mention a custom index ... is there a good document on creating a
> custom index?  Google doesn't show me much.
>
> Steve
>
> *From:* aaron morton [mailto:aaron@thelastpickle.com]
> *Sent:* Tuesday, January 08, 2013 9:35 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Date Index?
>
> There has to be one equality clause in there, and that's the thing
> Cassandra uses to select off disk. The others are in-memory filters.
>
> So if you have one on the year+month you can have a simple select clause
> and it limits the amount of data that has to be read.
>
> If you have many tens to hundreds of millions of things in the same month
> you may want to do some performance testing. There can still be times when
> you want to support common read paths by using custom / hand-rolled indexes.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9/01/2013, at 6:05 AM, Stephen.M.Thompson@wellsfargo.com wrote:
>
> Hi folks -
>
> Question about secondary indexes.  How are people doing date indexes?  I
> have a date column in my tables in RDBMS that we use frequently, such as
> look at all records recorded in the last month.  What is the best practice
> for being able to do such a query?  It seems like there could be an
> advantage to adding a couple of columns like this:
>
>                 {timestamp=2013/01/08 12:32:01 -0500}
>                 {month=201301}
>                 {day=08}
>
> And then I could do secondary index on the month and day columns?  Would
> that be the best way to do something like this?  Is there any accepted
> "best practice" on this yet?
>
> Thanks!
>
> Steve
>


-- 
Tyler Hobbs
DataStax <http://datastax.com/>

--f46d04426ccce90b8604d2dff72a--