cassandra-user mailing list archives

From Tyler Hobbs <ty...@datastax.com>
Subject Re: Date Index?
Date Wed, 09 Jan 2013 19:21:25 GMT
If you're going to be looking data up by date ranges frequently, I strongly
suggest you go with a typical time-series pattern (what Aaron described as
hand-rolled indexes):

http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
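
For anyone who wants the gist before reading those posts, here is a rough sketch of the bucketing idea in plain Python. It is an in-memory stand-in, not Cassandra client code: the class name `TimeSeriesIndex`, the one-row-per-day bucket size, and the `YYYYMMDD` row key format are all assumptions made for illustration. The point is only that each bucket is a row whose columns are kept sorted by timestamp, so a date range query touches a known set of rows and slices each one.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timedelta


class TimeSeriesIndex:
    """Hypothetical in-memory model of a wide-row time-series index."""

    def __init__(self):
        # row key (one per day) -> list of (timestamp, record_id), kept sorted
        self.rows = defaultdict(list)

    @staticmethod
    def row_key(ts):
        # Assumed bucket size: one row per calendar day.
        return ts.strftime("%Y%m%d")

    def insert(self, ts, record_id):
        # Columns within a row stay ordered by timestamp.
        bisect.insort(self.rows[self.row_key(ts)], (ts, record_id))

    def query(self, start, end):
        """Return record ids with start <= timestamp < end, in time order."""
        results = []
        day = start.date()
        while day <= end.date():
            columns = self.rows.get(day.strftime("%Y%m%d"), [])
            # A 1-tuple sorts before any 2-tuple with the same timestamp,
            # so these bounds give an inclusive start and an exclusive end.
            lo = bisect.bisect_left(columns, (start,))
            hi = bisect.bisect_left(columns, (end,))
            results.extend(record_id for _, record_id in columns[lo:hi])
            day += timedelta(days=1)
        return results
```

With this layout, "all records in January 2013" is `index.query(datetime(2013, 1, 1), datetime(2013, 2, 1))`, which reads one row per day in the range and nothing else. A coarser or finer bucket is a tuning choice that depends on how much data lands in each bucket.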

If you're just running these date-based queries occasionally and the result
set won't be huge, then using secondary indexes as you described is a
convenient but not terribly efficient way to do that.
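
To make the secondary-index variant from the quoted thread concrete, here is a small Python model of it. Again this is only a sketch with made-up names (`date_columns`, `MonthIndexedTable`), not real driver code: it derives the `month` and `day` columns from the timestamp as in the original question, uses the equality match on `month` as the index lookup, and applies the `day` condition as an in-memory filter over whatever the index returned.

```python
from collections import defaultdict
from datetime import datetime


def date_columns(ts):
    # Extra columns derived from the timestamp, as proposed in the thread.
    return {
        "timestamp": ts,
        "month": ts.strftime("%Y%m"),
        "day": ts.strftime("%d"),
    }


class MonthIndexedTable:
    """Hypothetical model of a table with a secondary index on `month`."""

    def __init__(self):
        self.rows = {}                    # row key -> column dict
        self.by_month = defaultdict(set)  # indexed value -> row keys

    def insert(self, key, ts):
        # Sketch only: re-inserting an existing key with a different month
        # is not handled (the old index entry would be left behind).
        columns = date_columns(ts)
        self.rows[key] = columns
        self.by_month[columns["month"]].add(key)

    def select(self, month, day=None):
        # The equality clause on month decides which rows are read at all.
        candidates = self.by_month.get(month, set())
        # Any further condition is checked row by row after the read.
        return sorted(
            key for key in candidates
            if day is None or self.rows[key]["day"] == day
        )
```

Note that `select("201301", day="08")` still reads every row for the month before filtering, which is why this is fine for occasional queries over modest result sets and less attractive when a month holds a very large number of rows.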


On Wed, Jan 9, 2013 at 10:04 AM, Michael Kjellman
<mkjellman@barracuda.com> wrote:

> ElasticSearch is a nice option for ordered lists. In 2.0, triggers should
> make pushing updates to ElasticSearch much easier; right now your
> application logic has to detect changes and push the update itself.
>
> On Jan 9, 2013, at 7:55 AM, "Stephen.M.Thompson@wellsfargo.com" <
> Stephen.M.Thompson@wellsfargo.com> wrote:
>
> Thanks Aaron, that helps.  So is there anything approaching a “consensus”
> of how to do something like this?
>
> You mention a custom index … is there a good document on creating a custom
> index?  Google doesn’t show me much.
>
> Steve
>
> *From:* aaron morton [mailto:aaron@thelastpickle.com]
> *Sent:* Tuesday, January 08, 2013 9:35 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Date Index?
>
> There has to be one equality clause in there, and that's the one
> Cassandra uses to select rows off disk. The others are applied as
> in-memory filters.
>
> So if you have one on the year+month you can have a simple select clause
> and it limits the amount of data that has to be read.
>
> If you have many tens to hundreds of millions of things in the same month
> you may want to do some performance testing. There can still be times when
> you want to support common read paths by using custom / hand-rolled indexes.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9/01/2013, at 6:05 AM, Stephen.M.Thompson@wellsfargo.com wrote:
>
> Hi folks –
>
> Question about secondary indexes.  How are people doing date indexes?  I
> have a date column in my RDBMS tables that we use frequently, for example
> to look at all records recorded in the last month.  What is the best
> practice for being able to do such a query?  It seems like there could be
> an advantage to adding a couple of columns like this:
>
>                 {timestamp=2013/01/08 12:32:01 -0500}
>                 {month=201301}
>                 {day=08}
>
> And then I could put a secondary index on the month and day columns?  Would
> that be the best way to do something like this?  Is there any accepted
> “best practice” on this yet?
>
> Thanks!
>
> Steve
>



-- 
Tyler Hobbs
DataStax <http://datastax.com/>
