cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5514) Allow timestamp hints
Date Tue, 14 May 2013 09:59:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13656934#comment-13656934 ]

Sylvain Lebresne commented on CASSANDRA-5514:
---------------------------------------------

bq. We should be able to eliminate sstables based on the entire column name, even if it is
a composite one right?

Unfortunately, that won't work as well as we'd like. Say your table is something like:
{noformat}
CREATE TABLE timeline (
  key int,
  category text,
  time timestamp,
  value1 int,
  value2 text,
  PRIMARY KEY (key, category, time)
);
{noformat}
i.e. the (internal) column names are composites of the category followed by the time, giving
one timeline per category (not saying that model is smart, it's just an example to illustrate :)).

Now say in one sstable you only have entries whose {{time}} is <= 10, but for {{category}}
"a" through "z". So the min/max column names might be "a:0"/"z:10". So if you do a query like:
{noformat}
SELECT * FROM timeline WHERE key = 3 AND category = 'c' AND time > 100
{noformat}
then we cannot eliminate the sstable above: composite names compare component by component,
so the category decides the order and "a:0" < "c:100" < "z:10".
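
To make the comparison concrete, here is a minimal sketch in Python. It is only an illustration, not Cassandra's code: composite column names are modeled as (category, time) tuples, which Python orders component by component, and {{may_overlap_composite}} is a hypothetical helper. The strict bound {{time > 100}} is treated as inclusive, which is the conservative choice for a skip check.

```python
# Illustration only: composite names modeled as (category, time) tuples.
# Python compares tuples component by component, like a composite comparator.

def may_overlap_composite(sstable_min, sstable_max, slice_start, slice_end):
    """Hypothetical helper: True if the queried slice may intersect
    the sstable's [min, max] range of composite column names."""
    return slice_start <= sstable_max and slice_end >= sstable_min

# The sstable from the example: categories "a".."z", time 0..10.
sstable_min = ("a", 0)
sstable_max = ("z", 10)

# category = 'c' AND time > 100, i.e. the slice from "c:100" to the end of "c".
slice_start = ("c", 100)
slice_end = ("c", float("inf"))

# True: the sstable cannot be skipped, although it holds no time above 10.
cannot_skip = may_overlap_composite(sstable_min, sstable_max, slice_start, slice_end)
print(cannot_skip)
```

The first component alone settles the comparison, so the time bounds never get a chance to rule the sstable out.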

However, if we keep each component separately, we would keep: for {{category}}, min/max =
"a"/"z" and for {{time}}, min/max = 0/10. And from that we can eliminate the sstable for the
query above since the time queried is not in the min/max range for {{time}}.
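
The per-component check can be sketched the same way. Again this is an assumption-laden illustration rather than the actual patch: the dictionaries and the {{may_contain}} helper are hypothetical, and {{time > 100}} is approximated by an inclusive lower bound of 100, which can only make the check more conservative.

```python
# Illustration only: one (min, max) pair is kept per clustering component.

def may_contain(component_ranges, query_ranges):
    """Hypothetical helper: False as soon as any queried component range
    falls entirely outside the sstable's min/max for that component."""
    for name, (q_lo, q_hi) in query_ranges.items():
        s_lo, s_hi = component_ranges[name]
        if q_hi < s_lo or q_lo > s_hi:
            return False
    return True

# Per-component statistics for the sstable from the example.
sstable_stats = {"category": ("a", "z"), "time": (0, 10)}

# category = 'c' AND time > 100
query = {"category": ("c", "c"), "time": (100, float("inf"))}

# True: the time range 100.. lies outside 0..10, so the sstable can be skipped.
can_skip = not may_contain(sstable_stats, query)
print(can_skip)
```

The check can only prove that an sstable is irrelevant. When every component range overlaps, the sstable still has to be read.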


                
> Allow timestamp hints
> ---------------------
>
>                 Key: CASSANDRA-5514
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5514
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Marcus Eriksson
>             Fix For: 2.0
>
>         Attachments: 0001-CASSANDRA-5514-v1.patch
>
>
> Slice queries can't optimize based on timestamp except for rare cases (CASSANDRA-4116).
> However, many common queries involve an implicit time component, where the application author
> knows that he is only interested in data more recent than X, or older than Y.
> We could use the per-sstable max and min timestamps we track to avoid touching cold data
> if we could pass a hint to Cassandra about the time range we care about.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
