cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8855) Batching SELECTs
Date Mon, 23 Feb 2015 23:33:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334081#comment-14334081 ]

Sylvain Lebresne commented on CASSANDRA-8855:
---------------------------------------------

Well, if we want to support that, the proper syntax is just to allow {{OR}} clauses, as in:
{noformat}
SELECT value FROM events WHERE event_type='myEvent' AND ((time > '2011-02-03' AND time <=
'2012-01-01') OR (time > '2012-02-03' AND time <= '2013-01-01'));
{noformat}
And technically speaking, doing so is really just about allowing and handling the new syntax,
since we have everything we need internally to do this. The only reason we haven't implemented
it so far is that supporting {{OR}} properly in {{SelectStatement}} will require a bit
of refactoring (though it has probably gotten better on trunk). We also won't really be able
to support arbitrary {{OR}} clauses, so we need to be clear on what we support and what we
don't.

bq. In addition, how about supporting batch for multiple SELECTs across tables

As you can easily parallelize such queries yourself client side, I don't see much upside to
it, and it adds a *lot* of complication. So I'm fairly strongly opposed to the idea (but in
any case, it's a pretty separate question, so let's focus on one thing per ticket).
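
For illustration, client-side parallelization of the two range queries could look roughly like
the sketch below. It is only a sketch under stated assumptions: {{EVENTS}}, {{select_values}}
and {{select_ranges}} are hypothetical names, and an in-memory list stands in for the
{{events}} table so the example runs without a cluster. A real client would issue each SELECT
through its driver's asynchronous API instead of the stand-in function.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the `events` table: (event_type, time, value).
EVENTS = [
    ("myEvent", "2011-06-01", "a"),
    ("myEvent", "2012-01-15", "b"),
    ("myEvent", "2012-06-01", "c"),
    ("otherEvent", "2012-06-01", "d"),
]


def select_values(event_type, start, end):
    """Stand-in for one SELECT: event_type = ? AND time > start AND time <= end.

    ISO-8601 date strings compare correctly as plain strings.
    """
    return [v for (t, ts, v) in EVENTS if t == event_type and start < ts <= end]


def select_ranges(event_type, ranges, max_workers=4):
    """Run one query per (start, end) range concurrently.

    Results are returned in the same order as `ranges`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(select_values, event_type, s, e) for (s, e) in ranges]
        return [f.result() for f in futures]


# The two time ranges from the ticket description.
RANGES = [("2011-02-03", "2012-01-01"), ("2012-02-03", "2013-01-01")]
results = select_ranges("myEvent", RANGES)
```

Each range still costs its own round trip here; the queries merely overlap in time, which is
the trade-off being weighed in this ticket.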



>  Batching SELECTs 
> ------------------
>
>                 Key: CASSANDRA-8855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8855
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jay Patel
>
> SELECT’s IN clause allows batching selects for multiple partition keys of a given table.
> Can we consider supporting batch select for multiple column ranges for a given partition key?
> For instance, we would like to batch the two or more SELECTs below for a given partition key
> “event_type” and different ranges of “time”:
> SELECT value
> FROM events
> WHERE event_type = 'myEvent'
>   AND time > '2011-02-03'
>   AND time <= '2012-01-01'
> SELECT value
> FROM events
> WHERE event_type = 'myEvent'
>   AND time > '2012-02-03'
>   AND time <= '2013-01-01'
> One way to optimize these is to fire multiple SELECTs in parallel and asynchronously from the
> application, but by batching them we can do further optimizations, such as avoiding multiple round
> trips, both from the app server to C* and from the coordinator to the replicas. Once the request is
> received by the target replicas, we can return all the ranges requested for a particular partition
> key in one shot.
> This will be very useful for some of the use cases we're working on. I can take a first
> cut at this if there are no concerns.
> In addition, how about supporting batch for multiple SELECTs across tables? I think that
> will require more changes in ResultSet and may not offer a lot of opportunities for optimization.
> However, at least it will help to avoid multiple round trips from the app server to C*.
> Thoughts welcome. Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
