cassandra-commits mailing list archives

From "Sam Tunnicliffe (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10217) Support custom query expressions in SELECT
Date Thu, 17 Sep 2015 18:26:04 GMT


Sam Tunnicliffe commented on CASSANDRA-10217:

bq. I would have a rather strong preference for only allowing 1 custom expression per query

Cool, I'm more comfortable with that myself too.

bq. I'd prefer getting the {{RowFilter.CustomExpression}} serialization as clean as possible

Done, and it turns out we also weren't including the additional byte for {{Kind}} in {{serializedSize}},
so I've added that too.
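
To illustrate the kind of mismatch described here, the following is a minimal sketch and not the actual {{RowFilter}} code. The class {{ExpressionSizeSketch}}, its two-value {{Kind}} enum and the length-prefixed layout are all assumptions made for the example. The point is only that {{serializedSize}} has to count every field that {{serialize}} writes, including the single byte for the kind.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for an expression serializer; not the Cassandra classes.
final class ExpressionSizeSketch {
    enum Kind { SIMPLE, CUSTOM }

    // Writes one byte for the kind, then a 4-byte length, then the payload.
    static byte[] serialize(Kind kind, byte[] value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(kind.ordinal());
            out.writeInt(value.length);
            out.write(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Counts the same fields as serialize(): kind byte + length prefix + payload.
    static long serializedSize(Kind kind, byte[] value) {
        return 1 + Integer.BYTES + value.length;
    }
}
```

Omitting the leading {{1}} in {{serializedSize}} would make the reported size one byte smaller than what is actually written.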

bq. I'd prefer throwing an exception saying you need to upgrade all your nodes before doing
that kind of queries.

Added a check via {{MessagingService}}; custom expressions are now rejected while any node
in the cluster is on a version earlier than 3.0.
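
A rough sketch of the shape of such a check, under stated assumptions: {{VersionGateSketch}}, the constant {{MIN_SUPPORTED_VERSION}} and its value are illustrative only, and the peer versions are passed in as a plain collection rather than looked up through {{MessagingService}}.

```java
import java.util.Collection;

// Hypothetical helper; the real check consults MessagingService for peer versions.
final class VersionGateSketch {
    // Illustrative value standing in for the 3.0 messaging version.
    static final int MIN_SUPPORTED_VERSION = 10;

    // Throws if any known peer speaks a messaging version older than the minimum.
    static void checkAllNodesSupportCustomExpressions(Collection<Integer> peerVersions) {
        for (int version : peerVersions) {
            if (version < MIN_SUPPORTED_VERSION) {
                throw new IllegalStateException(
                    "Custom index expressions cannot be used until all nodes are upgraded");
            }
        }
    }
}
```

Failing the query up front keeps a coordinator from sending an expression that an older replica would not be able to deserialize.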

bq. we could allow any type by having the custom index actually tell us which type it expects

Done, added {{Index::customExpressionValueType}}; a null return value is taken to mean that
the index does not support custom expressions.
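
As a sketch of how the null convention could be consumed, assuming a simplified interface: {{IndexSketch}} and {{CustomExpressionSupport}} are hypothetical names, and {{Class<?>}} stands in for the real CQL type object that the index would return.

```java
// Hypothetical, simplified view of an index; Class<?> stands in for the real type object.
interface IndexSketch {
    // Returns the value type expected by custom expressions, or null if unsupported.
    Class<?> customExpressionValueType();
}

final class CustomExpressionSupport {
    // Resolves the expected value type, rejecting indexes that return null.
    static Class<?> valueTypeOrReject(String indexName, IndexSketch index) {
        Class<?> type = index.customExpressionValueType();
        if (type == null) {
            throw new IllegalArgumentException(
                "Custom expressions are not supported by index " + indexName);
        }
        return type;
    }
}
```

An index that expects a string query would return a string type here, while a regular index would simply return null.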

bq. I'd prefer the {{toString()}} method of {{RowFilter.CustomExpression}} to return...just
{{String.format("expr(%s, %s)",, UTF8Type.instance.getString(value))}}.

Based on the previous point, it's using the actual type rather than assuming a string, which
involves getting the {{CFS}} and {{Index}} in {{toString}}, but it shouldn't be on any hot
path AFAICT. 

Thanks, I've pushed new commits to address these points.

> Support custom query expressions in SELECT
> ------------------------------------------
>                 Key: CASSANDRA-10217
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0.0 rc1
> (Broken out of CASSANDRA-10124)
> Custom index implementations often support query expressions which do not fit the structure
of CQL. To support these, it has been necessary to add a fake column to the base table and
query that using the custom syntax. Taking an example from the [Stratio docs|]:
> {code}
> SELECT * FROM tweets WHERE lucene='{
>     filter : {type:"range", field:"time", lower:"2014/04/25", upper:"2014/05/1"},
>     query  : {type:"phrase", field:"body", value:"big data gives organizations", slop:1}
> }' 
> {code}
> The {{lucene}} field is a dummy column that has to be added to the table in order to
associate the pre-3.0 row-based index with the {{tweets}} table. We could rewrite this query as:
> {code}
> SELECT * FROM tweets
> WHERE expr(lucene, '{
>     filter : {type:"range", field:"time", lower:"2014/04/25", upper:"2014/05/1"},
>     query  : {type:"phrase", field:"body", value:"big data gives organizations", slop:1}
> }')
> {code}
> In this version the {{expr}} function takes 2 arguments: the first is the name of the
index being targeted ({{lucene}}), and the second is the query string itself.
> Parsing and validation of those expressions would be delegated to the custom index implementations
which support them. 
> One thing to consider is index selection. If a query contains custom expressions, but
the target index is not selected, C* has no way to use the custom expressions as a post-query
filter, like it does with standard expressions and {{ALLOW FILTERING}}. To compensate for
that, index selection should be weighted in favour of indexes targeted by custom expressions.
At least in the first instance, we should also restrict queries to targeting a single index
via custom expressions, i.e. disallow queries like {{SELECT * FROM t WHERE expr(index1, 'foo')
AND expr(index2, 'bar')}}
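
The restriction above could be enforced with a check along these lines. This is only a sketch: {{CustomExpressionLimitSketch}} is a hypothetical name, and the query is reduced to a list holding the index name targeted by each {{expr(...)}} clause.

```java
import java.util.List;

// Hypothetical validation helper; one list entry per expr(...) clause in the query.
final class CustomExpressionLimitSketch {
    // Returns the single targeted index name, or null if the query has no custom expression.
    static String singleTargetIndex(List<String> customExpressionTargets) {
        if (customExpressionTargets.size() > 1) {
            throw new IllegalArgumentException(
                "Only a single custom index expression is allowed per query");
        }
        return customExpressionTargets.isEmpty() ? null : customExpressionTargets.get(0);
    }
}
```

With this shape, the example query targeting both {{index1}} and {{index2}} would be rejected at validation time, before index selection runs.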

