cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-5226) CQL3 refactor to allow conversion function
Date Wed, 06 Feb 2013 21:13:13 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jonathan Ellis updated CASSANDRA-5226:
--------------------------------------

    Reviewer: iamaleksey
    
> CQL3 refactor to allow conversion function
> ------------------------------------------
>
>                 Key: CASSANDRA-5226
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5226
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.2.2
>
>         Attachments: 0001-Refactor-to-support-CQL3-functions.txt, 0002-Allow-functions-in-selection.txt,
0003-Add-bytes-conversion-functions.txt
>
>
> In CASSANDRA-5198, we've fixed CQL3 type validation and talked about adding conversion
functions to ease working with the different data types. However, the current CQL3 code makes
it fairly hard to add such functions in a non-hacky way. In fact, we already support a few
conversion functions (token, minTimeuuid, maxTimeuuid, now) but the way we support them is
extremely ugly (the token function is competely special cased and min/maxTimeuuid are ugly
hacks in TimeUUIDType that I'm really not proud of).
> So I'm attaching a refactor that cleans that up, making it easy to add new conversion
functions. Now, said refactor is a big one. While the goal is to make it easy to add functions,
I think it also improve the code in the following ways:
> * It much more clearly separate the phase of validating the query from executing it.
And in particular, it moves more work in the preparing phase.  Typically, the parsing of constants
is now done in the preparing phase, not the execution one. It also groups validation code
much more cleanly imo.
> * It simplify UpdateStatement. The Operation business was not very clean and in particular
the same operations where not handled by the same code depending on whether they were prepared
or not, which was error prone. This is no longer the case.
> * It somewhat simplify the parser. A few parsing rules were a bit too convoluted, trying
to enforce invariants that are much more easily checked post parsing (and doing it post parsing
often allow better error messages, the parser tends to send cryptic errors).
> The first attached part is the initial refactor. It also adds some relatively generic
code for adding conversion functions (it would typically not be very hard to allow user defined
functions, though that's not part of the patch at all) and uses that to handle the existing
token, minTimeuuid and maxTimeuuid functions.
> It's also worth mentioning that this first patch introduces type casts. The main reason
is that it allows multiple overloads of the same function. Typically, the minTimeuuid allows
both strings arguments (for dates) or integer ones (for timestamp), but so when you have:
> {noformat}
> SELECT * FROM foo WHERE t > minTimeuuid(?);
> {noformat}
> then the code doesn't know which function to use. So it complains. But you can remove
the ambiguity with
> {noformat}
> SELECT * FROM foo WHERE t > minTimeuuid((bigint)?);
> {noformat}
> The 2nd patch finishes what the first one started by extending this conversion functions
support to select clauses. So after this 2nd patch you can do stuff like:
> {noformat}
> SELECT token(k), k FROM foo;
> {noformat}
> for instance.
> The 3rd patch builds on that to actually add new conversion functions. Namely, for every
existing CQL3 <type> it adds a blobTo<type> and a <type>ToBlob function
that convert from and to blobs. And so you can do (not that this example is particularly smart):
> {noformat}
> SELECT varintToBlob(v) FROM foo WHERE v > blobToVarint(bigintToBlob(3));
> {noformat}
> Honestly this last patch is more for demonstration purpose and we can discuss this separately.
In particular, we may want better different names for those functions. But at least it should
highlight that adding new function is easy (this could be used to add methods to work with
dates for instance).
> Now, at least considering the 2 first patches, this is not a small amount of code but
I would still suggest pushing this in 1.2 (the patches are against 1.2) for the following
reasons:
>  * It fixes a few existing bugs (CASSANDRA-5198 has broke prepared statement for instance,
which this patch fixes) and add missing validation in a few places (we are allowing sets literals
like \{ 1, 1 \} for instance, which is kind of wrong as it suggests we support multisets).
We could fix those separatly but honestly I'm not we won't miss some.
>  * We do have a fair amount of CQL dtests and i've check all pass. The refactor also
cleans up some part of the code quite a bit imo. So overall I think I'm almost more confident
in the code post-refactor than the current one.
>  * We're early in 1.2 and it's an improvement after all. It would be a bit sad to have
to wait for 2.0 to get this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message