Return-Path: X-Original-To: apmail-cassandra-commits-archive@www.apache.org Delivered-To: apmail-cassandra-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 5C02CE3B9 for ; Thu, 7 Feb 2013 22:35:13 +0000 (UTC) Received: (qmail 49847 invoked by uid 500); 7 Feb 2013 22:35:13 -0000 Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org Received: (qmail 49760 invoked by uid 500); 7 Feb 2013 22:35:13 -0000 Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list commits@cassandra.apache.org Received: (qmail 49669 invoked by uid 99); 7 Feb 2013 22:35:13 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 07 Feb 2013 22:35:13 +0000 Date: Thu, 7 Feb 2013 22:35:13 +0000 (UTC) From: "Aleksey Yeschenko (JIRA)" To: commits@cassandra.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (CASSANDRA-5226) CQL3 refactor to allow conversion function MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CASSANDRA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574000#comment-13574000 ] Aleksey Yeschenko commented on CASSANDRA-5226: ---------------------------------------------- Also, duplicate-literals-in-sets validation breaks a test or two - SystemTable.startCompaction() sometimes does inserts that are illegal now: {noformat} String req = "INSERT INTO system.%s (id, keyspace_name, columnfamily_name, inputs) VALUES (%s, '%s', '%s', {%s})"; ... processInternal(String.format(req, COMPACTION_LOG, compactionId, cfs.table.name, cfs.columnFamily, StringUtils.join(generations.iterator(), ','))); {noformat} But that's not this issue's concern. > CQL3 refactor to allow conversion function > ------------------------------------------ > > Key: CASSANDRA-5226 > URL: https://issues.apache.org/jira/browse/CASSANDRA-5226 > Project: Cassandra > Issue Type: Improvement > Reporter: Sylvain Lebresne > Assignee: Sylvain Lebresne > Fix For: 1.2.2 > > Attachments: 0001-Refactor-to-support-CQL3-functions.txt, 0002-Allow-functions-in-selection.txt, 0003-Add-bytes-conversion-functions.txt > > > In CASSANDRA-5198, we've fixed CQL3 type validation and talked about adding conversion functions to ease working with the different data types. However, the current CQL3 code makes it fairly hard to add such functions in a non-hacky way. In fact, we already support a few conversion functions (token, minTimeuuid, maxTimeuuid, now) but the way we support them is extremely ugly (the token function is competely special cased and min/maxTimeuuid are ugly hacks in TimeUUIDType that I'm really not proud of). > So I'm attaching a refactor that cleans that up, making it easy to add new conversion functions. Now, said refactor is a big one. While the goal is to make it easy to add functions, I think it also improve the code in the following ways: > * It much more clearly separate the phase of validating the query from executing it. And in particular, it moves more work in the preparing phase. Typically, the parsing of constants is now done in the preparing phase, not the execution one. It also groups validation code much more cleanly imo. > * It simplify UpdateStatement. The Operation business was not very clean and in particular the same operations where not handled by the same code depending on whether they were prepared or not, which was error prone. This is no longer the case. > * It somewhat simplify the parser. A few parsing rules were a bit too convoluted, trying to enforce invariants that are much more easily checked post parsing (and doing it post parsing often allow better error messages, the parser tends to send cryptic errors). > The first attached part is the initial refactor. It also adds some relatively generic code for adding conversion functions (it would typically not be very hard to allow user defined functions, though that's not part of the patch at all) and uses that to handle the existing token, minTimeuuid and maxTimeuuid functions. > It's also worth mentioning that this first patch introduces type casts. The main reason is that it allows multiple overloads of the same function. Typically, the minTimeuuid allows both strings arguments (for dates) or integer ones (for timestamp), but so when you have: > {noformat} > SELECT * FROM foo WHERE t > minTimeuuid(?); > {noformat} > then the code doesn't know which function to use. So it complains. But you can remove the ambiguity with > {noformat} > SELECT * FROM foo WHERE t > minTimeuuid((bigint)?); > {noformat} > The 2nd patch finishes what the first one started by extending this conversion functions support to select clauses. So after this 2nd patch you can do stuff like: > {noformat} > SELECT token(k), k FROM foo; > {noformat} > for instance. > The 3rd patch builds on that to actually add new conversion functions. Namely, for every existing CQL3 it adds a blobTo and a ToBlob function that convert from and to blobs. And so you can do (not that this example is particularly smart): > {noformat} > SELECT varintToBlob(v) FROM foo WHERE v > blobToVarint(bigintToBlob(3)); > {noformat} > Honestly this last patch is more for demonstration purpose and we can discuss this separately. In particular, we may want better different names for those functions. But at least it should highlight that adding new function is easy (this could be used to add methods to work with dates for instance). > Now, at least considering the 2 first patches, this is not a small amount of code but I would still suggest pushing this in 1.2 (the patches are against 1.2) for the following reasons: > * It fixes a few existing bugs (CASSANDRA-5198 has broke prepared statement for instance, which this patch fixes) and add missing validation in a few places (we are allowing sets literals like \{ 1, 1 \} for instance, which is kind of wrong as it suggests we support multisets). We could fix those separatly but honestly I'm not we won't miss some. > * We do have a fair amount of CQL dtests and i've check all pass. The refactor also cleans up some part of the code quite a bit imo. So overall I think I'm almost more confident in the code post-refactor than the current one. > * We're early in 1.2 and it's an improvement after all. It would be a bit sad to have to wait for 2.0 to get this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira