cassandra-commits mailing list archives

From "Oded Peer (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements
Date Mon, 13 Apr 2015 09:18:13 GMT


Oded Peer updated CASSANDRA-7304:
    Attachment: 7304-06.patch

Added a new patch 7304-06.patch

bq. There are a few use cases that are not tested and that do not seem to work properly:
bq. List marker: SELECT * FROM %s WHERE k IN ? or SELECT * FROM %s WHERE k = ? AND i IN ?

bq. Tuple marker: SELECT * FROM %s WHERE k = ? and (i, j) = ?

bq. Collection marker containing an unset value. For example: INSERT INTO %s (k, m) VALUES
(10, ?) where the value for the map m will be map("k", unset())
The value of a map entry can be an unset value only in tests and internal code; it can't
happen from client code.
"Unset" values apply only to bind variables in a CQL query. A CQL client cannot create
an "unset" ByteBuffer as a map value, since a map value is not a bound value.

bq. Queries with CONTAINS or CONTAINS KEY conditions
Added tests.

bq. There are also a few use cases that are not tested and that I have not tried:
bq. Secondary index queries on collection key or value with unset values
Added tests.

bq. UPDATE or DELETE queries with unset values in the WHERE clause
Added tests.

bq. Nested tuple with unset values

bq. It looks like you missed the following remark from Sylvain:
bq. In ModificationStatement.executeInternal, the body of the for loop should just be replaced
by mutation.apply().
I didn't replace the for loop body since the {{apply()}} method is not in the {{IMutation}}
interface. The {{apply()}} method signature differs between {{Mutation}} and {{CounterMutation}}:
one is {{void}} while the other returns a {{Mutation}} instance.
I chose to leave it as-is.
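
The signature mismatch described above can be sketched in isolation. This is only an illustration: {{IMutationSketch}}, {{MutationSketch}} and {{CounterMutationSketch}} are hypothetical stand-ins, not the Cassandra classes, and the loop is not the actual body of {{ModificationStatement.executeInternal}}. It shows why a plain {{mutation.apply()}} call on the interface type does not compile when the interface has no {{apply()}} and the two implementations return different types.

```java
import java.util.List;

public class ApplySignatureSketch {
    // Hypothetical stand-in for IMutation: it declares no apply() method.
    public interface IMutationSketch {
        String key();
    }

    // Hypothetical stand-in for Mutation: apply() returns void.
    public static final class MutationSketch implements IMutationSketch {
        private final String key;
        private final List<String> log;

        public MutationSketch(String key, List<String> log) {
            this.key = key;
            this.log = log;
        }

        public String key() { return key; }

        public void apply() { log.add("applied " + key); }
    }

    // Hypothetical stand-in for CounterMutation: apply() returns a MutationSketch.
    public static final class CounterMutationSketch implements IMutationSketch {
        private final String key;
        private final List<String> log;

        public CounterMutationSketch(String key, List<String> log) {
            this.key = key;
            this.log = log;
        }

        public String key() { return key; }

        public MutationSketch apply() {
            log.add("applied counter " + key);
            return new MutationSketch(key, log);
        }
    }

    // The loop has to dispatch on the concrete type, because
    // "m.apply()" on IMutationSketch would not compile.
    public static void applyAll(List<? extends IMutationSketch> mutations) {
        for (IMutationSketch m : mutations) {
            if (m instanceof MutationSketch) {
                ((MutationSketch) m).apply();
            } else if (m instanceof CounterMutationSketch) {
                ((CounterMutationSketch) m).apply();
            } else {
                throw new IllegalStateException("unknown mutation type for key " + m.key());
            }
        }
    }
}
```

Declaring a single {{apply()}} on the interface would require both implementations to agree on a return type, which is the change the comment above declines to make.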

bq. I do not understand your change in FunctionCall. We cannot know if some function can accept
null or not as somebody can create a UDF for which null is a valid input. For unset value,
we need to block them in FunctionCall as the existing functions will break otherwise.
Functions do not accept bind variables as input, only column identifiers, and a column
value cannot be an unset value.
I added a comment to {{FunctionCall}} stating why there is no need to check for unset variables
in functions.

bq. The error messages for tuples and UDTs do not provide enough information if you have several
of them in the query (e.g.
SELECT * FROM myTable WHERE a = 0 AND (b, c) = (?, ?) AND (d, e) > (?, ?)).
I am not sure how we could provide a better message but you might be able to find a way? At
least for UDT we should provide the type name in the error message.
I added more information about the bind marker position to the error message, and added a test.

bq. In Sets.Adder.doAdd, Lists.Appender.doAppend and Maps.Appender.doAppend there are some
unused variables.
bq. In Sets, Lists, Maps and Constants there are several places where you use an unnecessary
else. The if branch ends either with a return or by throwing an Exception.
I think it's a matter of taste. I changed it and removed the unnecessary else.

bq. I would be in favor of putting the tests in the corresponding unit tests rather than in a
new one. For example, I would put the collection tests in CollectionsTest. I believe that it
will help people if all the tests for collections, for example, are together. It can serve
as a form of documentation.
Done. I moved the tests to CollectionsTest, TupleTypeTest and UserTypesTest.

bq. You should add the feature to the News.txt

bq. There are still a lot of whitespaces in your patch.

> Ability to distinguish between NULL and UNSET values in Prepared Statements
> ---------------------------------------------------------------------------
>                 Key: CASSANDRA-7304
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Drew Kutcharian
>            Assignee: Oded Peer
>              Labels: cql, protocolv4
>             Fix For: 3.0
>         Attachments: 7304-03.patch, 7304-04.patch, 7304-05.patch, 7304-06.patch, 7304-2.patch,
> Currently Cassandra inserts tombstones when a value of a column is bound to NULL in a
prepared statement. At higher insert rates managing all these tombstones becomes an unnecessary
overhead. This limits the usefulness of the prepared statements since developers have to either
create multiple prepared statements (each with a different combination of column names, which
at times is just unfeasible because of the sheer number of possible combinations) or fall
back to using regular (non-prepared) statements.
> This JIRA is here to explore the possibility of either:
> A. Have a flag on prepared statements that once set, tells Cassandra to ignore null columns
> or
> B. Have an "UNSET" value which makes Cassandra skip the null columns and not tombstone
> Basically, in the context of a prepared statement, a null value means delete, but we
don’t have anything that means "ignore" (besides creating a new prepared statement without
the ignored column).
> Please refer to the original conversation on DataStax Java Driver mailing list for more
> *EDIT 18/12/14 - [~odpeer] Implementation Notes:*
> The motivation hasn't changed.
> Protocol version 4 specifies that bind variables do not require having a value when executing
a statement. Bind variables without a value are called 'unset'. The 'unset' bind variable
is serialized as the int value '-2' without following bytes.
> \\
> \\
> * An unset bind variable in an EXECUTE or BATCH request
> ** On a {{value}} does not modify the value and does not create a tombstone
> ** On the {{ttl}} clause is treated as 'unlimited'
> ** On the {{timestamp}} clause is treated as 'now'
> ** On a map key or a list index throws {{InvalidRequestException}}
> ** On a {{counter}} increment or decrement operation does not change the counter value,
e.g. {{UPDATE my_tab SET c = c - ? WHERE k = 1}} does not change the value of counter {{c}}
> ** On a tuple field or UDT field throws {{InvalidRequestException}}
> * An unset bind variable in a QUERY request
> ** On a partition column, clustering column or index column in the {{WHERE}} clause throws
> ** On the {{limit}} clause is treated as 'unlimited'
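
The wire encoding and the write semantics in the notes above can be illustrated with a minimal, self-contained sketch. The class {{UnsetValueSketch}}, the {{UNSET}} sentinel and the {{write}} helper are hypothetical names chosen for this example and are not Cassandra or driver APIs. The sketch assumes the usual length-prefixed value layout (a 4-byte int length followed by that many bytes), with {{-1}} meaning null and {{-2}} meaning unset as stated in the notes, and models a row as a map of live cells plus a set of tombstoned column names.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.Set;

public class UnsetValueSketch {
    static final int NULL_LENGTH = -1;   // value is null, no bytes follow
    static final int UNSET_LENGTH = -2;  // value is unset, no bytes follow

    // Hypothetical sentinel for "unset"; it is only ever compared by identity.
    public static final ByteBuffer UNSET = ByteBuffer.allocate(0);

    // Encodes one bound value as an int length followed by the value bytes.
    public static ByteBuffer encode(ByteBuffer value) {
        if (value == UNSET) return lengthOnly(UNSET_LENGTH);
        if (value == null) return lengthOnly(NULL_LENGTH);
        ByteBuffer out = ByteBuffer.allocate(4 + value.remaining());
        out.putInt(value.remaining());
        out.put(value.duplicate()); // duplicate() leaves the caller's position untouched
        out.flip();
        return out;
    }

    // Decodes one bound value, advancing the input buffer past it.
    public static ByteBuffer decode(ByteBuffer in) {
        int length = in.getInt();
        if (length == UNSET_LENGTH) return UNSET;
        if (length == NULL_LENGTH) return null;
        if (length < 0 || length > in.remaining())
            throw new IllegalArgumentException("invalid value length: " + length);
        ByteBuffer value = in.slice();
        value.limit(length);
        in.position(in.position() + length);
        return value;
    }

    private static ByteBuffer lengthOnly(int length) {
        ByteBuffer out = ByteBuffer.allocate(4);
        out.putInt(length);
        out.flip();
        return out;
    }

    // Toy model of writing a column: unset leaves the column untouched,
    // null deletes it and records a tombstone, anything else overwrites it.
    public static void write(Map<String, ByteBuffer> cells, Set<String> tombstones,
                             String column, ByteBuffer value) {
        if (value == UNSET) return;
        if (value == null) {
            cells.remove(column);
            tombstones.add(column);
            return;
        }
        tombstones.remove(column);
        cells.put(column, value);
    }
}
```

In this model, binding {{UNSET}} to a column that already holds a value leaves both the value and the tombstone set unchanged, whereas binding null removes the value and adds a tombstone, which is the distinction the issue asks for.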

This message was sent by Atlassian JIRA
