cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10786) Include hash of result set metadata in prepared statement id
Date Fri, 03 Jun 2016 21:14:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314827#comment-15314827 ]

Tyler Hobbs commented on CASSANDRA-10786:
-----------------------------------------

bq. For example, when the table gets altered and the statements get re-prepared under the
hood, all unaffected queries will not need to go through the long re-prepare path, they'll
be able to submit their queries, and if metadata doesn't indicate any change, it'll just get
the results.

That's a good point.  Unfortunately, server-side we invalidate all related prepared statements
whenever a schema change occurs, so there will still be some amount of full repreparation
until the v5 protocol is the minimum supported protocol version and we can change this server-side
behavior.  Of course, the primary motivation for this ticket was to handle clients that use
the prepared statement after it's been re-prepared (if the statement ID hasn't changed), and
in this case that should work out nicely.  To summarize, the first client to use a prepared
statement after a schema change will still have to reprepare, but later clients will not.

bq.  Same question on the driver side: do we support "all" versions simultaneously?

Server-side, we need to support v3 and up until Cassandra 4.0.  This is because Cassandra
2.1 supports at most v3, and we support direct upgrades from 2.1 to 3.x.  So, all of v3, v4,
and v5 need to be supported simultaneously.  This means we will need to change the query preparation
behavior depending on what protocol version the connection is using.

Drivers, on the other hand, are free to support whatever protocol versions they want. Most will
probably also support v3 through v5, but some drivers may lag behind and only support v3 or
v3 and v4 for a while.
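
To make the version-dependent behavior concrete, here is a minimal Python sketch. It is not
Cassandra's implementation: {{negotiate}} and {{prepared_id}} are hypothetical names, picking the
highest common version is a simplification of the real negotiation, and the rule that only v5
connections get a metadata-aware id is an assumption made for illustration.

```python
import hashlib

# Per the comment above: the server must accept v3, v4 and v5 at the same time.
SERVER_VERSIONS = {3, 4, 5}


def negotiate(driver_versions):
    """Pick the highest protocol version both sides support (simplified)."""
    common = SERVER_VERSIONS & set(driver_versions)
    if not common:
        raise ValueError("no common protocol version")
    return max(common)


def prepared_id(query, result_columns, protocol_version):
    """Hypothetical id scheme that depends on the connection's protocol version.

    Assumption: v3/v4 connections keep the query-only MD5, while v5
    connections also mix the result set column names into the hash.
    """
    digest = hashlib.md5(query.encode("utf-8"))
    if protocol_version >= 5:
        digest.update(",".join(result_columns).encode("utf-8"))
    return digest.hexdigest()
```

In this sketch, a driver that only speaks v3 negotiates v3 and keeps seeing the query-only id,
while a driver that supports v3 through v5 negotiates v5 and sees an id that changes when the
result columns change.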

bq. Is there any test matrix for that?

The drivers generally implement parts of that test matrix.  On the Cassandra side, we sometimes
do targeted testing of small parts of the native protocol, like in {{ProtocolErrorTest}}.

bq. I've checked both Java and Python driver, and in both cases these IDs are directly available
there, since the metadata might get skipped and driver needs a way to query it, so it's more
or less always around.

[~avalanche123] can you see if the other drivers would have any problem with omitting the
statement ID in the special response?

> Include hash of result set metadata in prepared statement id
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-10786
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10786
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>            Reporter: Olivier Michallat
>            Assignee: Alex Petrov
>            Priority: Minor
>              Labels: client-impacting, doc-impacting, protocolv5
>             Fix For: 3.x
>
>
> This is a follow-up to CASSANDRA-7910, which was about invalidating a prepared statement
> when the table is altered, to force clients to update their local copy of the metadata.
> There's still an issue if multiple clients are connected to the same host. The first
> client to execute the query after the cache was invalidated will receive an UNPREPARED response,
> re-prepare, and update its local metadata. But other clients might miss it entirely (the MD5
> hasn't changed), and they will keep using their old metadata. For example:
> # {{SELECT * ...}} statement is prepared in Cassandra with md5 abc123, clientA and clientB
>   both have a cache of the metadata (columns b and c) locally
> # column a gets added to the table, C* invalidates its cache entry
> # clientA sends an EXECUTE request for md5 abc123, gets UNPREPARED response, re-prepares
>   on the fly and updates its local metadata to (a, b, c)
> # prepared statement is now in C*’s cache again, with the same md5 abc123
> # clientB sends an EXECUTE request for id abc123. Because the cache has been populated
>   again, the query succeeds. But clientB still has not updated its metadata, it’s still (b,c)
> One solution that was suggested is to include a hash of the result set metadata in the
> md5. This way the md5 would change at step 3, and any client using the old md5 would get an
> UNPREPARED, regardless of whether another client already reprepared.
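
The five steps above, and the effect of the suggested fix, can be replayed with a small toy
model in Python. This is only a sketch: {{ToyServer}}, {{statement_id}} and {{run_scenario}} are
hypothetical names, the "metadata" is just a list of column names, and none of it mirrors
Cassandra's actual classes or wire format.

```python
import hashlib


def statement_id(query, columns, include_metadata):
    """Hex MD5 of the query, optionally mixed with the result column names."""
    digest = hashlib.md5(query.encode("utf-8"))
    if include_metadata:
        digest.update(",".join(columns).encode("utf-8"))
    return digest.hexdigest()


class ToyServer:
    """Toy stand-in for a node's prepared statement cache (illustrative only)."""

    def __init__(self, include_metadata):
        self.include_metadata = include_metadata
        self.columns = ["b", "c"]
        self.cache = {}  # statement id -> query string

    def prepare(self, query):
        sid = statement_id(query, self.columns, self.include_metadata)
        self.cache[sid] = query
        return sid, list(self.columns)

    def add_column(self, name):
        # A schema change invalidates every cached statement.
        self.columns = sorted(self.columns + [name])
        self.cache.clear()

    def execute(self, sid):
        return "ROWS" if sid in self.cache else "UNPREPARED"


def run_scenario(include_metadata):
    """Replay steps 1-5 and return (clientB's response, clientB's metadata)."""
    query = "SELECT * FROM t"
    server = ToyServer(include_metadata)
    sid, meta = server.prepare(query)               # step 1
    client_a = {"id": sid, "meta": list(meta)}
    client_b = {"id": sid, "meta": list(meta)}
    server.add_column("a")                          # step 2
    if server.execute(client_a["id"]) == "UNPREPARED":
        # steps 3-4: clientA re-prepares and refreshes its metadata
        client_a["id"], client_a["meta"] = server.prepare(query)
    return server.execute(client_b["id"]), client_b["meta"]  # step 5
```

With the query-only id ({{run_scenario(False)}}) clientB's EXECUTE succeeds while its metadata is
still (b, c). With the metadata mixed into the id ({{run_scenario(True)}}) clientB's old id is no
longer in the cache, so it gets UNPREPARED and has to re-prepare.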



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
