cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-10786) Include hash of result set metadata in prepared statement id
Date Fri, 03 Jun 2016 13:56:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314138#comment-15314138 ]

Alex Petrov edited comment on CASSANDRA-10786 at 6/3/16 1:56 PM:
-----------------------------------------------------------------

This feature might actually be much more useful than it initially appeared to me, although
it only occurred to me during the implementation. For example, when the table gets altered
and the statements get re-prepared under the hood, unaffected queries will not need to
go through the long re-prepare path: clients will be able to submit their queries and, if the
metadata doesn't indicate any change, just get the results.
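
As a rough illustration of that fast path (a sketch, not code from the patch): the names
{{metadata_hash}}, {{RowsResponse}}, {{server_execute}} and {{ClientStatement}} are hypothetical,
and it assumes the client sends along the metadata hash it last saw so the server can decide
whether to set the new flag and re-send metadata.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


def metadata_hash(columns):
    # Hypothetical digest over the result column names only.
    return hashlib.md5(",".join(columns).encode("utf-8")).hexdigest()


@dataclass
class RowsResponse:
    rows: list
    metadata_changed: bool = False                   # stands in for the new flag
    new_columns: Optional[Tuple[str, ...]] = None    # re-sent metadata, if any


def server_execute(current_columns, client_hash, rows):
    # Assumption: the server compares the client's hash with its current one.
    if client_hash == metadata_hash(current_columns):
        return RowsResponse(rows)                    # unchanged: results only
    return RowsResponse(rows, True, tuple(current_columns))


class ClientStatement:
    def __init__(self, columns):
        self.columns = tuple(columns)

    def handle(self, response):
        # Refresh local metadata only when the server says it changed.
        if response.metadata_changed:
            self.columns = response.new_columns
        return [dict(zip(self.columns, row)) for row in response.rows]


stmt = ClientStatement(("b", "c"))
unchanged = server_execute(("b", "c"), metadata_hash(stmt.columns), [(1, 2)])
decoded_unchanged = stmt.handle(unchanged)
changed = server_execute(("a", "b", "c"), metadata_hash(stmt.columns), [(0, 1, 2)])
decoded_changed = stmt.handle(changed)
```

In this sketch an unaffected statement never re-prepares; it only pays for a metadata refresh
when the flag is set.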

I'm mostly done with the patch for both server and client; I'll think about possible corner
cases I didn't consider and submit it on Monday if all goes right.

I just have a couple of questions on the {{v4}} vs {{v5}} protocol change. The Cassandra server
would work with the older version ({{v4}}) as well as with {{v5}}. We're just using a different
invalidation strategy and have very small changes on the protocol side (a new flag and re-sent
metadata). The flag would still be sent (of course, changing that is trivial); I'm just wondering
if we usually do so, since flags do not have any overlap there. Same question on the driver
side: do we support "all" versions simultaneously? Is there any test matrix for that?
 
One more thing: in the current implementation I have omitted the prepared statement id in
the {{Rows}} response. I've checked both the Java and Python drivers, and in both cases these IDs
are directly available there, since the metadata might get skipped and the driver needs a way
to look it up, so it's more or less always around. I think we can safely skip it there.



> Include hash of result set metadata in prepared statement id
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-10786
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10786
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>            Reporter: Olivier Michallat
>            Assignee: Alex Petrov
>            Priority: Minor
>              Labels: client-impacting, doc-impacting, protocolv5
>             Fix For: 3.x
>
>
> This is a follow-up to CASSANDRA-7910, which was about invalidating a prepared statement
> when the table is altered, to force clients to update their local copy of the metadata.
> There's still an issue if multiple clients are connected to the same host. The first
> client to execute the query after the cache was invalidated will receive an UNPREPARED response,
> re-prepare, and update its local metadata. But other clients might miss it entirely (the MD5
> hasn't changed), and they will keep using their old metadata. For example:
> # {{SELECT * ...}} statement is prepared in Cassandra with md5 abc123, clientA and clientB
> both have a cache of the metadata (columns b and c) locally
> # column a gets added to the table, C* invalidates its cache entry
> # clientA sends an EXECUTE request for md5 abc123, gets UNPREPARED response, re-prepares
> on the fly and updates its local metadata to (a, b, c)
> # prepared statement is now in C*’s cache again, with the same md5 abc123
> # clientB sends an EXECUTE request for id abc123. Because the cache has been populated
> again, the query succeeds. But clientB still has not updated its metadata, it’s still (b,c)
> One solution that was suggested is to include a hash of the result set metadata in the
> md5. This way the md5 would change at step 3, and any client using the old md5 would get an
> UNPREPARED, regardless of whether another client already reprepared.
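
The five steps in the description can be replayed with a small simulation. This is only a
sketch under stated assumptions: {{statement_id}}, {{Server}}, {{Client}} and {{run_scenario}}
are hypothetical names, the id is modelled as an MD5 over the query text (optionally mixed
with the result column names), and none of it reflects the actual Cassandra code.

```python
import hashlib


def statement_id(query, result_columns=None):
    # Hypothetical id: MD5 of the query text, optionally mixed with column names.
    h = hashlib.md5(query.encode("utf-8"))
    if result_columns is not None:
        for col in result_columns:
            h.update(b"\x00" + col.encode("utf-8"))
    return h.hexdigest()


class Server:
    def __init__(self, columns, hash_metadata):
        self.columns = list(columns)
        self.hash_metadata = hash_metadata
        self.cache = {}  # statement id -> query text

    def prepare(self, query):
        sid = statement_id(query, self.columns if self.hash_metadata else None)
        self.cache[sid] = query
        return sid, tuple(self.columns)

    def add_column(self, name):
        # Inserted first only to mirror the (a, b, c) ordering of the example.
        self.columns.insert(0, name)
        self.cache.clear()  # step 2: altering the table invalidates the cache

    def execute(self, sid):
        return "ROWS" if sid in self.cache else "UNPREPARED"


class Client:
    def __init__(self, server, query):
        self.server, self.query = server, query
        self.sid, self.metadata = server.prepare(query)

    def execute(self):
        # On UNPREPARED, re-prepare on the fly and refresh local metadata.
        if self.server.execute(self.sid) == "UNPREPARED":
            self.sid, self.metadata = self.server.prepare(self.query)
        return self.metadata


def run_scenario(hash_metadata):
    server = Server(["b", "c"], hash_metadata)
    client_a = Client(server, "SELECT * FROM t")
    client_b = Client(server, "SELECT * FROM t")
    server.add_column("a")
    return client_a.execute(), client_b.execute()


stale = run_scenario(hash_metadata=False)  # id ignores metadata
fixed = run_scenario(hash_metadata=True)   # id includes metadata hash
```

With {{hash_metadata=False}} clientB keeps its (b, c) metadata, as in step 5; with
{{hash_metadata=True}} its old id no longer matches anything in the cache, so it re-prepares too.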



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
