cassandra-commits mailing list archives

From "Robert Stupp (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8831) Create a system table to expose prepared statements
Date Sat, 21 Feb 2015 10:48:11 GMT


Robert Stupp commented on CASSANDRA-8831:

Fair enough. So if I understand it correctly:
- Newly prepared statements are written into {{system.prepared_statements}} and removed upon
some _certain condition_: Is that when the pstmt is evicted or the pstmt has timed out (e.g.
TTL'd)? I assume it won't stay in {{system.prepared_statements}} until table or ks is dropped
- When a node starts up, it prepares all statements in {{system.prepared_statements}}
- When a client detects a node _restart_, it basically performs {{select keyspace_name, table_name,
prepared_id from system.prepared_statements}} and checks for missing prepared statements.
- In the ideal case, there are no statements to re-prepare
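
The client-side check in the list above could be sketched roughly as follows. This is only an illustration in Python, not driver code: {{missing_statements}} and the client-side registry are hypothetical names, and the sketch assumes the rows of the {{select}} from {{system.prepared_statements}} have already been fetched.

```python
from typing import Dict, Iterable, List, Tuple

# (keyspace_name, table_name, prepared_id), as returned by the select above.
Row = Tuple[str, str, bytes]


def missing_statements(
    node_rows: Iterable[Row],
    client_statements: Dict[bytes, str],
) -> List[str]:
    """Return the query strings the client still has to re-prepare.

    client_statements is a hypothetical client-side registry that maps
    prepared_id -> query string for everything the client has prepared.
    """
    known_on_node = {prepared_id for _, _, prepared_id in node_rows}
    return [
        query
        for prepared_id, query in client_statements.items()
        if prepared_id not in known_on_node
    ]
```

In the ideal case the returned list is empty and the client sends no prepare requests at all.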

I'd like to propose to write that table asynchronously - i.e. synchronize the contents of
{{QueryProcessor.preparedStatements}} every N minutes so we don't add any overhead to CQL
processing at all. That way we wouldn't have to deal with TTL or _synchronization of table
with pstmt-eviction_ as described above.
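
A rough sketch of what one such periodic synchronization pass might compute, written in Python for brevity. {{plan_sync}} is a hypothetical helper, the cache is modelled as a plain mapping from prepared id to query string, and the actual reads and writes against the table are left out.

```python
from typing import Dict, List, Set, Tuple


def plan_sync(
    cache: Dict[bytes, str],
    table_ids: Set[bytes],
) -> Tuple[List[Tuple[bytes, str]], List[bytes]]:
    """Diff a snapshot of the in-memory cache against the table contents.

    Returns (rows_to_insert, ids_to_delete). Running this every N minutes
    keeps the table eventually consistent with the cache without touching
    the prepare/evict code paths.
    """
    # Statements that are cached but not yet recorded in the table.
    to_insert = sorted(
        (pid, query) for pid, query in cache.items() if pid not in table_ids
    )
    # Statements recorded in the table that have since left the cache.
    to_delete = sorted(table_ids.difference(cache))
    return to_insert, to_delete
```

The trade-off is that the table can lag behind the cache by up to one interval, so a client may occasionally re-prepare a statement the node already knows, or miss one that was evicted in the meantime.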

Follow-up: we could add some more intelligent logic that gives a newly bootstrapped node
a view of the prepared statements it will probably need - so clients would not have to hammer
a new node with prepare-statement calls (not to be included in this ticket).

I assume that we don't want to do that for {{QueryProcessor.thriftPreparedStatements}}.

(By _neighbors_ I meant neighbors in the _same DC_ - yes, depends on the node cfg, KS cfg
and app)

> Create a system table to expose prepared statements
> ---------------------------------------------------
>                 Key: CASSANDRA-8831
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
> Because drivers abstract from users the handling of up/down nodes, they have to deal
with the fact that when a node is restarted (or joins), it won't know any prepared statement.
Drivers could somewhat ignore that problem and wait for a query to return an error (that the
statement is unknown by the node) to re-prepare the query on that node, but it's relatively
inefficient because every time a node comes back up, you'll get bad latency spikes due to
some queries first failing, then being re-prepared and only then being executed. So instead,
drivers (at least the java driver but I believe others do as well) proactively re-prepare
statements when a node comes up. It solves the latency problem, but currently every driver
instance blindly re-prepares all statements, meaning that in a large cluster with many clients
there is a lot of duplication of work (it would be enough for a single client to prepare the
statements) and a bigger than necessary load on the node that started.
> An idea to solve this is to have a (cheap) way for clients to check if some statements
are prepared on the node. There are different options to provide that but what I'd suggest
is to add a system table to expose the (cached) prepared statements because:
> # it's reasonably straightforward to implement: we just add a line to the table when
a statement is prepared and remove it when it's evicted (we already have eviction listeners).
We'd also truncate the table on startup, but that's easy enough. We can even switch it to
a "virtual table" if/when CASSANDRA-7622 lands but it's trivial to do with a normal table
in the meantime.
> # it doesn't require a change to the protocol or something like that. It could even be
done in 2.1 if we wish to.
> # exposing prepared statements feels like genuinely useful information to have (outside
of the problem exposed here, that is), if only for debugging/educational purposes.
> The exposed table could look something like:
> {noformat}
> CREATE TABLE system.prepared_statements (
>    keyspace_name text,
>    table_name text,
>    prepared_id blob,
>    query_string text,
>    PRIMARY KEY (keyspace_name, table_name, prepared_id)
> )
> {noformat}
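
To make the bookkeeping from the description concrete (add a row on prepare, remove it on eviction, truncate on startup), here is a minimal sketch in Python. {{PreparedStatementsTable}} is a hypothetical in-memory stand-in for the proposed table, keyed like its primary key; it is not Cassandra code and does not use Cassandra's actual listener API.

```python
from typing import Dict, Tuple

# Mirrors PRIMARY KEY (keyspace_name, table_name, prepared_id).
Key = Tuple[str, str, bytes]


class PreparedStatementsTable:
    """In-memory stand-in for system.prepared_statements (hypothetical)."""

    def __init__(self) -> None:
        self.rows: Dict[Key, str] = {}

    def on_prepared(
        self, keyspace: str, table: str, prepared_id: bytes, query_string: str
    ) -> None:
        # Called when a statement is prepared: add (or overwrite) its row.
        self.rows[(keyspace, table, prepared_id)] = query_string

    def on_evicted(self, keyspace: str, table: str, prepared_id: bytes) -> None:
        # Called from an eviction listener: drop the row if it is present.
        self.rows.pop((keyspace, table, prepared_id), None)

    def truncate(self) -> None:
        # Called on node startup, when the statement cache is empty.
        self.rows.clear()
```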

This message was sent by Atlassian JIRA
