cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8831) Create a system table to expose prepared statements
Date Mon, 23 Feb 2015 02:13:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332486#comment-14332486
] 

Sylvain Lebresne commented on CASSANDRA-8831:
---------------------------------------------

bq. I implemented it as "persists changes to the table every minute".

I'd *really* prefer persisting statements when we prepare them (and removing them when they
get evicted from the cache) instead of having a scheduled task. Not doing so fails user expectations
for no good reason that I can see. In fact, since prepared statements are by design supposed
to be prepared very rarely, writing them all every minute is a waste of good CPU time. I'll
note that while we could do it asynchronously (but still only trigger it once, when the statement
is prepared), I really don't think it's worth adding complexity because statement preparation
is not terribly performance sensitive and besides, provided we don't force-flush the table
(which there is no reason to), the write will be pretty cheap (it's really just an almost
surely uncontended write to the memtable).
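
A minimal sketch of the write-on-prepare, delete-on-evict approach described above. Everything here is a hypothetical stand-in: {{PreparedStatementsTable}} is an in-memory map playing the role of the system table, and {{WriteThroughPreparedCache}} is a small LRU cache, not Cassandra's actual prepared statement cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the system table (an in-memory map, not a real table).
final class PreparedStatementsTable {
    final Map<String, String> rows = new ConcurrentHashMap<>();

    void insert(String preparedId, String queryString) { rows.put(preparedId, queryString); }

    void delete(String preparedId) { rows.remove(preparedId); }
}

// Hypothetical bounded LRU cache that mirrors each change to the table when it happens,
// instead of relying on a scheduled task.
final class WriteThroughPreparedCache {
    private final PreparedStatementsTable table;
    private final LinkedHashMap<String, String> cache;

    WriteThroughPreparedCache(final int capacity, final PreparedStatementsTable table) {
        this.table = table;
        // accessOrder = true makes the map evict the least recently used entry first.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > capacity) {
                    table.delete(eldest.getKey()); // plays the role of the eviction listener
                    return true;
                }
                return false;
            }
        };
    }

    // Writes the row once, at preparation time; preparing a known statement again writes nothing.
    synchronized void prepare(String preparedId, String queryString) {
        if (cache.put(preparedId, queryString) == null) {
            table.insert(preparedId, queryString);
        }
    }

    public static void main(String[] args) {
        PreparedStatementsTable table = new PreparedStatementsTable();
        WriteThroughPreparedCache cache = new WriteThroughPreparedCache(2, table);
        cache.prepare("a", "SELECT * FROM ks.t WHERE k = ?");
        cache.prepare("b", "INSERT INTO ks.t (k, v) VALUES (?, ?)");
        cache.prepare("c", "DELETE FROM ks.t WHERE k = ?");
        System.out.println(table.rows.size()); // the table never holds more rows than the cache
    }
}
```

With this shape the table matches the cache contents at all times, and the only cost is one write per preparation plus one delete per eviction.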

bq. Table has the layout you've proposed

Well, almost: your patch doesn't include the {{table_name}} in the partition key. And unless
you had a reason not to, I think it's a nice-to-have.

bq. except that I had to add the column {{use_keyspace_name}}

I don't think we need that: once we've parsed the statement, we know to which keyspace it applies
without ambiguity. Or to put it another way, the only time {{use_keyspace_name}} would differ
from {{keyspace}} is when the former *does not* influence the prepared statement, and it's thus
rather useless.

Other minor remarks on the patch:
* the code to actually query the table should go into SystemKeyspace for symmetry with all the
other code that queries a system table.
* let's maybe add {{keyspace()}} and {{table()}} methods (which can return null) to {{CQLStatement}}
rather than doing ugly instanceof tests.
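
A rough sketch of what the second remark could look like. The {{CQLStatement}} interface below is a simplified stand-in, not Cassandra's actual interface, and {{ExampleSelectStatement}}, {{ExampleBatchStatement}} and {{StatementTargets}} are hypothetical names used only for illustration.

```java
// Simplified stand-in for CQLStatement; only the two proposed accessors are shown.
interface CQLStatement {
    String keyspace(); // null when the statement is not tied to a single keyspace
    String table();    // null when the statement is not tied to a single table
}

// Hypothetical statement bound to exactly one table.
final class ExampleSelectStatement implements CQLStatement {
    private final String keyspace;
    private final String table;

    ExampleSelectStatement(String keyspace, String table) {
        this.keyspace = keyspace;
        this.table = table;
    }

    @Override public String keyspace() { return keyspace; }

    @Override public String table() { return table; }
}

// Hypothetical statement with no single target (assumed here: a batch spanning tables).
final class ExampleBatchStatement implements CQLStatement {
    @Override public String keyspace() { return null; }

    @Override public String table() { return null; }
}

final class StatementTargets {
    // Describes the target of any statement without a single instanceof test.
    static String describe(CQLStatement statement) {
        String ks = statement.keyspace() == null ? "<none>" : statement.keyspace();
        String tbl = statement.table() == null ? "<none>" : statement.table();
        return ks + "." + tbl;
    }

    public static void main(String[] args) {
        System.out.println(describe(new ExampleSelectStatement("ks1", "users")));
        System.out.println(describe(new ExampleBatchStatement()));
    }
}
```

The caller only depends on the interface, so a new statement type does not require touching the code that fills in {{keyspace_name}} and {{table_name}}.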

bq. Just making a note that this does have security implications

Yes, but given the table layout, whatever mechanism will work for CASSANDRA-8163 will work
here too (I even think that having a DESCRIBE permission applying to this is actually fair).
Meaning that I don't reject your remark, but in our current state I don't think being able
to list prepared statements is really a lot worse than being able to list tables (yes, there
could be scalar values in prepared statements, but I'm not sure this changes everything either) and
so I feel this should just be added to the list of things to handle in CASSANDRA-8163 but
not be a blocker per se.

> Create a system table to expose prepared statements
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8831
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8831
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Robert Stupp
>         Attachments: 8831-v1.txt
>
>
> Because drivers abstract from users the handling of up/down nodes, they have to deal
with the fact that when a node is restarted (or joins), it won't know any prepared statement.
Drivers could somewhat ignore that problem and wait for a query to return an error (that the
statement is unknown by the node) to re-prepare the query on that node, but it's relatively
inefficient because every time a node comes back up, you'll get bad latency spikes due to
some queries first failing, then being re-prepared and then only being executed. So instead,
drivers (at least the java driver but I believe others do as well) pro-actively re-prepare
statements when a node comes up. It solves the latency problem, but currently every driver
instance blindly re-prepares all statements, meaning that in a large cluster with many clients
there is a lot of duplication of work (it would be enough for a single client to prepare the
statements) and a bigger than necessary load on the node that started.
> An idea to solve this is to have a (cheap) way for clients to check if some statements
are prepared on the node. There are different options to provide that but what I'd suggest
is to add a system table to expose the (cached) prepared statements because:
> # it's reasonably straightforward to implement: we just add a line to the table when
a statement is prepared and remove it when it's evicted (we already have eviction listeners).
We'd also truncate the table on startup (but that's easy enough). We can even switch it to
a "virtual table" if/when CASSANDRA-7622 lands but it's trivial to do with a normal table
in the meantime.
> # it doesn't require a change to the protocol or something like that. It could even be
done in 2.1 if we wish to.
> # exposing prepared statements feels like genuinely useful information to have (outside
of the problem exposed here, that is), if only for debugging/educational purposes.
> The exposed table could look something like:
> {noformat}
> CREATE TABLE system.prepared_statements (
>    keyspace_name text,
>    table_name text,
>    prepared_id blob,
>    query_string text,
>    PRIMARY KEY (keyspace_name, table_name, prepared_id)
> )
> {noformat}
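
The client-side check described in the quoted issue could look roughly like the sketch below. It assumes the driver has already read the {{prepared_id}} values from {{system.prepared_statements}} on the node that came up; hex strings stand in for the blob IDs, and {{RePrepareCheck}} and {{missingOnNode}} are hypothetical names, not part of any driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RePrepareCheck {
    // Returns the query strings the driver knows about that the node does not report as prepared.
    // driverStatements maps prepared ID (hex) to query string; idsOnNode holds the IDs read from the table.
    static List<String> missingOnNode(Map<String, String> driverStatements, Set<String> idsOnNode) {
        List<String> toPrepare = new ArrayList<>();
        for (Map.Entry<String, String> entry : driverStatements.entrySet()) {
            if (!idsOnNode.contains(entry.getKey())) {
                toPrepare.add(entry.getValue());
            }
        }
        return toPrepare;
    }

    public static void main(String[] args) {
        Map<String, String> known = new LinkedHashMap<>();
        known.put("0a", "SELECT * FROM ks.users WHERE id = ?");
        known.put("0b", "INSERT INTO ks.users (id, name) VALUES (?, ?)");
        // Assumed to be the IDs returned by the restarted node.
        Set<String> onNode = new HashSet<>(Arrays.asList("0a"));
        for (String query : missingOnNode(known, onNode)) {
            System.out.println("re-prepare: " + query);
        }
    }
}
```

Only the statements missing from the node are re-prepared, so when one client has already done the work, the others find nothing left to do.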



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
