cassandra-commits mailing list archives

From "Constance Eustace (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-9886) TIMESTAMP - allow USING TIMESTAMP at end of mutation CQL
Date Mon, 27 Jul 2015 16:33:05 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Constance Eustace updated CASSANDRA-9886:
-----------------------------------------
    Description: 
I was doing performance testing to move our persistence engine off of batches and onto an
"async spray" of individual statements with timestamps.

First of all, it seems fairly insane that the USING TIMESTAMP clause is in a different location
in each mutation statement: at the end for INSERT (after VALUES), before SET for UPDATE, and
before WHERE for DELETE. For UPDATE and DELETE it is thus in the middle of the statement for no
apparent good reason, although maybe there is some PostgreSQL compatibility.

This means that some code may produce a large list of statements without the USING TIMESTAMP
clause already in them, because the actual method of executing the list, which may use
batches (if we were grouping by partition key) or not (single statement), may be determined
later.

Then, for single-statement updates, the code has to place the USING TIMESTAMP clause in the
right spot for each statement type. It would be MUCH EASIER to simply append
"USING TIMESTAMP xxx" to the end of the CQL statement.
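
To make the placement problem concrete, here is a minimal Python sketch of what client code
has to do today. The helper name with_timestamp is hypothetical, and the keyword matching is
deliberately naive (it assumes a single well-formed statement with no existing USING clause
and no SET or WHERE keyword hidden inside a string literal). It is an illustration under
those assumptions, not a CQL parser.

```python
import re


def with_timestamp(cql: str, ts_micros: int) -> str:
    """Return the statement with USING TIMESTAMP placed where CQL expects it.

    Hypothetical helper for illustration only; the matching is naive.
    """
    stmt = cql.strip().rstrip(";").strip()
    if not stmt:
        raise ValueError("empty statement")
    clause = f"USING TIMESTAMP {ts_micros}"
    keyword = stmt.split(None, 1)[0].upper()

    if keyword == "INSERT":
        # INSERT: the clause goes at the end, after the VALUES list.
        return f"{stmt} {clause};"

    if keyword == "UPDATE":
        # UPDATE: the clause goes between the table name and SET.
        pattern, marker = r"\s+SET\s+", "SET"
    elif keyword == "DELETE":
        # DELETE: the clause goes between the table name and WHERE.
        pattern, marker = r"\s+WHERE\s+", "WHERE"
    else:
        raise ValueError(f"not a mutation statement: {keyword}")

    # Replace only the first occurrence of the keyword.
    result, count = re.subn(
        pattern, f" {clause} {marker} ", stmt, count=1, flags=re.IGNORECASE
    )
    if count == 0:
        raise ValueError(f"no {marker} keyword found in statement")
    return f"{result};"


print(with_timestamp("INSERT INTO ks.t (id, v) VALUES (1, 'a')", 42))
print(with_timestamp("UPDATE ks.t SET v = 'a' WHERE id = 1", 42))
print(with_timestamp("DELETE FROM ks.t WHERE id = 1", 42))
```

With the change requested here, all three branches would collapse into the INSERT case: strip
the trailing semicolon and append the clause.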

BATCH is easier, you just wrap the statements. Pretty basic.
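
For comparison, a sketch of the batch wrapping in Python. The function name wrap_in_batch is
hypothetical; it assumes the statements carry no USING TIMESTAMP of their own and produces a
default (logged) batch.

```python
from typing import Iterable


def wrap_in_batch(statements: Iterable[str], ts_micros: int) -> str:
    """Wrap plain mutation statements in one BATCH with a shared timestamp.

    Hypothetical helper for illustration only.
    """
    # Normalize each statement to end with exactly one semicolon.
    body = "\n".join(f"  {s.strip().rstrip(';')};" for s in statements)
    return f"BEGIN BATCH USING TIMESTAMP {ts_micros}\n{body}\nAPPLY BATCH;"


print(wrap_in_batch(
    ["INSERT INTO ks.t (id) VALUES (1)", "UPDATE ks.t SET v = 2 WHERE id = 1"],
    42,
))
```

No statement needs to be inspected or rewritten, which is why the batch form is the easy one
to generate.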

I have done performance testing with single-statement BATCH USING TIMESTAMP and the performance
is awful, worse than "NEVER EVER DO THIS" sync batches with cross-partition updates.

Can we either allow USING TIMESTAMP at the end of every mutation statement, so that it is in
the same place for all of them, or have the BATCH statement processing check whether the batch
contains a single statement and, if so, reduce it to non-batch execution?


> TIMESTAMP - allow USING TIMESTAMP at end of mutation CQL 
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-9886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9886
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Constance Eustace
>             Fix For: 2.1.x
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
