cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-4835) Appending/Prepending items to list using BATCH
Date Wed, 24 Oct 2012 08:50:12 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4835:
----------------------------------------

    Attachment: 0002-Ensure-same-timestamp-in-batches.txt
                0001-Fix-prepends-within-same-millis.txt

Alright, this is in fact a legit bug in prepend and is not specific to batches (though it's
probably harder to reproduce without them). Basically, the logic in prepend that makes sure we
always generate decreasing keys, even within the same millisecond, was broken. It worked
within a single update, but was broken for successive updates in the same millisecond. Patch
attached to fix that.
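
To illustrate the property the prepend fix is after, here is a minimal Python sketch. It is not the code from the attached patch; the class name `DecreasingKeyGenerator` and the use of a plain negated-millisecond integer as the key are assumptions made purely for illustration. The point is only that the "last key handed out" must be remembered across updates, not just within one.

```python
import threading


class DecreasingKeyGenerator:
    """Hypothetical helper: hands out strictly decreasing integer keys.

    Keys are derived from the negated wall-clock millisecond, so later
    prepends sort before earlier ones. The last key is kept across calls,
    so two separate updates in the same millisecond still get distinct,
    decreasing keys.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._last = None  # last key handed out, shared by all updates

    def next_key(self, now_millis):
        with self._lock:
            candidate = -now_millis
            # Same millisecond (or a key already used): step below the last key.
            if self._last is not None and candidate >= self._last:
                candidate = self._last - 1
            self._last = candidate
            return candidate


gen = DecreasingKeyGenerator()
# Three prepends in the same millisecond, then one in the next millisecond.
keys = [gen.next_key(1000), gen.next_key(1000), gen.next_key(1000), gen.next_key(1001)]
print(keys)
```

A generator that reset its state at the start of every update would hand out the same key for two updates landing in the same millisecond, which is the kind of failure described above.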

That being said, I do think that people should be very careful in assuming that statements
in a batch are applied in order *even within the same row* because that's just not true in
general. Batch applies everything "at the same time".  So for instance:
{noformat}
BEGIN BATCH
  UPDATE user SET name = 'Goo' WHERE userid = 1;
  UPDATE user SET name = 'Foo' WHERE userid = 1;
APPLY BATCH
{noformat}
will always (that's not quite true currently, see below) end up setting 'Goo' as the name,
because under the reconciliation rules the biggest value wins when timestamps are equal.
Similarly,
{noformat}
BEGIN BATCH
  DELETE FROM user WHERE userid = 1;
  UPDATE user SET name = 'Foo' WHERE userid = 1;
APPLY BATCH
{noformat}
will always (again, see below) end up with the user deleted, because on timestamp ties the
tombstone wins.
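
The two tie-breaking rules above can be sketched in a few lines of Python. This is a simplified model, not Cassandra's actual reconciliation code: the `Cell` type and `reconcile` function are hypothetical names, a delete is modelled as a cell whose value is `None`, and "biggest value" is assumed here to mean a byte-wise comparison of the UTF-8 encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    timestamp: int
    value: Optional[str]  # None models a tombstone (assumption of this sketch)


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the surviving cell when two writes hit the same column."""
    # The higher timestamp wins outright.
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # On a timestamp tie, a tombstone beats any live value.
    if a.value is None:
        return a
    if b.value is None:
        return b
    # Both live with equal timestamps: the bigger value wins.
    return a if a.value.encode("utf-8") >= b.value.encode("utf-8") else b


# Same timestamp, two live values: statement order does not matter.
print(reconcile(Cell(1, "Goo"), Cell(1, "Foo")).value)
print(reconcile(Cell(1, "Foo"), Cell(1, "Goo")).value)
# Same timestamp, delete versus update: the delete survives.
print(reconcile(Cell(1, None), Cell(1, "Foo")).value)
```

Because the outcome depends only on timestamps and values, swapping the two statements in either batch gives the same result, which is why statement order inside a batch should not be relied upon.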

In other words, there was indeed a bug with prepend. Append/prepend do respect the order
in batches within the same partition key, because we happen to process the statements of a
batch in order and there is no good reason to do otherwise, but I don't think we should make
that a guarantee either (as in, it's true now, it could change tomorrow, it's an implementation
detail). So users shouldn't rely on it, and if the order is important, they should combine
the operations into one statement.

Now, it is unrelated to lists, but when I said that
{noformat}
BEGIN BATCH
  UPDATE user SET name = 'Goo' WHERE userid = 1;
  UPDATE user SET name = 'Foo' WHERE userid = 1;
APPLY BATCH
{noformat}
will always end up with 'Goo', it's not quite true currently, because batches don't guarantee
that all updates will use the same timestamp (in other words, the result of the batch above
randomly depends on the timing of the operations).  I think that *that* is a guarantee we should
provide: that unless the timestamp is user provided, all statements of a batch use the same
timestamp. I'm attaching a second patch that implements that.
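
As a rough illustration of the guarantee proposed above (not the attached patch itself), the following Python sketch stamps every statement of a batch with one shared timestamp unless the statement carries its own. The function name `assign_batch_timestamps`, the `(cql_text, user_timestamp)` tuple shape, and the use of microseconds are all assumptions of this sketch.

```python
import time
from typing import List, Optional, Tuple

Statement = Tuple[str, Optional[int]]  # (cql_text, user-provided timestamp or None)


def assign_batch_timestamps(
    statements: List[Statement], now_micros: Optional[int] = None
) -> List[Tuple[str, int]]:
    """Give every statement without a user timestamp the same batch timestamp."""
    # Read the clock once for the whole batch, not once per statement.
    shared = now_micros if now_micros is not None else int(time.time() * 1_000_000)
    return [(cql, ts if ts is not None else shared) for cql, ts in statements]


batch = [
    ("UPDATE user SET name = 'Goo' WHERE userid = 1", None),
    ("UPDATE user SET name = 'Foo' WHERE userid = 1", None),
]
stamped = assign_batch_timestamps(batch)
# Both statements carry the same timestamp, so the result no longer depends on timing.
print(stamped[0][1] == stamped[1][1])
```

Reading the clock once per batch is what makes the 'Goo' versus 'Foo' outcome deterministic: with equal timestamps the tie-breaking rules decide, not the timing of each statement.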

                
> Appending/Prepending items to list using BATCH
> ----------------------------------------------
>
>                 Key: CASSANDRA-4835
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4835
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Krzysztof Cieslinski
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 1.2.0 beta 2
>
>         Attachments: 0001-Fix-prepends-within-same-millis.txt, 0002-Ensure-same-timestamp-in-batches.txt
>
>
> As far as I know, there is no guarantee that commands inside a BATCH block will execute in the same order as they are stored in the BATCH block. But...
> I have made two tests:
> The first appends some items to an empty list, and the second prepends items, also to an empty list. Both of them use UPDATE commands stored in a BATCH block.
> Results of those tests are as follows:
> First:
>       When appending new items to list, USING commands are executed in the same order as they are stored in BATCH.
> Second:
>       When prepending new items to list, USING commands are executed in random order.
 
> So, in other words below code:
> {code:xml}
> BEGIN BATCH
>  UPDATE... list_name = list_name + [ '1' ]  
>  UPDATE... list_name = list_name + [ '2' ]
>  UPDATE... list_name = list_name + [ '3' ] 
> APPLY BATCH;{code}
>  always results in [ '1', '2', '3' ],
>  but this code:
> {code:xml}
> BEGIN BATCH
>  UPDATE... list_name = [ '1' ] + list_name   
>  UPDATE... list_name = [ '2' ] + list_name
>  UPDATE... list_name = [ '3' ] + list_name
> APPLY BATCH;{code}
> results in a randomly ordered list, like [ '2', '1', '3' ]    (expected result is [ '3', '2', '1' ])
> So somehow, when appending items to a list, commands from the BATCH are executed in the order they are stored, but when prepending, the order is random.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
