cassandra-commits mailing list archives

From "Peter Bailis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7056) Add RAMP transactions
Date Sat, 20 Sep 2014 23:08:36 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142227#comment-14142227 ]

Peter Bailis commented on CASSANDRA-7056:
-----------------------------------------

bq. Let's assume we query from partition A and B, and we see the results don't match timestamps,
we would pull the latest batchlog assuming they are from the same batch but let's say they
in fact are not. In this case we wasted a lot of time so my question is should we only do
this if the user supplies a new CL type?

If you set the same, unique (e.g., UUID) write timestamp for all writes in a batch, then you
know that any results with different timestamps are part of different batches. So, given
mismatched timestamps, should you check the batchlog for pending writes? One solution is to
always check (as in RAMP-Small). This doesn't require any extra metadata, but, as you point
out, it also requires 2 RTTs. To cut down on these RTTs, you could also attach a Bloom filter
of the items in each batch and only check for possibly missing writes (as in RAMP-Hybrid).
(I can go into more detail if you want.) However, I agree that you might not want to pay these
costs *all* of the time for reads. Would a BATCH_READ or other modifier to CQL SELECT statements
make sense?
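
As a rough illustration of the Bloom filter idea, here is a minimal Python sketch of how a reader
might decide which keys need a second round. It is not Cassandra code: `Version`, `batch_filter`
and `keys_to_recheck` are hypothetical names, and an exact set stands in for the Bloom filter
(a real filter could also return false positives, which would only cause extra checks).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass(frozen=True)
class Version:
    value: str
    batch_ts: int                  # assumed: every write in one batch shares this timestamp
    batch_filter: FrozenSet[str]   # stand-in for a Bloom filter over the keys in the batch


def keys_to_recheck(first_round: Dict[str, Version]) -> Set[str]:
    """Return the keys that may be missing a write from a newer batch.

    A key needs a second look when some other result carries a higher
    batch timestamp and that batch's filter says it may include the key.
    """
    stale: Set[str] = set()
    for key, version in first_round.items():
        for other in first_round.values():
            if other.batch_ts > version.batch_ts and key in other.batch_filter:
                stale.add(key)
                break
    return stale


# Batch 2 wrote both A and B, but the read saw the new A and the old B.
partial_read = {
    "A": Version("a2", 2, frozenset({"A", "B"})),
    "B": Version("b1", 1, frozenset({"A", "B"})),
}

# Timestamps differ, but the newer batch never touched B.
independent_read = {
    "A": Version("a2", 2, frozenset({"A"})),
    "B": Version("b1", 1, frozenset({"B"})),
}
```

In the first case only B would need a batchlog check; in the second, the mismatched timestamps
alone do not trigger any extra round trip.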

bq. In the case of a global index we plan on reading the data after reading the index. The
data query might reveal the indexed value is stale. We would need to apply the batchlog and
fix the index, would we then restart the entire query? or maybe overquery assuming some index
values will be stale? Either way this query looks different than the above scenario.

I think there are a few options. The easiest is to simply filter out the out-of-date rows,
and then you are guaranteed to see a subset of the index entries. Alternatively, you could
provide a "snapshot index read" where you read the older, overwritten values from the data
node. If you want a "read latest and read snapshot" mode, there are some options I can describe,
but they generally entail either more metadata or, otherwise, using locks/blocking coordination,
which I don't think you want.
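
To make the "filter out the out-of-date rows" option concrete, here is a small Python sketch
under simplifying assumptions: the index read returns row keys for one indexed value, and the
data read returns each row's current value for that column. The function name and the
dictionary-based data model are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, List


def filter_stale_index_hits(
    indexed_value: str,
    index_hits: Iterable[str],
    data: Dict[str, str],
) -> List[str]:
    """Keep only the row keys whose current data still matches the index entry.

    Rows that were overwritten or deleted after the index entry was read
    are dropped, so the result is a subset of the index entries.
    """
    return [key for key in index_hits if data.get(key) == indexed_value]


# The index claims r1, r2 and r3 all have city == "Oslo".
hits = ["r1", "r2", "r3"]
# The data nodes say r2 has since changed and r3 no longer exists.
rows = {"r1": "Oslo", "r2": "Bergen"}

fresh = filter_stale_index_hits("Oslo", hits, rows)
```

This never restarts the query; the cost is that the reader may see fewer rows than the index
returned.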


> Add RAMP transactions
> ---------------------
>
>                 Key: CASSANDRA-7056
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Core
>            Reporter: Tupshin Harper
>            Priority: Minor
>
> We should take a look at [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/] transactions, and figure out if they can be used to provide more efficient LWT (or LWT-like) operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
