cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4285) Atomic, eventually-consistent batches
Date Mon, 28 May 2012 16:32:23 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284462#comment-13284462 ]

Jonathan Ellis commented on CASSANDRA-4285:
-------------------------------------------

In the trigger discussion (starting [here|https://issues.apache.org/jira/browse/CASSANDRA-1311?focusedCommentId=13137492&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13137492]
and continuing [here|https://issues.apache.org/jira/browse/CASSANDRA-1311?focusedCommentId=13245418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13245418])
I proposed a distributed commitlog that the coordinator could use to retry partially successful
batches.

To flesh that out a bit, I think there are two approaches we can take here:
# the distributed commitlog ("DCL") approach discussed in the trigger ticket.  This is definitely
more complex.  Assuming DCL RF > 1, if a write to any DCL replica succeeds, the commitlog
write can be considered successful -- and a timeout still means "we don't know, retry."  Then
you have the complexity of DCL replay -- sort of like hinted handoff in reverse, in the sense
that we need data from other nodes that may or may not be up at the same time as us -- and
of course it's going to basically halve write performance.
# a local-only approach, where batches are written to a non-replicated system CF the way hints
are now.  This would provide adequate durability when we can rely on RAID1/RAID10 local disks;
after all, we don't need to preserve this data indefinitely, only until it's
persisted to the other replicas.  However, this is a non-starter for cloud environments where
the provider will just nuke VMs out from under you if there's a problem, and even for non-cloud
environments many prefer to deploy on RAID0 instead of paying the space overhead for RAID10.
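
The acknowledgement rule in the first approach can be sketched roughly as follows. This is only an illustration in Python, not Cassandra code: `LogReplica`, `LogWriteResult` and `write_batch_log` are hypothetical names, and the replicas are in-memory stand-ins. It assumes DCL RF > 1, treats one acknowledgement from any DCL replica as success, and reports "unknown" (retry) when every replica times out.

```python
from dataclasses import dataclass, field
from enum import Enum


class LogWriteResult(Enum):
    SUCCESS = "success"  # at least one DCL replica acknowledged the batch
    UNKNOWN = "unknown"  # every replica timed out: "we don't know, retry"


@dataclass
class LogReplica:
    """Hypothetical in-memory stand-in for one DCL replica."""
    name: str
    up: bool = True
    stored: dict = field(default_factory=dict)

    def append(self, batch_id, mutations):
        # A replica that is down is modelled as a timeout.
        if not self.up:
            raise TimeoutError(f"{self.name} did not respond")
        self.stored[batch_id] = list(mutations)


def write_batch_log(replicas, batch_id, mutations):
    """Send the batch to every DCL replica; one ack is enough."""
    acks = 0
    for replica in replicas:
        try:
            replica.append(batch_id, mutations)
            acks += 1
        except TimeoutError:
            # No ack from this replica; the others may still succeed.
            pass
    return LogWriteResult.SUCCESS if acks >= 1 else LogWriteResult.UNKNOWN


replicas = [LogReplica("dcl-1", up=False), LogReplica("dcl-2")]
result = write_batch_log(replicas, "batch-1", ["mutation-a", "mutation-b"])
```

With one of two replicas down the write still counts as successful; only when all of them time out does the coordinator have to treat the outcome as unknown.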

So I think we should
- Start with the distributed commitlog since it is more generally useful, but
- Make batch atomicity optional, so users who don't need it don't pay any performance penalty
over what we have now
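
One way to picture the opt-in behaviour is the sketch below, again in Python with hypothetical names (`Coordinator`, `execute`, `replay`, and an `atomic` flag); none of this is an existing Cassandra API. A plain dict stands in for the distributed commitlog, and replaying a whole batch assumes its mutations are idempotent. Non-atomic batches never touch the log, so they pay no extra write.

```python
class Coordinator:
    """Hypothetical coordinator; `batch_log` stands in for the distributed commitlog."""

    def __init__(self, apply_mutation):
        self.apply_mutation = apply_mutation  # callable, raises ConnectionError on failure
        self.batch_log = {}                   # batch_id -> mutations awaiting replay
        self.log_writes = 0                   # counts the extra writes atomic batches pay

    def _apply_all(self, mutations):
        """Try every mutation; return True only if all of them succeeded."""
        ok = True
        for mutation in mutations:
            try:
                self.apply_mutation(mutation)
            except ConnectionError:
                ok = False
        return ok

    def execute(self, batch_id, mutations, atomic=False):
        if atomic:
            # Record the batch first so a partial failure can be retried later.
            self.batch_log[batch_id] = list(mutations)
            self.log_writes += 1
        ok = self._apply_all(mutations)
        if atomic and ok:
            # Fully applied: the log entry is no longer needed.
            del self.batch_log[batch_id]
        return ok

    def replay(self):
        """Retry every batch still in the log (assumes idempotent mutations)."""
        for batch_id, mutations in list(self.batch_log.items()):
            if self._apply_all(mutations):
                del self.batch_log[batch_id]
```

A partially successful atomic batch stays in `batch_log` until a later `replay()` applies all of it, while a non-atomic batch with the same failure is simply reported as failed.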
                
> Atomic, eventually-consistent batches
> -------------------------------------
>
>                 Key: CASSANDRA-4285
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4285
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>
> I discussed this in the context of triggers (CASSANDRA-1311) but it's useful as a standalone
> feature as well.
