Date: Fri, 11 Oct 2013 07:10:42 +0000 (UTC)
From: "Sylvain Lebresne (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-6178) Consider allowing timestamp at the protocol level ... and deprecating server side timestamps

    [ https://issues.apache.org/jira/browse/CASSANDRA-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792421#comment-13792421 ]

Sylvain Lebresne commented on CASSANDRA-6178:
---------------------------------------------

bq. sounds like a lot like "I want read-my-writes consistency" which is also only sane if restricted to a single session (connection)

I'm not totally sure I understand what you mean here, but I'm sure I disagree with that statement if session == server connection.
Because if you restrict that to one server connection, that would mean we don't guarantee "read-my-writes consistency" in the face of a server failure, while we do, and that's the important part. Besides, neither Hector nor Astyanax (to cite only Java drivers) maps a client thread/session to a unique server connection in general, so if "read-my-writes consistency" were only sane for one server connection, none of those clients would ever guarantee it, and they do.

So back to the issue at hand, I agree that "I want my operations to be sequential with respect to the order the client issued them" is only sane if restricted to one client thread/session, but I'm saying that with server-side timestamps we do not guarantee that today, since no serious client driver I know of maps a client thread to a unique server connection at all times (for very good reasons).

To be very concrete, we've already had 3 reports on the Java driver (and some reports on the Python one too) of people running a test as simple as:
{noformat}
session.execute(new SimpleStatement("INSERT INTO test (k, v) VALUES (0, 1)").setConsistencyLevel(ConsistencyLevel.ALL));
session.execute(new SimpleStatement("INSERT INTO test (k, v) VALUES (0, 2)").setConsistencyLevel(ConsistencyLevel.ALL));
{noformat}
and being surprised that at the end, the value was sometimes 2, but sometimes 1. While this behavior can be explained by the fact that timestamps are only assigned server side and that both queries might not reach the same coordinator, I have a very hard time considering this an OK "default" behavior, and I'm pretty sure any new user would consider it a break of the consistency guarantees. And while I'd agree that inserting a value and overriding it right away is not too useful in real life, that's still something easy to run into when you're testing C* to try to understand the consistency guarantees it provides.

> Consider allowing timestamp at the protocol level ... and deprecating server side timestamps
> --------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-6178
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6178
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
>
> Generating timestamps server side by default for CQL has been done for convenience, so that end users don't have to provide one with every query. However, doing it server side has the downside that updates made sequentially by one single client (thread) are not guaranteed to have sequentially increasing timestamps. That is, unless a client thread is always pinned to one specific server connection, but no good client driver out there (Thrift drivers included) does that, because it contradicts abstracting fault tolerance for the driver user (and goes against most sane load balancing strategies).
> Very concretely, this means that if you write a very trivial test program that sequentially inserts a value and then erases it (or overwrites it), then, if you let CQL pick the timestamp server side, the deletion might not erase the just-inserted value (because the delete might reach a different coordinator than the insert and thus get a lower timestamp). From the user's point of view, this is very confusing behavior, and understandably so: if timestamps are optional, you'd hope that they at least respect the sequentiality of operations from a single client thread.
> Of course we do support client-side assigned timestamps, so it's not like the test above is not fixable. And you could argue that it's not a bug per se. Still, it's a very confusing "default" behavior for something very simple, which suggests it's not the best default.
> You could also argue that inserting a value and deleting/overwriting it right away (in the same thread) is not something real programs often do.
And indeed, it's likely that in practice server-side timestamps work fine for most real applications. Still, it's too easy to get counter-intuitive behavior with server-side timestamps and I think we should consider moving away from them.
> So what I'd suggest is that we push the job of providing timestamps back to the client side. But to make it easy for the driver to generate them (rather than the end user), we should allow providing said timestamp at the protocol level.
> As a side note, letting the client provide the timestamp would also have the advantage of making it easy for the driver to retry failed operations with their initial timestamp, so that retries are truly idempotent.

--
This message was sent by Atlassian JIRA (v6.1#6144)
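
For readers who want to see the effect in isolation, here is a minimal, self-contained Java sketch of the scenario discussed in this message. It is not Cassandra or driver code: `Cell`, `Coordinator`, and `MonotonicTimestamps` are hypothetical names, the clock skew is a made-up number, and last-write-wins is reduced to a single timestamp comparison (real tie-breaking rules are ignored). It only illustrates why two sequential writes can resolve to the older value when each coordinator stamps the write with its own clock, and why a per-client monotonic timestamp avoids that.

```java
import java.util.concurrent.atomic.AtomicLong;

final class TimestampDemo {

    /** One cell resolved by last-write-wins: the write with the highest timestamp survives. */
    static final class Cell {
        private long timestamp = Long.MIN_VALUE;
        private int value;

        void write(int newValue, long writeTimestamp) {
            // Simplification: on a tie the existing value is kept.
            if (writeTimestamp > timestamp) {
                timestamp = writeTimestamp;
                value = newValue;
            }
        }

        int value() {
            return value;
        }
    }

    /** Hypothetical coordinator whose clock is off by a fixed number of microseconds. */
    static final class Coordinator {
        private final long skewMicros;

        Coordinator(long skewMicros) {
            this.skewMicros = skewMicros;
        }

        long assignTimestamp(long trueTimeMicros) {
            return trueTimeMicros + skewMicros;
        }
    }

    /** Client-side generator: strictly increasing even if the clock reading repeats. */
    static final class MonotonicTimestamps {
        private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

        long next(long nowMicros) {
            return last.updateAndGet(prev -> Math.max(prev + 1, nowMicros));
        }
    }

    /** Two sequential writes, each stamped by the coordinator that happens to receive it. */
    static int serverSide(Coordinator first, Coordinator second) {
        Cell cell = new Cell();
        cell.write(1, first.assignTimestamp(1_000_000L));
        cell.write(2, second.assignTimestamp(1_000_100L)); // issued 100 microseconds later
        return cell.value();
    }

    /** The same two writes, stamped by one client-side generator before being sent. */
    static int clientSide() {
        MonotonicTimestamps generator = new MonotonicTimestamps();
        Cell cell = new Cell();
        cell.write(1, generator.next(1_000_000L));
        cell.write(2, generator.next(1_000_000L)); // same clock reading, still ordered
        return cell.value();
    }

    public static void main(String[] args) {
        System.out.println(serverSide(new Coordinator(0), new Coordinator(0)));    // prints 2
        System.out.println(serverSide(new Coordinator(0), new Coordinator(-500))); // prints 1
        System.out.println(clientSide());                                          // prints 2
    }
}
```

In the second call the coordinator that receives the second write has a clock 500 microseconds behind, so the later write gets the lower timestamp and the value stays 1. The client-side generator never hands out a timestamp lower than or equal to the previous one, which is the property the issue asks the driver to be able to provide.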