Message-ID: <4CC611DE.1010305@gmail.com>
Date: Tue, 26 Oct 2010 01:25:18 +0200
From: Jérôme Verstrynge
To: Peter Schuller
CC: user@cassandra.apache.org
Subject: Re: What happens if there is a collision?

Peter, thanks for the extensive feedback. Much appreciated.

On 26/10/2010 0:47, Peter Schuller wrote:
> This doesn't mean that your problem is somehow invalid; but it doesn't
> sound like QUORUM consistency (over-writing) writes is the solution.
> What is the difference, from your application's perspective, between
> the timestamp tie and a write simply happening a millisecond later by
> an un-coordinated concurrent writer? In both cases, the data in
> cassandra will no longer match your client's view of it.

I may have misunderstood the meaning of the timestamp in Cassandra. I
was under the impression that writing data with the same key but two
different timestamps would result in two 'rows'. From what you say,
that does not seem to be the case. Can you confirm?
(In other words, the write with the greatest timestamp destroys the
previous records with lower timestamps.)

> I'm repeating myself but just to be clear: So again, it seems to me
> such an ACK would not be useful since you would not be made aware of
> any change that happens later on anyway. It does not seem semantically
> "relevant" except perhaps as a probabilistic optimization. As soon as
> your write completes, you have no idea what is in Cassandra,
> regardless of timestamp ties (assuming you have the potential for
> concurrent writers).

Assuming the latest timestamp erases/overwrites previous entries, I
agree.

>> If 'value breaks timestamp-tie', how does Cassandra behave in case of
>> updates? If there is a column with value 'AAA' at 334450 ms and an
>> application explicitly wants to update this value to 'ZZZ' for 334450 ms,
>> it seems like the timestamp-tie will prevent that. Hence, the
>> update/mutation would be nondeterministic to E. It seems like one should
>> first delete the existing record and write a new one (and that could lead to
>> race conditions and timestamp-ties too).
> A single client wishing to make multiple logically subsequent writes
> should ensure that the same timestamp is not used for such writes.

Makes sense if the latest timestamp erases/overwrites previous data.

>> I think this should be documented, because engineers will hit that 'local'
>> nondeterministic issue for sure if two instances of their applications
>> perform 'completed writes' in the same column family. Completed does not
>> mean successful, even with quorum (or ALL). They ought to know it.
> I think it does. I believe the results you are describing as
> unexpected are fully expected fundamentally, and there is no real
> difference implied in receiving a timestamp ACK flag back. I'm totally
> open to being wrong or having misunderstood something (or both), but
> right now I don't see it.
> If on the other hand I'm not wrong then
> perhaps we can figure out how to document or present the functionality
> of Cassandra better :)

I know I am boxing a corner case, but I have not seen in the
documentation that the latest timestamp erases/overwrites previous
data. Now, I may have missed something here. Maybe I did not rub my
eyes enough or the coffee had not kicked in yet.

If not, I would suggest adding some short documentation to the wiki
explaining:

i) That the most recent timestamp overwrites previous entries with
lower timestamps.
ii) That in case of timestamp ties, the value breaks the tie.
iii) What about ColumnFamilies and SuperColumnFamilies? Do we have the
guarantee that, in case of timestamp ties, the whole record of the
winner is registered? (I would assume yes, of course.)

I believe something 'official' and explicit from the Cassandra leaders
would close the gap on assumptions and interpretations made by newbies
like me. Timestamp really looks like a 'key' to me.

Thanks,

Jérôme
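
To make points i) and ii) concrete, here is a minimal sketch of the reconciliation rule as it is described in this thread: the higher timestamp wins, and on a tie the value decides. The `Column` class and the `reconcile` helper are hypothetical names used for illustration only, and the tie-break direction (the byte-wise greater value wins) is an assumption; this is not Cassandra's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for one version of a column."""
    name: str
    value: bytes
    timestamp: int  # client-supplied timestamp


def reconcile(a: Column, b: Column) -> Column:
    """Return the version that survives when two writes hit the same column."""
    if a.timestamp != b.timestamp:
        # Point i): the most recent timestamp overwrites the older entry.
        return a if a.timestamp > b.timestamp else b
    # Point ii): timestamp tie. Assumed rule: the byte-wise greater value wins.
    return a if a.value >= b.value else b


aaa = Column("col", b"AAA", 334450)
zzz = Column("col", b"ZZZ", 334450)
later_aaa = Column("col", b"AAA", 334451)

# Same timestamp: the outcome does not depend on arrival order.
tie_winner = reconcile(aaa, zzz)
tie_winner_reversed = reconcile(zzz, aaa)

# A strictly later timestamp wins regardless of the value.
later_winner = reconcile(zzz, later_aaa)
```

Under these assumptions, a write of 'AAA' at 334450 over an existing 'ZZZ' at 334450 would complete without changing what is stored, which is why a single client should never reuse a timestamp for logically subsequent writes.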