Message-ID: <4BBBA3DB.3030804@shic.co.uk>
Date: Tue, 06 Apr 2010 22:12:59 +0100
From: Steve
To: user@cassandra.apache.org
Subject: Re: A question of 'referential integrity'...
References: <4BBB1431.5070408@shic.co.uk> <1270564013.8771.157.camel@erebus.lan>
 <4BBB6B71.7010000@shic.co.uk> <4BBB81D9.4000502@shic.co.uk>

On 06/04/2010 21:40, Benjamin Black wrote:
> I suggest the reasons you list (which are certainly great reasons!)
> are also the reasons there is no referential integrity or transaction
> support.

Quite. I'm not trying to make recommendations for how Cassandra should
be changed to be more like a traditional RDBMS... I just have a
requirement, at the logical level, that would be trivial with
traditional technology - so the analogy seemed an ideal way to
illustrate the issue.

> It seems the common practice of using a system like
> Zookeeper for the synchronization parts alongside Cassandra would be
> applicable here. Have you investigated that?
>

I started looking at Zookeeper when it was mentioned in an earlier
reply. I've discovered it supports something called "Ledgers" - but I'm
still unclear whether they'd be useful to me - I've only uncovered a very
high-level overview so far. I'm concerned that Zookeeper looks as if it
might become a problematic bottleneck if all the updates must be routed
through it.

I don't see Zookeeper mutexes as being especially helpful, because my
problem isn't really about two incompatible requests arriving in quick
succession. Rather, it is about needing to ensure that "referential
integrity" is eventually established between two otherwise independent
keysets. I need to eliminate the possibility that I end up with
'dangling', inaccessible data should a hash-value become recorded in the
range of the first map but not in the domain of the second (or vice
versa).

Should I assume that it isn't common practice to write updates
atomically in real time, and batch process them 'off-line' to increase
the atomic granularity? It seems an obvious strategy... possibly one for
which an implementation might use "MapReduce" or something similar? I
don't want to re-invent the wheel, of course.
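
To make the 'off-line' check concrete, here is a minimal sketch of what such a batch pass might compute. It assumes both maps have already been read out of the store into plain Python dicts; the names `first_map`, `second_map` and `find_dangling` are hypothetical and nothing here is a Cassandra or Zookeeper API.

```python
def find_dangling(first_map, second_map):
    """Compare the range of first_map with the domain of second_map.

    first_map:  key -> hash-value   (hypothetical layout)
    second_map: hash-value -> data  (hypothetical layout)

    Returns a pair of sets:
      missing  - hash-values referenced by first_map but absent from second_map
      orphaned - hash-values stored in second_map that nothing references
    """
    referenced = set(first_map.values())  # range of the first map
    stored = set(second_map.keys())       # domain of the second map
    return referenced - stored, stored - referenced


# Tiny illustrative data set (made up for this sketch).
first_map = {"a": "h1", "b": "h2"}
second_map = {"h2": b"payload-2", "h3": b"payload-3"}

missing, orphaned = find_dangling(first_map, second_map)
print(sorted(missing))   # ['h1']
print(sorted(orphaned))  # ['h3']
```

What to do with the two sets (retry the write, delete the orphan, or just report it) is a policy decision the sketch deliberately leaves open. Over a large data set the same set difference could presumably be computed by a MapReduce-style job keyed on the hash-value.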