From: Guy Incognito
Date: Sun, 13 Nov 2011 14:52:21 +0000
To: user@cassandra.apache.org
Subject: Re: indexes from CassandraSF

[1] i'm not particularly worried about transient conditions so that's ok.
i think there's still the possibility of a non-transient false
positive...if 2 writes were to happen at exactly the same time (highly
unlikely), eg

1) A reads previous location (L1) from index entries
2) B reads previous location (L1) from index entries
3) A deletes previous location (L1) from index entries
4) B deletes previous location (L1) from index entries
5) A deletes previous location (L1) from index
6) B deletes previous location (L1) from index
7) A enters new location (L2) into index entries
8) B enters new location (L3) into index entries
9) A enters new location (L2) into index
10) B enters new location (L3) into index
11) A sets new location (L2) on users
12) B sets new location (L3) on users

after this, don't i end up with an incorrect L2 location in index entries
and in the index, that won't be resolved until the next write of location
for that user?

[2] ah i see...so the client would continuously retry until the update
works. that's fine provided the client doesn't bomb out with some other
error. if that were to happen then i have potentially deleted the index
entry columns without deleting the corresponding index columns.
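
To make the interleaving in [1] concrete, here is a minimal in-memory sketch. The three dicts are hypothetical stand-ins for Users, Users_Index_Entries and the Users_By_Location row of Indexes; nothing here talks to Cassandra. It assumes B's write to Users lands last (so the user ends up at L3) and ignores timestamps entirely. The search function applies the row check suggested in the quoted reply: only return users whose row still holds the value searched for.

```python
# In-memory stand-ins for the three column families. These dicts are
# hypothetical illustrations, not a Cassandra client.
users = {"u1": "L1"}            # Users: user -> current location
index_entries = {"u1": {"L1"}}  # Users_Index_Entries: user -> indexed locations
index = {"L1": {"u1"}}          # Indexes["Users_By_Location"]: location -> users


def interleave(user, new_a, new_b):
    """Replay steps 1-12 with writers A and B strictly alternating."""
    # Steps 1-2: both writers read the same previous entries.
    prev_a = set(index_entries[user])
    prev_b = set(index_entries[user])
    # Steps 3-4: both delete those entries from the index entries.
    index_entries[user] -= prev_a
    index_entries[user] -= prev_b
    # Steps 5-6: both delete them from the index.
    for loc in prev_a | prev_b:
        index.get(loc, set()).discard(user)
    # Steps 7-8: each writer records its own new location in the index entries.
    index_entries[user].add(new_a)
    index_entries[user].add(new_b)
    # Steps 9-10: each writer records its own new location in the index.
    index.setdefault(new_a, set()).add(user)
    index.setdefault(new_b, set()).add(user)
    # Steps 11-12: both set the user's location; B is assumed to win.
    users[user] = new_a
    users[user] = new_b


def search(location):
    """Index lookup that drops users whose row no longer matches."""
    return sorted(u for u in index.get(location, set()) if users[u] == location)


interleave("u1", "L2", "L3")
raw_hits = sorted(index.get("L2", set()))
print(raw_hits)      # ['u1']  stale entry left behind by writer A
print(search("L2"))  # []      the row check filters the false positive
print(search("L3"))  # ['u1']
```

In this sketch neither writer ever read the L2 entry, so nothing deletes it until the next location write for that user, which reads both L2 and L3 from the index entries and removes them.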

i can handle both of the above for my use case, i just want to clarify
whether they are possible (however unlikely) scenarios.

On 13/11/2011 02:41, Ed Anuff wrote:
> 1) The index updates should be eventually consistent. This does mean
> that you can get a transient false-positive on your search results.
> If this doesn't work for you, then you either need to use ZK or some
> other locking solution or do "read repair" by making sure that the row
> you retrieve contains the value you're searching for before passing it
> on to the rest of your application.
>
> 2) You should be able to reapply the batch updates until they succeed.
> The update is idempotent. One thing that's important that the slides
> don't make clear is that this requires using time-based UUIDs as your
> timestamp components. Take a look at the sample code.
>
> Hope this helps,
>
> Ed
>
> On Sat, Nov 12, 2011 at 3:59 PM, Guy Incognito wrote:
>> help?
>>
>> On 10/11/2011 19:34, Guy Incognito wrote:
>>> hi,
>>>
>>> i've been looking at the model below from Ed Anuff's presentation at
>>> Cassandra SF (http://www.slideshare.net/edanuff/indexing-in-cassandra).
>>> Couple of questions:
>>>
>>> 1) Isn't there still the chance that two concurrent updates may end up
>>> with the index containing two entries for the given user, only one of which
>>> would match the actual value in the Users cf?
>>>
>>> 2) What happens if your batch fails partway through the update? If i
>>> understand correctly there are no guarantees about ordering when a batch is
>>> executed, so isn't it possible that eg the previous
>>> value entries in Users_Index_Entries may have been deleted, and then the
>>> batch fails before the entries in Indexes are deleted, ie the mechanism has
>>> 'lost' those values? I assume this can be addressed
>>> by not deleting the old entries until the batch has succeeded (ie put the
>>> previous entry deletion into a separate, subsequent batch). this at least
>>> lets you retry at a later time.
>>>
>>> perhaps i'm missing something?
>>>
>>> SELECT {"location"}..{"location", *}
>>> FROM Users_Index_Entries WHERE KEY =;
>>>
>>> BEGIN BATCH
>>>
>>> DELETE {"location", ts1}, {"location", ts2}, ...
>>> FROM Users_Index_Entries WHERE KEY =;
>>>
>>> DELETE {,, ts1}, {,, ts2}, ...
>>> FROM Indexes WHERE KEY = "Users_By_Location";
>>>
>>> UPDATE Users_Index_Entries SET {"location", ts3} =
>>> WHERE KEY=;
>>>
>>> UPDATE Indexes SET {,, ts3} = null
>>> WHERE KEY = "Users_By_Location";
>>>
>>> UPDATE Users SET location =
>>> WHERE KEY =;
>>>
>>> APPLY BATCH
>>>
>>
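
The retry behaviour discussed in this thread can be sketched with a toy model of the quoted batch. Everything below is an assumption made for illustration: the store is a plain dict rather than a Cassandra client, the function names are hypothetical, and the elided composite column in Indexes is read as (value, user key, timestamp). The one detail taken from the reply is that the new timestamp component is a time-based UUID, created once when the batch is built so that every retry reuses it.

```python
import copy
import uuid


def build_batch(user_key, new_location, old_entries):
    """old_entries: (old_location, ts) pairs read from Users_Index_Entries."""
    ts3 = uuid.uuid1()  # time-based UUID, created once and reused by retries
    batch = []
    for old_location, ts in old_entries:
        batch.append(("delete", "Users_Index_Entries", user_key,
                      ("location", ts), None))
        batch.append(("delete", "Indexes", "Users_By_Location",
                      (old_location, user_key, ts), None))
    batch.append(("set", "Users_Index_Entries", user_key,
                  ("location", ts3), new_location))
    batch.append(("set", "Indexes", "Users_By_Location",
                  (new_location, user_key, ts3), None))
    batch.append(("set", "Users", user_key, "location", new_location))
    return batch


def apply_batch(store, batch, fail_after=None):
    """Apply mutations in order; optionally fail before mutation fail_after."""
    for i, (op, cf, row, column, value) in enumerate(batch):
        if fail_after is not None and i == fail_after:
            raise IOError("simulated failure partway through the batch")
        columns = store.setdefault((cf, row), {})
        if op == "delete":
            columns.pop(column, None)  # deleting a missing column is a no-op
        else:
            columns[column] = value    # setting the same column twice is harmless


def apply_until_success(store, batch, failures=()):
    """Reapply the same batch until it goes through; return the attempt count."""
    pending = list(failures)
    attempts = 0
    while True:
        attempts += 1
        try:
            apply_batch(store, batch, pending.pop(0) if pending else None)
            return attempts
        except IOError:
            continue


ts1 = uuid.uuid1()
store = {
    ("Users", "u1"): {"location": "L1"},
    ("Users_Index_Entries", "u1"): {("location", ts1): "L1"},
    ("Indexes", "Users_By_Location"): {("L1", "u1", ts1): None},
}
batch = build_batch("u1", "L2", [("L1", ts1)])
# First attempt dies after deleting the index entry but before touching Indexes.
attempts = apply_until_success(store, batch, failures=[1])
snapshot = copy.deepcopy(store)
apply_batch(store, batch)  # one more replay leaves the state unchanged
```

The partial failure here is the case raised in question 2: the index entry is gone but the index column is not. Because the client still holds the full batch, replaying it finishes the job. If the client lost the batch instead, the old index column would linger, which is the scenario the follow-up message asks about.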