Subject: Re: Retrieving old data version for a given row
From: Felipe Schmidt
To: user@cassandra.apache.org
Date: Thu, 24 May 2012 11:42:50 +0200
In-Reply-To: <4FB3B89A.7050803@mebigfatguy.com>
References: <201205141035386232944@software.ict.ac.cn> <4FB3B89A.7050803@mebigfatguy.com>

Ok... it's really strange to me that Cassandra doesn't support data
versioning, because all the other key-value databases I know of support it.

I have one remaining question:
- If I have more than one SSTable on disk for the same column, each holding
  a different version of the data, is it possible to query for the old
  version instead of the newest one?

Regards,
Felipe Mathias Schmidt
(Computer Science UFRGS, RS, Brazil)

2012/5/16 Dave Brosius:
> You're in for a world of hurt going down that rabbit hole. If you truly
> want versioned data then you should think about changing your keying to
> perhaps be a composite key where the key is of the form
>
> NaturalKey/VersionId
>
> Or if you want the versioning at the column level, use composite columns
> with a ColumnName/VersionId format
>
> On 05/16/2012 10:16 AM, Felipe Schmidt wrote:
>>
>> That was very helpful, thank you very much!
>>
>> I still have some questions:
>> - Is it possible to make Cassandra keep old values after flushing?
>> The same question applies to the memtable, before flushing. It seems to
>> me that when I update a tuple, the old data is overwritten in the
>> memtable, even before flushing.
>> - Is it possible to scan values from the memtable, maybe using the
>> Thrift API? Using the client API I can only see the newest
>> data version; I can't see what's really happening in the memtable.
>>
>> I ask because I'm trying to implement Change Data Capture for
>> Cassandra, and the answers will define what kinds of approaches I'm
>> able to use.
>>
>> Thanks in advance.
>>
>> Regards,
>> Felipe Mathias Schmidt
>> (Computer Science UFRGS, RS, Brazil)
>>
>>
>> 2012/5/14 aaron morton:
>>>
>>> Cassandra does not provide access to multiple versions of the same
>>> column. It is essentially an implementation detail.
>>>
>>> All mutations are written to the commit log in a binary format, see
>>> o.a.c.db.RowMutation.getSerializedBuffer(). (If you want to tail it for
>>> analysis you may want to change commitlog_sync in cassandra.yaml.)
>>>
>>> Here is a post about looking at multiple versions of a column in an
>>> sstable: http://thelastpickle.com/2011/05/15/Deletes-and-Tombstones/
>>>
>>> Remember that not all "versions" of a column are written to disk
>>> (see http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/).
>>> Also, compaction will collapse multiple versions of the same column
>>> from multiple files into a single version in a single file.
>>>
>>> Hope that helps.
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 14/05/2012, at 9:50 PM, Felipe Schmidt wrote:
>>>
>>> Yes, I need this information just for academic purposes.
>>>
>>> So, to read old data values, I tried to open the commit log using tail
>>> -f and also the log file viewer in Ubuntu, but I cannot see much
>>> information inside the log!
>>> Is there any other way to open this log? I didn't find any Cassandra
>>> API for this purpose.
>>>
>>> Thanks everybody in advance.
>>>
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>>
>>>
>>> 2012/5/14 zhangcheng2:
>>>
>>> After compaction, the old version of the data will be gone!
>>>
>>> ________________________________
>>> zhangcheng2
>>>
>>> From: Felipe Schmidt
>>> Date: 2012-05-14 05:33
>>> To: user
>>> Subject: Retrieving old data version for a given row
>>>
>>> I'm trying to retrieve an old data version for some row but it seems
>>> not to be possible. I'm a beginner with Cassandra and the only approach
>>> I know is looking at the SSTable in the storage folder, but if I insert
>>> some column and right after insert another value into the same row,
>>> after flushing, I only get the last value.
>>> Is there any way to get the old data version? Obviously, before
>>> compaction.
>>>
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
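
For anyone reading this thread later: the ColumnName/VersionId idea Dave
Brosius describes can be illustrated with a small in-memory model. This is
only a rough sketch in plain Python, not Cassandra code and not anything from
the thread itself. The VersionedRow class and its write/versions/latest/as_of
methods are hypothetical names invented for the illustration, and the version
ids are assumed to be integers chosen by the application. The point is just
that each write lands under its own composite name, so an update adds a column
instead of overwriting one.

```python
class VersionedRow:
    """Hypothetical in-memory stand-in for one row that keeps every
    version by storing values under a composite (column_name, version_id)
    name. It does not talk to Cassandra."""

    def __init__(self):
        # (column_name, version_id) -> value
        self._columns = {}

    def write(self, column_name, version_id, value):
        # A new version_id creates a new composite column, so older
        # versions are left untouched.
        self._columns[(column_name, version_id)] = value

    def versions(self, column_name):
        # All (version_id, value) pairs for one column, oldest first.
        found = [
            (version_id, value)
            for (name, version_id), value in self._columns.items()
            if name == column_name
        ]
        return sorted(found, key=lambda pair: pair[0])

    def latest(self, column_name):
        # Value with the highest version_id, or None if never written.
        history = self.versions(column_name)
        return history[-1][1] if history else None

    def as_of(self, column_name, version_id):
        # Newest value whose version is <= version_id, or None.
        older = [pair for pair in self.versions(column_name)
                 if pair[0] <= version_id]
        return older[-1][1] if older else None


row = VersionedRow()
row.write("email", 1, "old@example.org")
row.write("email", 2, "new@example.org")
print(row.latest("email"))
print(row.as_of("email", 1))
```

In a real schema the version id would more likely be a timestamp or a
time-based UUID, and the lookups would be column slices on the composite
name. Those details depend on the client library and are left out here.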