incubator-cassandra-user mailing list archives

From Felipe Schmidt <felipef...@gmail.com>
Subject Re: Retrieving old data version for a given row
Date Thu, 24 May 2012 09:42:50 GMT
OK... it seems really strange to me that Cassandra doesn't support data
versioning, because all the other key-value databases support it (at
least the ones I know).

I have one remaining question:
- If there is more than one SSTable on disk holding the same column
with different data versions, is it possible to query for the old
version instead of the newest one?

Regards,
Felipe Mathias Schmidt
(Computer Science UFRGS, RS, Brazil)




2012/5/16 Dave Brosius <dbrosius@mebigfatguy.com>:
> You're in for a world of hurt going down that rabbit hole. If you truly
> want versioned data then you should think about changing your keying,
> perhaps to a composite key where the key is of the form
>
> NaturalKey/VersionId
>
> Or if you want the versioning at the column level, use composite columns
> with ColumnName/VersionId format
>
>
>
>
> On 05/16/2012 10:16 AM, Felipe Schmidt wrote:
>>
>> That was very helpful, thank you very much!
>>
>> I still have some questions:
>> - Is it possible to make Cassandra keep old values after flushing?
>> The same question applies to the memtable before flushing. It seems to
>> me that when I update a tuple, the old data is overwritten in the
>> memtable even before flushing.
>> - Is it possible to scan values from the memtable, maybe using the
>> so-called Thrift API? Using the client API I can only see the newest
>> data version; I can't see what's really happening in the memtable.
>>
>> I ask because what I am trying to do is Change Data Capture for
>> Cassandra, and the answers will determine which approaches I am able
>> to use.
>>
>> Thanks in advance.
>>
>> Regards,
>> Felipe Mathias Schmidt
>> (Computer Science UFRGS, RS, Brazil)
>>
>>
>> 2012/5/14 aaron morton<aaron@thelastpickle.com>:
>>>
>>> Cassandra does not provide access to multiple versions of the same
>>> column.
>>> It is essentially an implementation detail.
>>>
>>> All mutations are written to the commit log in a binary format; see
>>> o.a.c.db.RowMutation.getSerializedBuffer(). (If you want to tail it for
>>> analysis you may want to change commitlog_sync in cassandra.yaml.)
>>>
>>> Here is a post about looking at multiple versions of a column in an
>>> SSTable: http://thelastpickle.com/2011/05/15/Deletes-and-Tombstones/
>>>
>>> Remember that not all "versions" of a column are written to disk
>>> (see http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/).
>>> Also, compaction will merge multiple versions of the same column from
>>> multiple files into a single version in a single file.
>>>
>>> Hope that helps.
>>>
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 14/05/2012, at 9:50 PM, Felipe Schmidt wrote:
>>>
>>> Yes, I need this information just for academic purposes.
>>>
>>> So, to read old data values, I tried to open the commit log using tail
>>> -f and also the Ubuntu log file viewer, but I cannot see much
>>> information in the log!
>>> Is there any other way to open this log? I didn't find any Cassandra
>>> API for this purpose.
>>>
>>> Thanks everybody in advance.
>>>
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>>
>>>
>>>
>>>
>>> 2012/5/14 zhangcheng2<zhangcheng2@software.ict.ac.cn>:
>>>
>>> After compaction, the old version of the data will be gone!
>>>
>>>
>>> ________________________________
>>>
>>> zhangcheng2
>>>
>>>
>>> From: Felipe Schmidt
>>> Date: 2012-05-14 05:33
>>> To: user
>>> Subject: Retrieving old data version for a given row
>>>
>>> I'm trying to retrieve an old version of the data for some row, but it
>>> does not seem to be possible. I'm a beginner with Cassandra and the only
>>> approach I know is to look at the SSTable in the storage folder, but if I
>>> insert a column and right afterwards insert another value into the same
>>> row, then after flushing I only get the last value.
>>>
>>> Is there any way to get the old data version? Obviously, before
>>> compaction.
>>>
>>>
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>>
>>>
>>>
>
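
The point made in the quoted replies by aaron morton and zhangcheng2, that compaction folds several on-disk versions of a column into one, can be illustrated with a small model. This is only a sketch in plain Python, not Cassandra code: the names `Cell`, `SSTable` and `compact` are hypothetical, each "SSTable" is just a dictionary, and the model assumes the version with the highest write timestamp wins. Tombstones, TTLs and Cassandra's tie-breaking for equal timestamps are deliberately left out.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical simplified types, for illustration only.
Cell = Tuple[int, str]                 # (write timestamp, value)
SSTable = Dict[Tuple[str, str], Cell]  # (row key, column name) -> cell


def compact(sstables: Iterable[SSTable]) -> SSTable:
    """Merge several toy SSTables, keeping one cell per (row, column).

    Assumption: the cell with the highest timestamp wins. On equal
    timestamps this sketch keeps the first cell seen, which is NOT
    how Cassandra breaks ties.
    """
    merged: SSTable = {}
    for table in sstables:
        for key, cell in table.items():
            current = merged.get(key)
            if current is None or cell[0] > current[0]:
                merged[key] = cell
    return merged


# Two flushes that contain different versions of the same column.
older = {("row1", "name"): (100, "old value")}
newer = {("row1", "name"): (200, "new value"),
         ("row1", "age"): (150, "30")}

result = compact([older, newer])
print(result[("row1", "name")])
```

After the merge only the cell with timestamp 200 remains for `("row1", "name")`; the older value exists nowhere in the output, which is why an old version cannot be relied on once compaction has run.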

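The composite-key layout that Dave Brosius suggests in the quoted reply (NaturalKey/VersionId) can be sketched in the same spirit. The class below is a hypothetical in-memory stand-in, not a Cassandra client: `VersionedStore` and its methods are invented for illustration, and the version id is assumed to be a simple per-key counter (a timestamp or TimeUUID would be another option).

```python
from typing import Dict, List, Optional, Tuple


class VersionedStore:
    """Toy store keyed by (natural_key, version_id).

    Every write creates a new key instead of overwriting the old one,
    so earlier versions stay readable.
    """

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, int], str] = {}

    def latest_version(self, natural_key: str) -> int:
        """Return the highest version id for the key, or 0 if none."""
        versions = [v for (k, v) in self._rows if k == natural_key]
        return max(versions, default=0)

    def put(self, natural_key: str, value: str) -> int:
        """Store value under the next version id and return that id."""
        version_id = self.latest_version(natural_key) + 1
        self._rows[(natural_key, version_id)] = value
        return version_id

    def get(self, natural_key: str,
            version_id: Optional[int] = None) -> Optional[str]:
        """Read a specific version, or the newest one by default."""
        if version_id is None:
            version_id = self.latest_version(natural_key)
        return self._rows.get((natural_key, version_id))

    def history(self, natural_key: str) -> List[Tuple[int, str]]:
        """Return all (version_id, value) pairs, oldest first."""
        return sorted((v, val) for (k, v), val in self._rows.items()
                      if k == natural_key)


store = VersionedStore()
store.put("user42", "first value")
store.put("user42", "second value")
print(store.get("user42"))      # newest version
print(store.get("user42", 1))   # explicitly requested old version
```

Because each version lives under its own key, nothing is ever overwritten, so flushing and compaction have no older cell to discard. The cost is that the application has to manage version ids and clean up old versions itself.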