incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Retrieving old data version for a given row
Date Thu, 24 May 2012 10:25:25 GMT
> Ok... it's really strange to me that Cassandra doesn't support data
> versioning, because all the other key-value databases support it (at least
> the ones I know).
You can design it into your data model if you need it.
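
To make that concrete, here is a minimal sketch of the composite column name idea (ColumnName/VersionId) suggested earlier in this thread. It is a toy in-memory model in Python, not a Cassandra client; the class VersionedRow and its methods are hypothetical names used only for illustration.

```python
class VersionedRow:
    """Toy stand-in for one row that keeps every version of a column.

    Hypothetical illustration only: it mimics composite column names of the
    form (column_name, version_id) with a plain dict, not a real client API.
    """

    def __init__(self):
        # (column_name, version_id) -> value, like a composite column name
        self._columns = {}

    def put(self, column, value, version_id):
        # Each version gets its own "column", so nothing is overwritten.
        self._columns[(column, version_id)] = value

    def get(self, column, version_id=None):
        # Default to the highest version id; otherwise fetch the one requested.
        versions = sorted(v for (c, v) in self._columns if c == column)
        if not versions:
            raise KeyError(column)
        chosen = versions[-1] if version_id is None else version_id
        return self._columns[(column, chosen)]

    def history(self, column):
        # All (version_id, value) pairs for the column, oldest first.
        return [(v, self._columns[(c, v)])
                for (c, v) in sorted(self._columns) if c == column]


row = VersionedRow()
row.put("email", "old@example.com", version_id=1)
row.put("email", "new@example.com", version_id=2)
print(row.get("email"))     # newest version
print(row.get("email", 1))  # an explicitly requested older version
```

Because the version id is part of the column name, an update is a new column rather than an overwrite, so older values stay readable until you delete them yourself.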
 
> I have one remaining question:
> - In the case that I have more than one SSTable on disk for the same
> column but with different data versions, is it possible to make a
> query to get the old version instead of the newest one?
No.
There is only ever one value for a column.
The "older" copies of the column in the SSTables are artefacts of the immutable on-disk structures.
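
As a rough illustration of why a read only ever sees one value, here is a simplified Python model of reconciling copies of a column by write timestamp. It is an assumption-laden sketch, not Cassandra's code: SSTables are modelled as plain dicts, read_column is a hypothetical name, and tombstones and the memtable are ignored.

```python
def read_column(sstables, row_key, column):
    """Return the value a read would see: the copy with the highest timestamp.

    Each "sstable" here is a dict: row_key -> {column: (timestamp, value)}.
    """
    newest = None
    for table in sstables:
        cell = table.get(row_key, {}).get(column)
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    return None if newest is None else newest[1]


# Two immutable files that both hold a copy of the same column.
older = {"row1": {"name": (100, "alice")}}
newer = {"row1": {"name": (200, "alicia")}}

print(read_column([older, newer], "row1", "name"))  # prints "alicia"
```

The older copy still exists in its file, but no query can select it, because reconciliation always picks the newest timestamp.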

If you want to see what's inside an SSTable, use bin/sstable2json.

Cheers
  
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/05/2012, at 9:42 PM, Felipe Schmidt wrote:

> Ok... it's really strange to me that Cassandra doesn't support data
> versioning, because all the other key-value databases support it (at least
> the ones I know).
> 
> I have one remaining question:
> - In the case that I have more than one SSTable on disk for the same
> column but with different data versions, is it possible to make a
> query to get the old version instead of the newest one?
> 
> Regards,
> Felipe Mathias Schmidt
> (Computer Science UFRGS, RS, Brazil)
> 
> 
> 
> 
> 2012/5/16 Dave Brosius <dbrosius@mebigfatguy.com>:
>> You're in for a world of hurt going down that rabbit hole. If you truly
>> want versioned data then you should think about changing your keying to
>> perhaps be a composite key where the key is of the form
>> 
>> NaturalKey/VersionId
>> 
>> Or if you want the versioning at the column level, use composite columns
>> with ColumnName/VersionId format
>> 
>> 
>> 
>> 
>> On 05/16/2012 10:16 AM, Felipe Schmidt wrote:
>>> 
>>> That was very helpful, thank you very much!
>>> 
>>> I still have some questions:
>>> - Is it possible to make Cassandra keep old values after flushing?
>>> The same question applies to the memtable, before flushing. It seems to me
>>> that when I update some tuple, the old data will be overwritten in the
>>> memtable, even before flushing.
>>> - Is it possible to scan values from the memtable, maybe using the
>>> so-called Thrift API? Using the client API I can only see the newest
>>> data version; I can't see what's really happening in the memtable.
>>> 
>>> I ask because what I'm trying to do is Change Data Capture for
>>> Cassandra, and the answers will define what kinds of approaches I'm able
>>> to use.
>>> 
>>> Thanks in advance.
>>> 
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>> 
>>> 
>>> 2012/5/14 aaron morton<aaron@thelastpickle.com>:
>>>> 
>>>> Cassandra does not provide access to multiple versions of the same
>>>> column.
>>>> It is essentially an implementation detail.
>>>> 
>>>> All mutations are written to the commit log in a binary format; see
>>>> o.a.c.db.RowMutation.getSerializedBuffer(). (If you want to tail it for
>>>> analysis you may want to change commitlog_sync in cassandra.yaml.)
>>>> 
>>>> Here is a post about looking at multiple versions of a column in an
>>>> SSTable: http://thelastpickle.com/2011/05/15/Deletes-and-Tombstones/
>>>> 
>>>> Remember that not all "versions" of a column are written to disk
>>>> (see http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/).
>>>> Also, compaction will compress multiple versions of the same column from
>>>> multiple files into a single version in a single file.
>>>> 
>>>> Hope that helps.
>>>> 
>>>> 
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>> 
>>>> On 14/05/2012, at 9:50 PM, Felipe Schmidt wrote:
>>>> 
>>>> Yes, I need this information just for academic purposes.
>>>> 
>>>> So, to read old data values, I tried to open the commit log using tail
>>>> -f and also the log file viewer of Ubuntu, but I cannot see much
>>>> information inside the log!
>>>> Is there any other way to open this log? I didn't find any Cassandra
>>>> API for this purpose.
>>>> 
>>>> Thanks everybody in advance.
>>>> 
>>>> Regards,
>>>> Felipe Mathias Schmidt
>>>> (Computer Science UFRGS, RS, Brazil)
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 2012/5/14 zhangcheng2<zhangcheng2@software.ict.ac.cn>:
>>>> 
>>>> After compaction, the old versions of the data will be gone!
>>>> 
>>>> 
>>>> ________________________________
>>>> zhangcheng2
>>>> 
>>>> From: Felipe Schmidt
>>>> Date: 2012-05-14 05:33
>>>> To: user
>>>> Subject: Retrieving old data version for a given row
>>>> 
>>>> I'm trying to retrieve an old data version for some row, but it does not
>>>> seem to be possible. I'm a beginner with Cassandra and the only approach I
>>>> know is looking at the SSTable in the storage folder, but if I insert
>>>> some column and right after insert another value to the same row,
>>>> after flushing, I only get the last value.
>>>> 
>>>> Is there any way to get the old data version? Obviously, before
>>>> compaction.
>>>> 
>>>> 
>>>> Regards,
>>>> Felipe Mathias Schmidt
>>>> (Computer Science UFRGS, RS, Brazil)
>>>> 
>> 

