cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Retrieving old data version for a given row
Date Mon, 04 Jun 2012 21:46:59 GMT
This is an old issue with sstable2json: https://issues.apache.org/jira/browse/CASSANDRA-4054

Internally the tombstone is associated with the o.a.c.db.AbstractColumnContainer; see
o.a.c.db.RowMutation.delete() for how a row-level delete works.
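
A minimal sketch of why a later insert is not lost, assuming the row-level tombstone keeps its own deletion timestamp on the container (even though sstable2json does not print it) and that a column is shadowed only when it is not newer than that timestamp. The names below (Row, marked_for_delete_at, live_columns) are hypothetical Python stand-ins for illustration, not Cassandra's API.

```python
from dataclasses import dataclass, field

NO_DELETE = float("-inf")  # hypothetical marker: the row was never deleted


@dataclass
class Row:
    """Toy stand-in for a column container that owns the row tombstone."""
    marked_for_delete_at: float = NO_DELETE
    columns: dict = field(default_factory=dict)  # name -> (value, timestamp)

    def insert(self, name, value, timestamp):
        # Keep only the newest write for each column name.
        current = self.columns.get(name)
        if current is None or timestamp > current[1]:
            self.columns[name] = (value, timestamp)

    def delete_row(self, timestamp):
        # The deletion time is stored on the row itself, not on any column.
        self.marked_for_delete_at = max(self.marked_for_delete_at, timestamp)

    def live_columns(self):
        # Assumption: a column survives only if it is newer than the row delete.
        return {
            name: value
            for name, (value, ts) in self.columns.items()
            if ts > self.marked_for_delete_at
        }


row = Row()
row.insert("email", "old@example.com", timestamp=10)
row.delete_row(timestamp=20)
row.insert("email", "new@example.com", timestamp=30)
print(row.live_columns())
```

In this model the insert at timestamp 30 stays visible because it is newer than the row delete at 20, while a write stamped before 20 would be hidden.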

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/06/2012, at 9:58 PM, Felipe Schmidt wrote:

> I was taking a look at the tombstones stored in the SSTable and I noticed that if I perform a
> key deletion, the tombstone doesn't have any timestamp; it looks like this:
> 	"key":[ ]
> In all the other deletion granularities the tombstone has a timestamp. Without this
> information it does not seem possible to resolve conflicts when an insertion for the same key
> is done after this deletion. If that happens, I think Cassandra will always delete the new
> information because of this tombstone.
> I'm using a single-node configuration and maybe that changes how tombstones look.
> 
> Thanks in advance.
> 
> Regards,
> Felipe Mathias Schmidt
> (Computer Science UFRGS, RS, Brazil)
> 
> 
> 
> 
> 
> 2012/5/31 aaron morton <aaron@thelastpickle.com>
>> -Is there any other way to extract the content of an SSTable, writing a
>> Java program for example instead of using sstable2json?
> Look at the code in sstable2json and copy it :)
> 
>> -I tried to get tombstones using the Thrift API, but it seems not to be
>> possible, is it right? When I try, the program throws an exception.
> No. 
> Tombstones are not returned from the API (see ColumnFamilyStore.getColumnFamily()).
> You can see them if you use sstable2json.
> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 30/05/2012, at 9:53 PM, Felipe Schmidt wrote:
> 
>> I have further questions:
>> -Is there any other way to extract the content of an SSTable, writing a
>> Java program for example instead of using sstable2json?
>> -I tried to get tombstones using the Thrift API, but it seems not to be
>> possible, is it right? When I try, the program throws an exception.
>> 
>> thanks in advance
>> 
>> Regards,
>> Felipe Mathias Schmidt
>> (Computer Science UFRGS, RS, Brazil)
>> 
>> 
>> 
>> 
>> 2012/5/24 aaron morton <aaron@thelastpickle.com>:
>>>> Ok... it's really strange to me that Cassandra doesn't support data
>>>> versioning, because all the other key-value databases support it (at least
>>>> those that I know).
>>> 
>>> You can design it into your data model if you need it.
>>> 
>>>> I have one remaining question:
>>>> -in the case that I have more than 1 SSTable on disk for the same
>>>> column but with different data versions, is it possible to make a
>>>> query to get the old version instead of the newest one?
>>> 
>>> No.
>>> There is only ever 1 value for a column.
>>> The "older" copies of the column in the SSTables are artefacts of immutable
>>> on-disk structures.
>>> If you want to see what's inside an SSTable use bin/sstable2json
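
The "only ever 1 value" point can be sketched as a last-write-wins merge, assuming each on-disk copy of a column carries a write timestamp and the read path keeps the copy with the highest one. The reconcile function and the sample fragments are hypothetical illustrations; tie-breaking and tombstones are left out.

```python
def reconcile(versions):
    """Return the single visible (value, timestamp) pair: the newest write."""
    return max(versions, key=lambda pair: pair[1])


# Hypothetical copies of one column found in three different SSTables.
sstable_fragments = [("v1", 100), ("v3", 300), ("v2", 200)]
print(reconcile(sstable_fragments))
```

The older pairs still sit in their files until compaction, but a read only ever surfaces the newest one.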
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 24/05/2012, at 9:42 PM, Felipe Schmidt wrote:
>>> 
>>> Ok... it's really strange to me that Cassandra doesn't support data
>>> versioning, because all the other key-value databases support it (at least
>>> those that I know).
>>> 
>>> I have one remaining question:
>>> -in the case that I have more than 1 SSTable on disk for the same
>>> column but with different data versions, is it possible to make a
>>> query to get the old version instead of the newest one?
>>> 
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>> 
>>> 
>>> 
>>> 
>>> 2012/5/16 Dave Brosius <dbrosius@mebigfatguy.com>:
>>> 
>>> You're in for a world of hurt going down that rabbit hole. If you truly
>>> want versioned data then you should think about changing your keying to
>>> perhaps be a composite key, where the key is of the form
>>> 
>>> NaturalKey/VersionId
>>> 
>>> Or if you want the versioning at the column level, use composite columns
>>> with the ColumnName/VersionId format.
>>> 
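
A minimal in-memory sketch of the NaturalKey/VersionId keying idea, assuming a simple per-key counter supplies the version id. VersionedStore and its methods are hypothetical names for illustration and do not talk to Cassandra; in a real data model the composite key would be built by the client before writing.

```python
class VersionedStore:
    """Toy model: every write lands under its own "natural/version" key."""

    def __init__(self):
        self._rows = {}          # "natural_key/version" -> value
        self._next_version = {}  # natural_key -> next unused version id

    def put(self, natural_key, value):
        version = self._next_version.get(natural_key, 0)
        self._next_version[natural_key] = version + 1
        self._rows[f"{natural_key}/{version}"] = value
        return version

    def get(self, natural_key, version=None):
        # With no version given, read the most recently written one.
        if version is None:
            version = self._next_version.get(natural_key, 0) - 1
        return self._rows[f"{natural_key}/{version}"]


store = VersionedStore()
store.put("user42", "alice@old.example")
store.put("user42", "alice@new.example")
print(store.get("user42"))
print(store.get("user42", version=0))
```

Because no write ever overwrites another key, old versions stay readable for as long as the application chooses to keep them. This sketch assumes natural keys contain no "/" character.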
>>> On 05/16/2012 10:16 AM, Felipe Schmidt wrote:
>>> 
>>> That was very helpful, thank you very much!
>>> 
>>> I still have some questions:
>>> -Is it possible to make Cassandra keep old value data after flushing?
>>> The same question applies to the memtable, before flushing. It seems to me
>>> that when I update some tuple, the old data is overwritten in the memtable,
>>> even before flushing.
>>> -Is it possible to scan values from the memtable, maybe using the
>>> so-called Thrift API? Using the client API I can only see the newest
>>> data version; I can't see what's really happening in the memtable.
>>> 
>>> I ask because what I'll try to do is Change Data Capture for
>>> Cassandra, and the answers will define what kinds of approaches I'm able
>>> to use.
>>> 
>>> Thanks in advance.
>>> 
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>> 
>>> 2012/5/14 aaron morton<aaron@thelastpickle.com>:
>>> 
>>> Cassandra does not provide access to multiple versions of the same
>>> column. It is essentially an implementation detail.
>>> 
>>> All mutations are written to the commit log in a binary format, see
>>> o.a.c.db.RowMutation.getSerializedBuffer(). (If you want to tail it for
>>> analysis you may want to change commitlog_sync in cassandra.yaml.)
>>> 
>>> Here is a post about looking at multiple versions of columns in an
>>> sstable: http://thelastpickle.com/2011/05/15/Deletes-and-Tombstones/
>>> 
>>> Remember that not all "versions" of a column are written to disk
>>> (see http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/).
>>> Also, compaction will compress multiple versions of the same column from
>>> multiple files into a single version in a single file.
>>> 
>>> Hope that helps.
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 14/05/2012, at 9:50 PM, Felipe Schmidt wrote:
>>> 
>>> Yes, I need this information just for academic purposes.
>>> 
>>> So, to read old data values, I tried to open the commit log using tail
>>> -f and also the log file viewer of Ubuntu, but I cannot see much
>>> information inside the log!
>>> Is there any other way to open this log? I didn't find any Cassandra
>>> API for this purpose.
>>> 
>>> Thanks everybody in advance.
>>> 
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>> 
>>> 2012/5/14 zhangcheng2<zhangcheng2@software.ict.ac.cn>:
>>> 
>>> After compaction, the old version of the data will be gone!
>>> 
>>> ________________________________
>>> zhangcheng2
>>> 
>>> From: Felipe Schmidt
>>> Date: 2012-05-14 05:33
>>> To: user
>>> Subject: Retrieving old data version for a given row
>>> 
>>> I'm trying to retrieve an old data version for some row, but it seems not
>>> to be possible. I'm a beginner with Cassandra and the only approach I
>>> know is looking at the SSTable in the storage folder, but if I insert
>>> some column and right after insert another value into the same row,
>>> after flushing, I only get the last value.
>>> 
>>> Is there any way to get the old data version? Obviously, before
>>> compaction.
>>> 
>>> Regards,
>>> Felipe Mathias Schmidt
>>> (Computer Science UFRGS, RS, Brazil)
>>> 
> 
> 

