cassandra-user mailing list archives

From "olek.stasiak@gmail.com" <olek.stas...@gmail.com>
Subject Re: Storage: upsert vs. delete + insert
Date Wed, 10 Sep 2014 19:25:02 GMT
I think so.
This is how I see it:
at the very beginning you have a line like this in the data file:
{key: [col_name, col_value, date_of_last_change]} // something similar,
I don't remember exactly

After the delete, this line is added:
{key: [col_name, last_col_value, date_of_delete, 'd']} // the 'd'
indicates that the field is deleted
After the insert, the following line is added:
{key: [col_name, col_value, date_of_insert]}
So a delete followed by an insert generates 2 lines in the data file.

After a plain insert (in fact an upsert) you will have only one line:
{key: [col_name, col_value, date_of_insert]}
So, to summarize: the second scenario adds only one line, the first adds two.
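
To make the comparison concrete, here is a minimal Python sketch of the model described above. It is a toy, not Cassandra's actual on-disk format: the `Cell` type and the `compact` function are hypothetical names, the tombstone is modelled with a `deleted` flag and no value, and compaction is simplified to "keep the newest entry per (key, column)". It ignores details such as how long real tombstones are retained.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One appended 'line' in the toy data file (hypothetical model)."""
    key: str
    column: str
    value: Optional[str]  # None for a delete marker
    timestamp: int
    deleted: bool = False


def compact(cells):
    """Simplified compaction: keep only the newest cell per (key, column)."""
    latest = {}
    for cell in cells:
        slot = (cell.key, cell.column)
        if slot not in latest or cell.timestamp > latest[slot].timestamp:
            latest[slot] = cell
    return list(latest.values())


# Scenario 1: delete, then insert -> two lines are appended.
delete_then_insert = [
    Cell("k1", "name", None, 2, deleted=True),  # delete marker
    Cell("k1", "name", "new", 3),               # fresh insert
]

# Scenario 2: plain insert (upsert) -> one line is appended.
upsert = [Cell("k1", "name", "new", 3)]

print(len(delete_then_insert), len(upsert))
print(compact(delete_then_insert) == compact(upsert))
```

In this toy model the first scenario appends two entries and the second appends one, and both reduce to the same single entry once `compact` runs, which matches the claim that the difference disappears after compaction.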
I hope my post is correct ;)
regards,
Olek

2014-09-10 18:56 GMT+02:00 Michal Budzyn <michalbudzyn@gmail.com>:
> Would the factor before compaction always be 2?
>
> On Wed, Sep 10, 2014 at 6:38 PM, olek.stasiak@gmail.com
> <olek.stasiak@gmail.com> wrote:
>>
>> IMHO, delete then insert will take twice as much disk space as a
>> single insert. But after compaction the difference will disappear.
>> This was true in versions prior to 2.0, and it should still work this
>> way. But maybe someone will correct me if I'm wrong.
>> Cheers,
>> Olek
>>
>> 2014-09-10 18:30 GMT+02:00 Michal Budzyn <michalbudzyn@gmail.com>:
>> > One insert would be much better, e.g. for performance and network
>> > latency.
>> > I wanted to know if there is a significant difference (apart from
>> > the additional commit log entry) in the storage used between these
>> > 2 use cases.
>> >
>
>
