Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of tivv00@gmail.com designates
 209.85.220.170 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <9B65FDA2-7975-495A-B87F-5D0D013C9157@thelastpickle.com>
References: <11C8B93C-00D6-4871-9050-1D710BA298C1@venarc.com>
	<9B65FDA2-7975-495A-B87F-5D0D013C9157@thelastpickle.com>
Date: Thu, 4 Apr 2013 22:53:06 +0300
Message-ID: 
 <CABWW-d2JX73RQuRy9BRgM3Eyq32grZ6Yy4y3QSBR75KttJkmPg@mail.gmail.com>
Subject: Re: Any plans for read-before-write update operations in CQL3?
From: Vitalii Tymchyshyn <tivv00@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=20cf3079c100bfc83d04d98e518f

--20cf3079c100bfc83d04d98e518f
Content-Type: text/plain; charset=KOI8-U
Content-Transfer-Encoding: quoted-printable

Well, a scheme has just come to mind that looks interesting, so I want
to share it:
1) Actions are introduced. Each action receives a unique ID at the
coordinator node. A client can request a block of IDs beforehand to make
actions idempotent.
2) Actions are applied to a given row+column value. A special column
family type that supports actions may need to be created.
3) Actions are stored for the grace period so that repair works
correctly.
4) Along with all the actions for the grace period, the old value, the
current value and the old value hash are stored.
5) The old value is the value without the currently stored actions; the
current value has all currently stored actions applied.
6) The old value hash holds the number of actions applied, the time of
the last action applied and a hash of all the applied action IDs (only
actions applied to the old value, of course).
7) The current value is updated on read, so there can be actions that
are not applied yet. On read, any unapplied actions are applied and the
information about the current value and applied actions is updated.
8) Actions may or may not rely on order. If actions rely on order and
an update needs to apply an out-of-order action, the value is
recalculated starting from the old value.
9) During repair, the highest old value (by number of actions applied,
then by lowest time) is selected. All actions older than or of the same
time as the old value are then dropped as already applied. Newer ones
are merged into a union set.
10) During compaction, the old value is moved forward to the time "now
minus grace period".
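
To make steps 1-8 more concrete, here is a minimal Python sketch of a
single cell. Nothing in it is Cassandra API: the names Action, Cell,
write and read are made up for illustration, values are plain integers,
and ordering by (timestamp, ID) is only an assumption about what "rely
on order" would mean.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class Action:
    action_id: int               # unique ID, e.g. from the coordinator
    timestamp: int               # when the action was issued
    apply: Callable[[int], int]  # the read-modify-write step itself


@dataclass
class Cell:
    old_value: int = 0           # value without the stored actions
    actions: Dict[int, Action] = field(default_factory=dict)
    applied: Set[int] = field(default_factory=set)
    current_value: int = field(default=0, init=False)

    def __post_init__(self) -> None:
        # With no stored actions the current value is the old value.
        self.current_value = self.old_value

    def write(self, action: Action) -> None:
        # Keying by ID makes a retried write a no-op (idempotent).
        self.actions.setdefault(action.action_id, action)

    def read(self) -> int:
        unapplied = [a for a in self.actions.values()
                     if a.action_id not in self.applied]
        if unapplied:
            # Order-dependent case: recompute from old_value in
            # (timestamp, ID) order rather than patching the cache.
            value = self.old_value
            for a in sorted(self.actions.values(),
                            key=lambda a: (a.timestamp, a.action_id)):
                value = a.apply(value)
            self.current_value = value
            self.applied = set(self.actions)
        return self.current_value


cell = Cell(old_value=10)
cell.write(Action(1, 100, lambda v: v + 1))
cell.write(Action(1, 100, lambda v: v + 1))  # retry, ignored
print(cell.read())  # 11
cell.write(Action(2, 50, lambda v: v * 2))   # arrives late
print(cell.read())  # 21, recomputed as (10 * 2) + 1
```

For actions that do not rely on order, the full recalculation could be
replaced by applying only the unapplied actions to the cached current
value.
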
The scheme looks solid. The downside is that all the values for the grace
period must be stored. Maybe it should be combined with some automatic
confirmation mechanism: after receiving acks for all the writes, the
coordinator does a second round notifying that the action is fully
written. This should work for hinted handoff too. Then the old value can
be advanced to the last acked action.
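
The repair rule from step 9 and the old value hash from step 6 could be
sketched roughly as follows. This is only an illustration under
assumptions: ReplicaState, ids_digest and repair are hypothetical names,
SHA-256 over the sorted IDs is just one possible hash, and pending
actions are reduced to an ID-to-timestamp map.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class ReplicaState:
    old_value: int
    applied_count: int      # number of actions folded into old_value
    last_applied_time: int  # time of the newest folded action
    applied_hash: str       # digest of the folded action IDs
    pending: Dict[int, int] = field(default_factory=dict)  # ID -> time


def ids_digest(action_ids: Iterable[int]) -> str:
    """Order-independent digest of action IDs (assumed scheme)."""
    h = hashlib.sha256()
    for action_id in sorted(action_ids):
        h.update(str(action_id).encode() + b",")
    return h.hexdigest()


def repair(replicas: List[ReplicaState]) -> ReplicaState:
    # Highest old value: most actions applied, then lowest time.
    winner = max(replicas,
                 key=lambda r: (r.applied_count, -r.last_applied_time))
    merged: Dict[int, int] = {}
    for r in replicas:
        for action_id, ts in r.pending.items():
            # Actions at or before the winner's time are treated as
            # already applied; newer ones go into the union set.
            if ts > winner.last_applied_time:
                merged[action_id] = ts
    return ReplicaState(winner.old_value, winner.applied_count,
                        winner.last_applied_time, winner.applied_hash,
                        merged)


a = ReplicaState(10, 2, 100, ids_digest([1, 2]), {3: 150})
b = ReplicaState(7, 1, 50, ids_digest([1]), {2: 100, 4: 160})
result = repair([a, b])
print(result.old_value, sorted(result.pending))  # 10 [3, 4]
```

The merge itself only uses the count and the time, as in step 9. The
digest is carried along so that replicas claiming the same count and
time could be compared, which is an assumption about how the hash would
be used.
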

On 4 Apr 2013 04:59, "aaron morton" <aaron@thelastpickle.com> wrote:
>
> I would guess not.
>
>> I know this goes against keeping updates idempotent,
>
> There are also issues with consistency, i.e. is the read local or does
> it happen at the CL level?
> And it makes things go slower.
>
>> We currently do things like this in client code, but it would be great
>> to be able to do this on the server side to minimize the chance of race
>> conditions.
>
> Sometimes you can write the plus one into a new column and then apply
> the changes in the reading client thread.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/04/2013, at 12:48 AM, Drew Kutcharian <drew@venarc.com> wrote:
>
>> Hi Guys,
>>
>> Are there any short/long term plans to support UPDATE operations that
>> require read-before-write, such as increment on a numeric non-counter
>> column?
>> i.e.
>>
>> UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN =3D
>>   NON_COUNTER_NUMERIC_COLUMN + 1;
>>
>> UPDATE CF SET STRING_COLUMN =3D STRING_COLUMN + "postfix";
>>
>> etc.
>>
>> I know this goes against keeping updates idempotent, but there are
>> times you need to do these kinds of operations. We currently do things
>> like this in client code, but it would be great to be able to do this on
>> the server side to minimize the chance of race conditions.
>>
>> -- Drew
>
>

--20cf3079c100bfc83d04d98e518f--