cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Carlos Alonso <i...@mrcalonso.com>
Subject Re: Strategies for avoiding corrupted duplicate data?
Date Thu, 17 Mar 2016 08:48:11 GMT
Hi Max.

The best way to achieve this is by actually having idempotent operations.
So, in your particular case, is there any chance for a user to update his
email twice at the same time with different values? Given that Cassandra
upserts information, even two inserts in the same table with the same
values will lead to just one.

If you cannot guarantee that, then a good way to achieve this I think is by
read-before-write + optimistic locking. i.e. Adding a 'version' field on
the record and running a LWT.

For example:

UPDATE users_by_email set email = 'new_email', version = 2 where ... IF
version = 1 (version number previously read, of course)

Hope this helps.

Carlos Alonso | Software Engineer | @calonso <https://twitter.com/calonso>

On 17 March 2016 at 07:22, Max C <mc_cassandra@core43.com> wrote:

> Hello,
>
> What are your best practices for avoiding collisions when updating
> duplicate (derived) data?  For example if I have tables like this:
>
> users — (username, email, phone) — PK (username)
> users_by_email — (username, email, phone) — PK (email)
> users_by_phone — (username, email, phone) — PK ((phone), username)
>
> … and I want to change the user’s email address ...
>
> What is your strategy for ensuring that two writers don’t try to update
> the same records at the same time?
>
> Some sort of lock?
>
> 1) Do you have a global “locks” table in Cassandra, and use a LWT to lock
> it?
>
> insert into locks (what, locked_by) values ('users:maxc',
> ’server1_pid1234') if not exists;
> # Verify insert succeeded
> # Now that I have the lock, grab the latest
> select * from users where username=‘maxc’;
> begin batch
> # … apply changes ...
> apply batch;
> # release lock
> delete from locks where what=‘users:maxc’;
>
> 2) Do you add a “locked_by” column to the master (“users”) table, and then
> use a LWT to lock it?
>
> update users set locked_by=‘server1_pid1234’’ where username=“maxc” if
> locked_by = null;
> # Verify update succeeded
> # Now that I have the lock, grab the latest
> select * from users where username=‘maxc’;
> begin batch
> # … apply changes ...
> apply batch;
> # release lock
> update users set locked_by=null where username=‘maxc’;
>
> 3) Do you use something outside of Cassandra to manage the locks?  Zoo
> keeper?
>
> ## Acquire external lock ##
> # Now that I have the lock, grab the latest
> select * from users where username=‘maxc’;
> begin batch
> # … apply changes ...
> apply batch;
> ## release external lock ##
>
> Or is there some other way to do this that I’m totally missing??
> Materialized views in 3.x, I suppose.  Other ideas?
>
> Thanks!
>
> - Max
>

Mime
View raw message