cassandra-user mailing list archives

From Steven A Robenalt <srobe...@stanford.edu>
Subject Re: Read/Write consistency issue
Date Fri, 10 Jan 2014 23:53:23 GMT
Hi Robert,

Just to clarify a bit, there's nothing inherently wrong with a
read-modify-write cycle as you would use for a document store. The
read-before-write antipattern refers to depending on a read immediately
before a write, as was being done in the original post. Generally, such a
read is done either (a) to verify that the underlying record hasn't changed
immediately before updating or (b) to merge updated parts of the document
with parts that were excluded from the original read. Both can be
problematic if concurrent modifications are being performed, or if the
operations required to perform the update are executed concurrently.
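
A minimal Python sketch of the concurrent-modification problem, using a
plain dict as a stand-in for the row. The store, the helper names, and the
document fields are all hypothetical and only illustrate the interleaving;
nothing here is Cassandra API.

```python
# Hypothetical in-memory "document store": one dict entry stands in for a row.
store = {"doc1": {"title": "Draft", "body": "Hello"}}

def read(key):
    # Return a copy, as each client would hold its own snapshot of the row.
    return dict(store[key])

def write(key, doc):
    # Blind overwrite of the whole document: the last writer wins.
    store[key] = dict(doc)

# Two clients read the same version before either one writes.
a = read("doc1")
b = read("doc1")

a["title"] = "Final"        # client A edits the title
b["body"] = "Hello world"   # client B edits the body

write("doc1", a)
write("doc1", b)  # B's stale copy of the title replaces A's change

lost_update = store["doc1"]["title"] == "Draft"
print(store["doc1"], lost_update)
```

Neither client did anything wrong in isolation; the second write simply
carried an old copy of the field the first client had changed.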

The original post was problematic for a different reason - updating the
same column very rapidly with the read-before-write antipattern built into
the update. This fails occasionally because the database is not yet
consistent by the time the next read is performed, so that read can return
a stale value. The result is an update that mostly, but not always,
succeeds.
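
The alternative suggested earlier in this thread, writing each update
independently and aggregating on read, can be sketched roughly as follows.
This is an in-memory Python illustration under assumed names (`events`,
`record`, `total`, and the "hits" series are all made up for the example),
not a Cassandra schema.

```python
import uuid

# Hypothetical append-only log: one entry per update, keyed by
# (series name, unique event id). Nothing is ever read before a write.
events = {}

def record(name, delta, event_id=None):
    # Writing the same event_id twice stores the same entry again,
    # so a retried write is harmless (idempotent).
    event_id = event_id or uuid.uuid4()
    events[(name, event_id)] = delta
    return event_id

def total(name):
    # Aggregate on read: sum every entry recorded for this series.
    return sum(delta for (n, _), delta in events.items() if n == name)

first = record("hits", 1)
record("hits", 1)
record("hits", 1, event_id=first)  # retry of the first write, not a new hit

print(total("hits"))
```

Because no write depends on a previous read, it does not matter whether
the earlier writes are visible yet when the next one is issued.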

Using Lightweight Transactions and BatchStatements can address many of
these problems in a normal OLTP environment such as a document store, and
is unlikely to have a negative impact on performance there. Rapidly
updated time series data is a different animal, and requires its own
strategies and patterns.
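
For the document-store case, the conditional-write idea behind a
lightweight transaction (in CQL, an UPDATE with an IF clause) looks roughly
like the Python sketch below. It is an in-memory model only: the lock
stands in for the atomicity the database would provide, and the store, the
`version` field, and `update_if_version` are assumptions made for
illustration.

```python
import threading
import uuid

_lock = threading.Lock()  # stands in for the database's own atomicity
store = {"doc1": {"text": "v1", "version": uuid.uuid4()}}

def read(key):
    with _lock:
        return dict(store[key])

def update_if_version(key, new_text, expected_version):
    # Apply the write only if the row still has the version we read.
    # Returns True if applied, False if the row is gone or has changed.
    with _lock:
        row = store.get(key)
        if row is None or row["version"] != expected_version:
            return False
        store[key] = {"text": new_text, "version": uuid.uuid4()}
        return True

# Two clients read the same version, then both try to write.
a = read("doc1")
b = read("doc1")
a_applied = update_if_version("doc1", "A's edit", a["version"])
b_applied = update_if_version("doc1", "B's edit", b["version"])

# B was rejected, so it re-reads, reapplies its change, and tries again.
b = read("doc1")
b_retry = update_if_version("doc1", b["text"] + " + B's edit", b["version"])

print(a_applied, b_applied, b_retry, store["doc1"]["text"])
```

The rejected writer finds out that the document changed underneath it and
can merge or retry, instead of silently overwriting the other edit.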

Steve





On Fri, Jan 10, 2014 at 3:24 PM, Todd Carrico <Todd.Carrico@match.com> wrote:

>  I’ve solved this for other systems, and it might work here.
>
>
>
> Add a Guid as a field to the record.
>
> When you update the document, check to make sure the Guid hasn’t changed
> since you read it.  If the Guid is the same, go ahead and save the document
> along with a new Guid.
>
>
>
> This keeps you from locking the document if you just want to read it while
> still keeping you from overwriting someone else’s changes.  In this other
> system, it was easy enough to add the guid check as part of the where
> clause:
>
>
>
> Update doc
>    Set Text = ?, Guid = ?
>  Where key = ?
>    And Guid = ?
>
>
>
> If the row failed to update, then it was removed, or the Guids didn’t
> match.
>
>
>
> Not sure if C* has some magic that can make this better, timestamp should
> do the same thing I think.
>
>
>
> “There are a multitude of methods whereby a feline might be divested of
> its epidermal layer”..
>
>
>
> *From:* Tupshin Harper [mailto:tupshin@tupshin.com]
> *Sent:* Friday, January 10, 2014 5:13 PM
>
> *To:* user@cassandra.apache.org
> *Subject:* Re: Read/Write consistency issue
>
>
>
> It is bad because of the risk of concurrent modifications. If you don't
> have some kind of global lock on the document/row, then 2 readers might
> read version A, reader 1 writes version B based on A, and reader 2 writes
> version C based on A, overwriting the changes in B. This is *inherent* to
> the notion of distributed systems and multiple writers, and can only be
> fixed by:
>
> 1) Having a global lock, either in the form of a DB lock (CAS for
> Cassandra 2.0 and above), or some higher level business mechanism that is
> ensuring only one concurrent reader/writer for a given document
>
> 2) Idempotent writes by appending at write and aggregating on read. For
> time-series and possibly counter style information, this is often the ideal
> strategy, but usually not so good for documents.
>
> For the counters scenario, idempotent writes, or the rewrite of counters
> (which uses idempotent writes behind the scenes), are probably good
> solutions.
>
> Concurrent editing of documents, on the other hand, is almost the ideal
> scenario for lightweight transactions.
>
> -Tupshin
>
>
>
> On Fri, Jan 10, 2014 at 5:51 PM, Robert Wille <rwille@fold3.com> wrote:
>
>  Interested in knowing more on why read-before-write is an anti-pattern.
> In the next month or so, I intend to use Cassandra as a doc store. One very
> common operation will be to read the document, make a change, and write it
> back. These would be interactive users modifying their own documents, so
> rapid repeated writing is not an issue. Why would this be bad?
>
>
>
> Robert
>
>
>
> *From: *Steven A Robenalt <srobenal@stanford.edu>
> *Reply-To: *<user@cassandra.apache.org>
> *Date: *Friday, January 10, 2014 at 3:41 PM
>
>
> *To: *<user@cassandra.apache.org>
> *Subject: *Re: Read/Write consistency issue
>
>
>
> My understanding is that it's generally a Cassandra anti-pattern to do
> read-before-write in any case, not just because of this issue. I'd agree
> with Robert's suggestion earlier in this thread of writing each update
> independently and aggregating on read.
>
>
>
> Steve
>
>
>
>
>



-- 
Steve Robenalt
Software Architect
HighWire | Stanford University
425 Broadway St, Redwood City, CA 94063

srobenal@stanford.edu
http://highwire.stanford.edu
