cassandra-user mailing list archives

From Jérôme Verstrynge <>
Subject Re: What happens if there is a collision?
Date Fri, 22 Oct 2010 01:11:35 GMT
On 22/10/2010 2:27, Nicholas Knight wrote:
> On Oct 22, 2010, at 7:41 AM, Jérôme Verstrynge wrote:
>> Let's imagine that A initiates its column write at: 334450 ms with 'AAA' and timestamp
334450 ms
>> Let's imagine that E initiates its column write at: 334451 ms with 'ZZZ' and timestamp
334450 ms
>> (E is the latest write)
>> Let's imagine that A reaches C at 334455 ms and performs its write.
>> Let's imagine that E reaches C at 334456 ms and attempts to perform its write. It
will lose the timestamp-tie ('AAA' is greater than 'ZZZ').
> How is this any different from E's perspective than if A had come along a moment later
with timestamp 334452?
If this results in only one entry, then I am happy. If this results in 
two entries (334450 and 334452), then the situation is different and 
does not correspond to my argument.
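
To make the "one entry or two" question concrete, here is a minimal sketch of a single column that keeps at most one (value, timestamp) pair. It is a toy model, not Cassandra code: `Cell`, `Column` and `thread_tie_break` are hypothetical names, and the tie-break rule simply encodes this thread's assumption that 'AAA' wins over 'ZZZ' on equal timestamps.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # supplied by the client, as in the scenario above


def thread_tie_break(current: str, incoming: str) -> str:
    # Assumption from the thread: on equal timestamps 'AAA' beats 'ZZZ'.
    # Modeled here as "the lexically smaller string wins".
    return min(current, incoming)


class Column:
    """Hypothetical column holding at most one cell."""

    def __init__(self, tie_break: Callable[[str, str], str] = thread_tie_break):
        self.cell: Optional[Cell] = None
        self.tie_break = tie_break

    def write(self, value: str, timestamp: int) -> None:
        current = self.cell
        if current is None or timestamp > current.timestamp:
            # Newer timestamp replaces whatever was stored.
            self.cell = Cell(value, timestamp)
        elif timestamp == current.timestamp:
            # Timestamp tie: the tie-break rule picks the surviving value.
            self.cell = Cell(self.tie_break(current.value, value), timestamp)
        # Older timestamp: the write is silently dropped.


# Tie scenario: A reaches the node first, E arrives with the same timestamp.
tie = Column()
tie.write("AAA", 334450)
tie.write("ZZZ", 334450)
print(tie.cell)  # Cell(value='AAA', timestamp=334450)

# Nicholas's variant: A comes along a moment later with timestamp 334452.
later = Column()
later.write("ZZZ", 334450)
later.write("AAA", 334452)
print(later.cell)  # Cell(value='AAA', timestamp=334452)
```

In this model both scenarios end with a single entry; the only difference is the timestamp attached to the surviving value.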

When I read, the column 
section explicitly says: "All values are supplied by the client, 
including the 'timestamp'."

Hence, nothing in this documentation explicitly guarantees that only 
one record is created.

> What you describe is an application in *desperate* need of either a serious redesign,
or a distributed locking mechanism.
> This really isn't a Cassandra-specific problem, Cassandra just happens to be the distributed
storage system at issue. Any such system without a locking mechanism will present some form
of this problem, and the answer will be the same: Avoid it in the application design, or incorporate
a locking mechanism into the application.
I agree about the problem not being specific to Cassandra. I have 
nothing against Cassandra. In fact, I am fascinated by it and am 
considering using it in my own projects.

>> If there is a timestamp-tie, then the context becomes uncertain for E, out of the
>> If application E can't be sure about what has been saved in Cassandra, it cannot
rely on what it has in memory. It is a vicious circle. It can't anticipate A's potential
actions on the column either.
> And how is this different from E's data being overwritten with a later timestamp? Either
way, what E thinks is in Cassandra really isn't.
Well, E knows that it can't predict the value for future timestamp 
values coming from other nodes. Fine. What I am worried about is that it 
can't predict the value for its own timestamp.

> If you need to make sure you have consistency at this level, you *need* a locking mechanism.
>> This is unusual for any application, but maybe this is the price to pay for using
Cassandra. Fair enough.
> Hardly. Any non-serial application that doesn't use some form of locking has this exact
same problem at all levels of storage, possibly even in its internal variables.
I have not argued against locking as a potential solution. I am only 
suggesting something lighter.

>> If E is not informed of the timestamp tie, then it is left alone in the dark. Hence,
this is why I say Cassandra is not deterministic to E. The result of a write is potentially
non-deterministic in what it actually performs.
> Cassandra is deterministic for a given input. What you're saying is you aren't properly
controlling the input that your application is giving it.
You are making my point (lol). No matter what an application writes, it 
should re-read its own writes for a given timestamp to get determinism 
when other application instances are writing to the same 'table'.

>> If E was aware that it lost a timestamp-tie, it would know that there is a possible
gap between its internal memory representation and what it tried to save into Cassandra. That
is, EVEN if there is no further write on that same column (or, in other words, regardless
of any potential subsequent races).
> What is the significance of this?
If you know there is no timestamp collision, then you know you don't 
need to re-read for determinism. Otherwise you should. In a situation 
where you can't know, you should automatically re-read, which is 
expensive (or implement a locking mechanism).
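
The re-read described above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not a Cassandra client: `Store`, `put` and `write_and_verify` are hypothetical names, and the tie rule (lexically smaller string wins) only mirrors this thread's scenario in which 'AAA' beats 'ZZZ'.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for one column family: key -> (value, timestamp).
Store = Dict[str, Tuple[str, int]]


def put(store: Store, key: str, value: str, timestamp: int) -> None:
    """Timestamp-ordered write with a value-based tie-break (assumed rule)."""
    current = store.get(key)
    if current is None or timestamp > current[1]:
        store[key] = (value, timestamp)
    elif timestamp == current[1]:
        # Assumption matching the thread: the smaller string survives a tie.
        store[key] = (min(current[0], value), timestamp)
    # An older timestamp leaves the stored value untouched.


def write_and_verify(store: Store, key: str, value: str, timestamp: int) -> bool:
    """Write, then re-read; True only if exactly our (value, timestamp) is stored."""
    put(store, key, value, timestamp)
    return store.get(key) == (value, timestamp)


store: Store = {}
a_ok = write_and_verify(store, "col", "AAA", 334450)
e_ok = write_and_verify(store, "col", "ZZZ", 334450)
print(a_ok, e_ok)  # True False
```

The check costs one extra read per write, and it only describes the column at the moment of the read; a later write can still replace the value.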

>> If E was informed it lost a timestamp-tie, it could re-read the column (and let's
assume that there is no further write in between, but this does not change anything to the
argument). It could spot that its write for timestamp value 334450 ms failed, and also the
reason why ('AAA' greater than 'ZZZ'). It could perform a new write, which could eventually
result in another timestamp-tie, but at least it would be informed about it too... It would
have a safety net.
> To what end? A and E would apparently get into some sort of never-ending fight. The application
as described is broken and needs to be fixed.
No, there would be no fight, since E would know it can't win: it holds 
the losing value 'ZZZ' for the given timestamp.
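
A rough sketch of the tie-aware acknowledgement suggested in this thread. It is a proposal only, not an existing Cassandra API: `WriteAck`, `tie_lost` and `Node` are hypothetical names, and the tie rule (lexically smaller string wins) is an assumption that reproduces the scenario where 'AAA' beats 'ZZZ'.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class WriteAck:
    completed: bool  # the node processed the write request
    tie_lost: bool   # hypothetical flag: same timestamp, but another value was kept


class Node:
    """Hypothetical storage node keeping one (value, timestamp) per key."""

    def __init__(self) -> None:
        self.columns: Dict[str, Tuple[str, int]] = {}

    def write(self, key: str, value: str, timestamp: int) -> WriteAck:
        current = self.columns.get(key)
        if current is None or timestamp > current[1]:
            self.columns[key] = (value, timestamp)
            return WriteAck(completed=True, tie_lost=False)
        if timestamp == current[1] and value != current[0]:
            # Assumed tie rule from the thread: the smaller string survives.
            winner = min(current[0], value)
            self.columns[key] = (winner, timestamp)
            return WriteAck(completed=True, tie_lost=(winner != value))
        # Older timestamp, or an identical repeat of the stored write.
        return WriteAck(completed=True, tie_lost=False)


node = Node()
ack_a = node.write("col", "AAA", 334450)
ack_e = node.write("col", "ZZZ", 334450)
print(ack_a.tie_lost, ack_e.tie_lost)  # False True

# E sees tie_lost and knows a retry with the same timestamp cannot win,
# so it re-reads or gives up instead of fighting A.
```

In this sketch the flag is only set for a writer that arrives after the winning value. A writer whose value is replaced by a later same-timestamp write is not told, so the flag narrows the uncertainty rather than removing it.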

>> The case I am trying to cover is the case where the context for application E becomes
invalid because of a successful write call to Cassandra without registration of 'ZZZ'. How
can Cassandra call it a successful write, when in fact, it isn't for application E? I believe
Cassandra should notify application E one way or another. This is why I mentioned an extra
timestamp-tie flag in the write ACK sent by nodes back to node E.
> Here's part of the problem. You're seeing E as a distinct application from A which can
behave completely independently. You need to stop thinking like that. It leads to broken architectures.
> Even if the E and A processes come from entirely different code bases, you need to start
by thinking of them as one application. That application is broken.
I am not going to argue this, because it is not related to my argument. 
I mean no offense by saying this.

>> The subsequent question I have is:
>> If 'value breaks timestamp-tie', how does Cassandra behave in case of updates? If
there is a column with value 'AAA' at 334450 ms and an application explicitly wants to update
this value to 'ZZZ' for 334450 ms, it seems like the timestamp-tie will prevent that. Hence,
the update/mutation would be non-deterministic to E. It seems like one should first delete the
existing record and write a new one (and that could lead to race conditions and timestamp-ties
> You need a locking mechanism. Timestamps aren't the droids you're looking for.
In this case, I do agree that explicit updates on a given timestamp 
can't be achieved without locks.

>> I think this should be documented, because engineers will hit that 'local' non-deterministic
issue for sure if two instances of their applications perform 'completed writes' in the same
column family. Completed does not mean successful, even with quorum (or ALL). They ought to
know it.
> I'm honestly not sure why they wouldn't. One need only perform a very cursory investigation
of Cassandra to realize that addition of a locking mechanism is necessary for many applications,
such as the one described here.
Again, I am not saying locks are not a solution. I was just suggesting a 
lighter solution for the issue I was raising. Implementing locks in 
a Cassandra-like system is tricky. The solutions proposed so far are 
costly and heavy.
> -NK
Thanks for your answer.