cassandra-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Benefit of LOCAL_SERIAL consistency
Date Fri, 09 Dec 2016 00:35:00 GMT
On Thu, Dec 8, 2016 at 5:10 AM, Sylvain Lebresne <sylvain@datastax.com>
wrote:

> > The reason you don't want to use SERIAL in multi-DC clusters
>
> I'm not a fan of blanket statements like that. There is a high cost to
> SERIAL consistency in multi-DC setups, but if you *need* global
> linearizability, then you have no choice and the latency may be acceptable
> for your use case. Take the example of using LWT to ensure no 2 users
> create accounts with the same name in your system: it's something you
> don't want to screw up, but it's also something for which a high-ish
> latency is probably acceptable. I don't think users would get super pissed
> off because registering a new account on some service takes 500ms.
>
> So yes it's costly, as are most things that willingly depend on cross-DC
> latency, but I don't think that means it's never ever useful.
>
> > So, I am not sure about what is the good use case for LOCAL_SERIAL.
>
> Well, a good use case is when you're ok with operations within a
> datacenter being linearizable, but can accept 2 operations in different
> datacenters not being so. Imagine a service that pins a given user to a DC
> on login for different reasons: that service might be fine using
> LOCAL_SERIAL for operations confined to a given user session since it
> knows it's DC-local.
>
> So I think both SERIAL and LOCAL_SERIAL have their uses, though we
> absolutely agree they are not meant to be used together. And it's
> certainly worth trying to design your system in a way that makes sure
> LOCAL_SERIAL is enough for you, if you can, since SERIAL is pretty costly.
> But that doesn't mean there aren't cases where you care more about global
> linearizability than latency: engineering is all about trade-offs.
>
> > I am not sure what the state of this is anymore, but I was under the
> > impression the linearizability of LWT was in question. I never heard it
> > specifically addressed.
>
> That's a pretty vague statement to make, let's not get into FUD. You
> "might" be thinking of a fairly old blog post by Aphyr that tested LWT in
> their very early days, and there were bugs indeed, but those were fixed a
> long time ago. Since then, his tests and much more were performed
> (http://www.datastax.com/dev/blog/testing-apache-cassandra-with-jepsen)
> and no problem with linearizability that I know of has been found. Don't
> get me wrong, any code can have subtle bugs and not finding problems
> doesn't guarantee there isn't one, but if someone has demonstrated legit
> problems with the linearizability of LWT, it's unknown to me and I'm
> watching this pretty carefully.
>
> I'll note to be complete that I'm not pretending the LWT implementation is
> perfect, it's not (it's slow for one), and using them correctly can be more
> challenging than it may sound at first (mostly because you need to handle
> query timeouts properly and that's not always simple, sometimes requiring
> a more complex data model than you'd want), but those are not breaks of
> linearizability.
>
> > https://issues.apache.org/jira/browse/CASSANDRA-6106
>
> That ticket has nothing to do with LWT. In fact, LWT is the one mechanism
> in Cassandra where this ticket has no impact whatsoever, because the whole
> point of the mechanism is to ensure timestamps are assigned in a
> collision-free manner.
>
>
> On Thu, Dec 8, 2016 at 8:32 AM, Hiroyuki Yamada <mogwaing@gmail.com>
> wrote:
>
>> Hi DuyHai,
>>
>> Thank you for the comments.
>> Yes, that's exactly what I mean.
>> (Your comment is very helpful to support my opinion.)
>>
>> As you said, SERIAL with multi-DCs incurs a latency increase,
>> but it's a trade-off between latency and high availability because one
>> DC can be down from a disaster.
>> I don't think there is any way to achieve global linearizability
>> without a latency increase, right?
>>
>> > Edward
>> Thank you for the ticket.
>> I'll read it through.
>>
>> Thanks,
>> Hiro
>>
>> On Thu, Dec 8, 2016 at 12:01 AM, Edward Capriolo <edlinuxguru@gmail.com>
>> wrote:
>> >
>> >
>> > On Wed, Dec 7, 2016 at 8:25 AM, DuyHai Doan <doanduyhai@gmail.com>
>> wrote:
>> >>
>> >> The reason you don't want to use SERIAL in multi-DC clusters is the
>> >> prohibitive cost of lightweight transactions (in terms of latency),
>> >> especially if your data centers are separated by continents. A ping
>> >> from London to New York takes 52ms just by speed of light in optic
>> >> cable. Since a lightweight transaction involves 4 network round-trips,
>> >> it means at least 200ms just for raw network transfer, not even taking
>> >> into account the cost of processing the operation...
>> >>
>> >> You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
>> >> LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL
>> >> guarantees you linearizability across multiple DCs.
>> >>
>> >> If I have 3 DCs with RF = 3 each (total 9 replicas) and I did an
>> >> INSERT IF NOT EXISTS with LOCAL_SERIAL in DC1, then it's possible that
>> >> a subsequent INSERT IF NOT EXISTS on the same record succeeds when
>> >> using SERIAL, because SERIAL on 9 replicas = at least 5 replicas. Those
>> >> 5 replicas which respond can come from DC2 and DC3 and thus may not
>> >> have applied the previous INSERT yet...
>> >>
>> >> On Wed, Dec 7, 2016 at 2:14 PM, Hiroyuki Yamada <mogwaing@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have been using lightweight transactions for several months now and
>> >>> am wondering what the benefit is of having the LOCAL_SERIAL serial
>> >>> consistency level.
>> >>>
>> >>> With SERIAL, it achieves global linearizability,
>> >>> but with LOCAL_SERIAL, it only achieves DC-local linearizability,
>> >>> which misses the point of linearizability, I think.
>> >>>
>> >>> So, for example,
>> >>> once SERIAL is used,
>> >>> we can't use LOCAL_SERIAL to achieve local linearizability
>> >>> since data in the local DC might not be updated yet to meet quorum.
>> >>> And vice versa,
>> >>> once LOCAL_SERIAL is used,
>> >>> we can't use SERIAL to achieve global linearizability
>> >>> since data is not globally updated yet to meet quorum.
>> >>>
>> >>> So, it would be great if we can use LOCAL_SERIAL if possible and
>> >>> use SERIAL only if local DC is down or unavailable,
>> >>> but based on the example above, I think it is not possible, is it ?
>> >>> So, I am not sure about what is the good use case for LOCAL_SERIAL.
>> >>>
>> >>> The only case that I can think of is having a cluster in one DC for
>> >>> online transactions and
>> >>> having another cluster in another DC for analytics purpose.
>> >>> In this case, I think there is no big point of using SERIAL since data
>> >>> for analytics sometimes doesn't have to be very correct/fresh and
>> >>> data can be asynchronously replicated to analytics node. (so using
>> >>> LOCAL_SERIAL for one DC makes sense.)
>> >>>
>> >>> Could anyone give me some thoughts about it ?
>> >>>
>> >>> Thanks,
>> >>> Hiro
>> >>
>> >>
>> >
>> > You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
>> > LOCAL_SERIAL guarantees you linearizability inside a DC, SERIAL
>> > guarantees you linearizability across multiple DCs.
>> >
>> > I am not sure what the state of this is anymore, but I was under the
>> > impression the linearizability of LWT was in question. I never heard it
>> > specifically addressed.
>> >
>> > https://issues.apache.org/jira/browse/CASSANDRA-6106
>> >
>> > It's hard to follow 6106 because most of the tasks are closed 'fix
>> > later' or closed 'not a problem'.
>>
>
>
I copied the wrong issue:

The core issue was this:
https://issues.apache.org/jira/browse/CASSANDRA-6123

I believe it was one of the key issues created in response to the "Call Me
Maybe" post.

6123 references this: https://issues.apache.org/jira/browse/CASSANDRA-8892

Which duplicates:

https://issues.apache.org/jira/browse/CASSANDRA-6123

So it is unclear to me what was resolved.

In the article you mentioned

(http://www.datastax.com/dev/blog/testing-apache-cassandra-with-jepsen)

Someone mentions:

"Can you also get Kyle to rerun tests from his end and update his old
posting
https://aphyr.com/posts/294-jepsen-cassandra

It would be great validation for the community."

I have the same question. Reading through all this material, would LWTs
pass the "linearizability" test put forth in the above blog post?
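
For reference, the quorum and latency arithmetic quoted earlier in this
thread (3 DCs with RF = 3, a 52ms London to New York ping, 4 round-trips per
lightweight transaction) can be checked with a small sketch. This is only an
illustration of the numbers in the thread, not Cassandra's actual Paxos code;
the function names are made up for this example, and it assumes the
LOCAL_SERIAL write has so far reached only a local quorum of replicas.

```python
def quorum(replicas: int) -> int:
    """Majority of a replica set: floor(n / 2) + 1."""
    return replicas // 2 + 1


def serial_can_miss_local_write(rf_per_dc: dict[str, int], local_dc: str) -> bool:
    """Return True if a SERIAL quorum can be formed entirely from replicas
    that were not part of a LOCAL_SERIAL quorum in local_dc.

    Assumption (hypothetical worst case): the LOCAL_SERIAL operation has
    been seen by exactly a local quorum of replicas and by no others.
    """
    total = sum(rf_per_dc.values())
    serial_quorum = quorum(total)
    local_quorum = quorum(rf_per_dc[local_dc])
    untouched = total - local_quorum  # replicas that never saw the local write
    return untouched >= serial_quorum


def min_lwt_network_ms(round_trip_ms: float, round_trips: int = 4) -> float:
    """Lower bound on network time only; ignores processing cost."""
    return round_trip_ms * round_trips


if __name__ == "__main__":
    dcs = {"DC1": 3, "DC2": 3, "DC3": 3}
    print("SERIAL quorum over 9 replicas:", quorum(sum(dcs.values())))
    print("LOCAL_SERIAL quorum in DC1:", quorum(dcs["DC1"]))
    print("SERIAL can miss the DC1 write:", serial_can_miss_local_write(dcs, "DC1"))
    print("Minimum cross-DC LWT network time (ms):", min_lwt_network_ms(52))
```

With the thread's numbers, a SERIAL quorum is 5 of 9 replicas and a
LOCAL_SERIAL quorum is 2 of 3, so the two quorums are not guaranteed to
intersect, which is why the two levels should not be mixed on the same data.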
