cassandra-dev mailing list archives

From Dikang Gu <dikan...@gmail.com>
Subject Re: Definition of QUORUM consistency level
Date Thu, 29 Jun 2017 00:17:15 GMT
https://issues.apache.org/jira/browse/CASSANDRA-13645

On Wed, Jun 28, 2017 at 4:59 PM, Dikang Gu <dikang85@gmail.com> wrote:

> We implemented the patch internally and deployed it to our production
> clusters. We see a 2X drop in P99 quorum read latency, because we avoid one
> unnecessary cross-region read. This is a huge improvement, since performance
> is critical to our customers.
>
> Again, I'm not trying to change the definition of the QUORUM consistency
> level. Instead, we want to improve quorum read latency by removing
> unnecessary replica requests, which I think can benefit Cassandra users in
> general.
>
> I will create a JIRA, and we can move discussions there.
>
>
> Thanks!
>
>
> On Thu, Jun 8, 2017 at 10:12 PM, Jeff Jirsa <jjirsa@gmail.com> wrote:
>
>> Short of actually making ConsistencyLevel pluggable or adding/changing
>> one of the existing levels, an alternative approach would be to divide up
>> the cluster into either real or pseudo-datacenters (with RF=2 in each DC),
>> and then write with QUORUM (which would be 3 nodes, across any combination
>> of datacenters), and read with LOCAL_QUORUM (which would be 2 nodes in the
>> datacenter of the coordinator). You don't have to have distinct physical
>> DCs for this, but you'd need tooling to guarantee an even number of
>> replicas in each virtual datacenter.
>>
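The workaround above can be checked by brute force. This is a minimal sketch, not Cassandra code: it assumes QUORUM means floor(sum of all RFs / 2) + 1 and LOCAL_QUORUM means floor(local RF / 2) + 1, and the function names and the list-of-RFs layout are made up for illustration.

```python
from itertools import combinations


def quorum(n):
    """Majority of n replicas: floor(n / 2) + 1."""
    return n // 2 + 1


def local_read_always_sees_write(dc_rfs, local_dc=0):
    """Return True if every QUORUM write overlaps every LOCAL_QUORUM read.

    dc_rfs is a hypothetical layout: one replication factor per datacenter.
    The read is served entirely from the datacenter at index local_dc.
    """
    replicas = [(dc, i) for dc, rf in enumerate(dc_rfs) for i in range(rf)]
    local = [r for r in replicas if r[0] == local_dc]
    w = quorum(len(replicas))  # QUORUM counts replicas across all DCs
    r = quorum(len(local))     # LOCAL_QUORUM counts replicas in one DC
    return all(
        set(ws) & set(rs)
        for ws in combinations(replicas, w)
        for rs in combinations(local, r)
    )


print(local_read_always_sees_write([2, 2]))
print(local_read_always_sees_write([2, 2, 2, 2]))
```

Under these assumptions the two-DC layout `[2, 2]` passes. The four-DC layout `[2, 2, 2, 2]` does not, because a QUORUM write of 5 replicas can land entirely on the 6 replicas outside the local datacenter.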
>> It's an ugly workaround, but it'd work.
>>
>> Pluggable CL would be nicer, though.
>>
>>
>> On Thu, Jun 8, 2017 at 9:51 PM, Justin Cameron <justin@instaclustr.com>
>> wrote:
>>
>>> Firstly, this situation only occurs if you need strong consistency and
>>> are using an even replication factor (RF4, RF6, etc).
>>> Secondly, either the read or the write still needs to be performed at a
>>> minimum level of QUORUM. This means there are no extra availability
>>> benefits from your proposal (i.e. a minimum of QUORUM replicas still need
>>> to be online and available).
>>>
>>> So the only potential benefit I can think of is a theoretical performance
>>> boost. If you write with QUORUM, then you'll need to read with
>>> QUORUM-1/HALF (e.g. RF4, write with QUORUM, read with TWO, RF6 write with
>>> QUORUM, read with THREE, RF8 write with QUORUM, read with FOUR, ...). At
>>> most you'd only reduce the number of replicas that the client needs to
>>> block on by 1.
>>>
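The RF4/RF6/RF8 arithmetic above can be reproduced in a few lines. This is a sketch under the usual R + W > N overlap rule; the helper names are hypothetical and are not part of any Cassandra API.

```python
def quorum(rf):
    """Majority of rf replicas: floor(rf / 2) + 1."""
    return rf // 2 + 1


def min_read_for_quorum_write(rf):
    """Smallest R such that R + W > rf when W is QUORUM."""
    return rf - quorum(rf) + 1


for rf in (3, 4, 5, 6, 8):
    w = quorum(rf)
    r = min_read_for_quorum_write(rf)
    # "saved" is how many fewer replicas the read blocks on compared to QUORUM
    print(f"RF{rf}: write={w} read={r} saved={w - r}")
```

For even replication factors the read needs one replica fewer than QUORUM; for odd replication factors the minimum read size equals QUORUM, so nothing is saved.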
>>> I'd guess that the performance benefits that you'd gain will probably be
>>> quite small - but I'd happily be proven wrong if you feel like running
>>> some benchmarks :)
>>>
>>> Cheers,
>>> Justin
>>>
>>> On Fri, 9 Jun 2017 at 14:26 Brandon Williams <driftx@gmail.com> wrote:
>>>
>>> > I don't disagree with you there and have never liked TWO/THREE.  This is
>>> > somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
>>> >
>>> > I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
>>> > also not sure what is.
>>> >
>>> >
>>> > On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <dikang85@gmail.com> wrote:
>>> >
>>> >> To me, CL.TWO and CL.THREE are more like workarounds for the problem.
>>> >> For example, they do not work if the number of replicas goes to 8, which
>>> >> is possible in our environment (2 replicas in each of 4 DCs).
>>> >>
>>> >> What people want from quorum is a strong consistency guarantee. As long
>>> >> as R+W > N, for an even N there are three options: a) R=W=(N/2+1);
>>> >> b) R=(N/2), W=(N/2+1); c) R=(N/2+1), W=(N/2). What Cassandra is doing
>>> >> right now is option a), which is the most expensive option.
>>> >>
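To make the R+W > N condition concrete, here is a small brute-force sketch. It assumes an even N and treats replicas as plain integers; the function names are illustrative only. It enumerates every possible read set and write set and checks that they always share at least one replica.

```python
from itertools import combinations


def always_overlaps(n, r, w):
    """True if every r-replica read set intersects every w-replica write set."""
    replicas = range(n)
    return all(
        set(rs) & set(ws)
        for rs in combinations(replicas, r)
        for ws in combinations(replicas, w)
    )


def options(n):
    """The three (R, W) choices listed above, for an even n."""
    half, quorum = n // 2, n // 2 + 1
    return {"a": (quorum, quorum), "b": (half, quorum), "c": (quorum, half)}


for name, (r, w) in options(4).items():
    print(name, r, w, always_overlaps(4, r, w))
```

All three options pass the check for N=4, while R=W=2 does not, since two disjoint pairs of replicas exist.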
>>> >> I cannot think of a reason why people would want a quorum read just to
>>> >> read from (n/2+1) nodes rather than for strong consistency. If they
>>> >> want strong consistency, then the read only needs (n/2) nodes; we are
>>> >> purely wasting the one extra request, and it hurts read latency as well.
>>> >>
>>> >> Thanks
>>> >> Dikang.
>>> >>
>>> >> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <nate@thelastpickle.com>
>>> >> wrote:
>>> >>
>>> >>>
>>> >>> We have CL.TWO.
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>> This was actually the original motivation for CL.TWO and CL.THREE if
>>> >>> memory serves:
>>> >>> https://issues.apache.org/jira/browse/CASSANDRA-2013
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Dikang
>>> >>
>>> >>
>>> > --
>>>
>>>
>>
>>
>
>
> --
> Dikang
>
>


-- 
Dikang
