cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-14480) Digest mismatch requires all replicas to be responsive
Date Mon, 18 Jun 2018 18:36:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516134#comment-16516134 ]

Jeff Jirsa commented on CASSANDRA-14480:
----------------------------------------

If it's a dupe (and it looks like it may be), then you have good news and bad news.

The good news is that CASSANDRA-10726 is patch-available.
The bad news is that it's a major refactor that won't land until 4.0.

If you're satisfied it's a dupe, please feel free to relate+close it.


> Digest mismatch requires all replicas to be responsive
> ------------------------------------------------------
>
>                 Key: CASSANDRA-14480
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14480
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Christian Spriegel
>            Priority: Major
>         Attachments: Reader.java, Writer.java, schema_14480.cql
>
>
> I ran across a scenario where a digest mismatch causes a read-repair that requires all up nodes to be able to respond. If one of these nodes is not responding, the read-repair is reported to the client as a ReadTimeoutException.
>  
> My expectation would be that a read at CL=QUORUM will always succeed as long as 2 nodes are responding. Unfortunately, a third node that is "up" in the ring but unable to respond does lead to an RTE.
>  
>  
> I came up with a scenario that reproduces the issue:
>  # set up a 3-node cluster using ccm
>  # increase the phi_convict_threshold to 16, so that nodes are permanently reported as up
>  # create the attached schema
>  # run the attached reader and writer (which only connect to node1 and node2). This should already produce digest mismatches
>  # run "ccm node3 pause"
>  # The reader will report a read timeout with consistency QUORUM (2 responses were required but only 1 replica responded). Within the DigestMismatchException catch-block it can be seen that the repairHandler is waiting for 3 responses, even though the exception says that 2 responses are required.
>  
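The mismatch described in the last step can be illustrated with a small, self-contained Java sketch. This is not Cassandra's code: the class name QuorumRepairSketch and the helper awaitResponses are hypothetical, and the model only covers how many responses a handler blocks for, not the exact counts reported in the exception. It assumes a replication factor of 3, with node3 paused but still marked up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the reported behaviour; not taken from Cassandra.
public class QuorumRepairSketch {

    /** Quorum for a given replication factor: a strict majority of replicas. */
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * Stand-in for a response handler: blocks until blockFor responses have
     * arrived or the timeout expires. Returns true if enough responses came in.
     */
    public static boolean awaitResponses(int blockFor, int responding, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(blockFor);
        for (int i = 0; i < responding; i++) {
            latch.countDown(); // each responsive replica delivers one response
        }
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        int rf = 3;         // assumed replication factor
        int responding = 2; // node1 and node2 answer; node3 is paused

        // The initial QUORUM read only needs a majority, so it is satisfied.
        System.out.println("block for quorum (" + quorum(rf) + "): "
                + awaitResponses(quorum(rf), responding, 100));

        // A repair handler that blocks for every contacted replica times out.
        System.out.println("block for all contacted (" + rf + "): "
                + awaitResponses(rf, responding, 100));
    }
}
```

With two responsive replicas, the first wait succeeds and the second times out, which matches the observation that a QUORUM read surfaces a timeout once the repair path waits on all three replicas.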



