cassandra-commits mailing list archives

From "Christian Spriegel (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-7868) Sporadic CL switch from LOCAL_QUORUM to ALL
Date Thu, 03 Aug 2017 14:49:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107084#comment-16107084 ]

Christian Spriegel edited comment on CASSANDRA-7868 at 8/3/17 2:48 PM:
-----------------------------------------------------------------------

[~brandon.williams]: Sorry to warm up this old ticket, but we are having the same issue in C* 3.0.13.

Are you sure this is harmless? The ReadTimeoutException is thrown on the client side. I would expect a failing read repair not to throw an exception to the client. Is my expectation incorrect?

Edit: It seems that StorageProxy.SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch() is the culprit. It indeed does a blocking repair. I assume the repair has to be blocking, but does it really have to run at CL.ALL?

Edit 2:
I think I understand now why this is an issue: due to speculative retry, contactedReplicas may contain more nodes than the query's CL expects. A digest mismatch then causes a CL.ALL query against all contacted nodes (including the ones added by speculative retry). I think this read-repair code needs to be improved to honor the query's CL when speculative retry was performed for the query.
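To make the failure mode concrete, here is a minimal sketch (hypothetical code, not Cassandra internals; the blockFor logic is deliberately simplified) of why a blocking repair at CL.ALL over the *contacted* replicas can demand far more responses than the original LOCAL_QUORUM read:

```java
// Hypothetical sketch, NOT Cassandra source: models how a blocking
// read repair at CL.ALL over contactedReplicas can require more
// responses than the query's original consistency level.
public class BlockingRepairSketch {

    // Simplified blockFor: responses required to satisfy LOCAL_QUORUM.
    static int blockForLocalQuorum(int localRf) {
        return localRf / 2 + 1;   // e.g. 2 of 3 local replicas
    }

    // At CL.ALL, every contacted replica must answer.
    static int blockForAll(int contactedReplicas) {
        return contactedReplicas;
    }

    public static void main(String[] args) {
        int localRf = 3;

        // Normal LOCAL_QUORUM read: 2 responses needed.
        int normal = blockForLocalQuorum(localRf);

        // Speculative retry pulled extra replicas in, so (say) 9 nodes
        // sit in contactedReplicas when the digest mismatch hits.
        int contacted = 9;
        int repair = blockForAll(contacted);

        // One slow node out of 9 now times out the whole client read.
        System.out.println("query needs " + normal
                + ", repair round waits for " + repair);
    }
}
```

Under these assumptions, the client's LOCAL_QUORUM read (2 responses) turns into a repair round waiting on all 9 contacted replicas, matching the "9 responses were required but only 8 replica responded" error in the original report.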




> Sporadic CL switch from LOCAL_QUORUM to ALL
> -------------------------------------------
>
>                 Key: CASSANDRA-7868
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7868
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Client: cassandra-java-driver 2.0.4
> Server: 2.0.9
>            Reporter: Dmitry Schitinin
>
> Hi!
> We have keyspace described as
> {code}
> CREATE KEYSPACE subscriptions WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'FOL': '3',
>   'SAS': '3',
>   'AMS': '0',
>   'IVA': '3',
>   'UGR': '0'
> } AND durable_writes = 'false';
> {code}
> There is simple table 
> {code}
> CREATE TABLE processed_documents (
>   id text,
>   PRIMARY KEY ((id))
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> {code}
> in the keyspace.
> On the client we execute the following prepared statement:
> {code}
> session.prepare(
>     "SELECT id FROM processed_documents WHERE id IN :ids"
> ).setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM)
> {code}
> The Cassandra session has the following main properties:
>   * Load balancing policy - DCAwareRoundRobinPolicy(localDc, usedHostPerRemoteDc = 3, allowRemoteDcForLocalConsistencyLevel = true)
>   * Retry policy - DefaultRetryPolicy
>   * Query options - QueryOptions with the consistency level set to ConsistencyLevel.LOCAL_QUORUM
> Our problem is the following.
> At some point the client application log starts to fill with these errors:
> {code}
> com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (9 responses were required but only 8 replica responded)
>         at com.datastax.driver.core.exceptions.ReadTimeoutException.copy(ReadTimeoutException.java:69) ~[cassandra-driver-core-2.0.2.jar:na]
>         at com.datastax.driver.core.Responses$Error.asException(Responses.java:94) ~[cassandra-driver-core-2.0.2.jar:na]
>         at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:108) ~[cassandra-driver-core-2.0.2.jar:na]
>         at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235) ~[cassandra-driver-core-2.0.2.jar:na]
>         at com.datastax.driver.core.RequestHandler.onSet(RequestHandler.java:379) ~[cassandra-driver-core-2.0.2.jar:na]
>         at com.datastax.driver.core.Connection$Dispatcher.messageReceived(Connection.java:571) ~[cassandra-driver-core-2.0.2.jar:na]
>         at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:70) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[netty-3.9.0.Final.jar:na]
>         at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[netty-3.9.0.Final.jar:na]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_51]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_51]
>         at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_51]
> Caused by: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ALL (9 responses were required but only 8 replica responded)
>         at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:57) ~[cassandra-driver-core-2.0.2.jar:na]
>         at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:34) ~[cassandra-driver-core-2.0.2.jar:na]
>         at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:182) ~[cassandra-driver-core-2.0.2.jar:na]
>         at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:66) ~[netty-3.9.0.Final.jar:na]
>         ... 25 common frames omitted
> {code}
> Some error records report "9 responses were required but only 8 replica responded", others report "3 responses were required but only 2 replica responded".
> Once the application starts producing these errors, only restarting it helps.
> It seems the server sets CL.ALL on the query, since there is no code in the Java driver that could do it.
> Besides, there is nothing remarkable in the Cassandra nodes' logs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


