cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sam Tunnicliffe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12735) org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out
Date Fri, 30 Sep 2016 16:19:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536381#comment-15536381
] 

Sam Tunnicliffe commented on CASSANDRA-12735:
---------------------------------------------

The stacktrace shows that when the permissions cache was accessed, there was a cache miss.
In that case, we have no choice but to synchronously fetch the permissions, which are then
cached for subsequent access. Reading the permissions for a logged in user involves fetching
the roles granted to that user, which also goes via a similar cache. As it happens, that cache
access also resulted in a miss, so the role info also had to be synchronously read from disk.
Once an item has been cached, we can (& do) refresh it asynchronously so that users of
hot items don't periodically get interrupted by a synchronous reload when the item eventually
expires from the cache. So, the stacktrace indicates that either - this was the very first
attempt to access those particular cache items, or that they had been loaded, then expired
leading to a cache miss. 

I would suggest setting the validity period > update interval to minimize the chances of
regularly (or semi-regularly) read items expiring. Of course, this is a trade off as the longer
you set the validity period, the longer you may serve stale entries if you drop roles or grant/revoke
permissions. Likewise, lowering the update interval will likely lead to increased read activity
on the system_auth tables as the cache is refreshed more often. 

Lastly, the fact that a single partition read from the roles table timed out at all suggest
that your cluster is struggling under the workload. There's nothing really special about those
reads, so it they can fail that way, the odds are other things will start to go south too.



> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12735
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12735
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Configuration, Core
>         Environment: Python 2.7.11, Datastax Cassandra 3.7.0  
>            Reporter: Rajesh Radhakrishnan
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.7
>
>
> We got a cluster of two nodes running Cassandra.3.7.0 and using client running Python
2.7.11 injecting lot of data from maybe 100 or so jobs. 
> ------------------------------------------------------------------------------------------
> Cache setting can be seen from system.log:
> INFO  [main] 2016-09-30 15:12:50,002 AuthCache.java:172 - (Re)initializing CredentialsCache
(validity period/update interval/max entries) (2000/2000/1000)
> INFO  [SharedPool-Worker-1] 2016-09-30 15:15:09,561 AuthCache.java:172 - (Re)initializing
PermissionsCache (validity period/update interval/max entries) (10000/10000/1000)
> INFO  [SharedPool-Worker-1] 2016-09-30 15:15:24,319 AuthCache.java:172 - (Re)initializing
RolesCache (validity period/update interval/max entries) (5000/5000/1000)
> ===========================================================
> But I am getting the following exception :
> ERROR [SharedPool-Worker-90] 2016-09-30 15:17:20,883 ErrorMessage.java:338 - Unexpected
exception during request
> com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException:
java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation
timed out - received only 0 responses.
> 	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache.get(LocalCache.java:3937) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) ~[guava-18.0.jar:na]
> 	at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.PermissionsCache.getPermissions(PermissionsCache.java:45)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.AuthenticatedUser.getPermissions(AuthenticatedUser.java:104)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.ClientState.authorize(ClientState.java:375) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.ClientState.checkPermissionOnResourceChain(ClientState.java:308)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:285)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:272) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:256)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:211)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.cql3.statements.BatchStatement.checkAccess(BatchStatement.java:137)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:502) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.cql3.QueryProcessor.processBatch(QueryProcessor.java:495) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.transport.messages.BatchMessage.execute(BatchMessage.java:217)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507)
[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401)
[apache-cassandra-3.7.0.jar:3.7.0]
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
[netty-all-4.0.36.Final.jar:4.0.36.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
[netty-all-4.0.36.Final.jar:4.0.36.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:32)
[netty-all-4.0.36.Final.jar:4.0.36.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:283)
[netty-all-4.0.36.Final.jar:4.0.36.Final]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_73]
> 	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.7.0.jar:3.7.0]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_73]
> Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException:
org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only
0 responses.
> 	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache.get(LocalCache.java:3937) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) ~[guava-18.0.jar:na]
> 	at org.apache.cassandra.auth.AuthCache.get(AuthCache.java:108) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.RolesCache.getRoles(RolesCache.java:44) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.Roles.hasSuperuserStatus(Roles.java:51) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:71) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:86)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.PermissionsCache.lambda$new$1(PermissionsCache.java:37)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.AuthCache$1.load(AuthCache.java:183) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) ~[guava-18.0.jar:na]
> 	... 26 common frames omitted
> Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException:
Operation timed out - received only 0 responses.
> 	at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:489)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.CassandraRoleManager.getRoles(CassandraRoleManager.java:269)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.RolesCache.lambda$new$195(RolesCache.java:36) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.AuthCache$1.load(AuthCache.java:183) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) ~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
~[guava-18.0.jar:na]
> 	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) ~[guava-18.0.jar:na]
> 	... 40 common frames omitted
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out
- received only 0 responses.
> 	at org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:132) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:137) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1714)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1663) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1604) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1523) ~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:979)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:273)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:234)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:497)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:485)
~[apache-cassandra-3.7.0.jar:3.7.0]
> 	... 47 common frames omitted



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message