cassandra-commits mailing list archives

From "Sam Tunnicliffe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12103) Cassandra is hang and cqlsh was not able to login with OperationTimeout error
Date Wed, 29 Jun 2016 06:59:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354693#comment-15354693
] 

Sam Tunnicliffe commented on CASSANDRA-12103:
---------------------------------------------

This indicates that the node failed to read the credentials for the user your client is
trying to authenticate as. The most likely cause is that the replication factor for the
{{system_auth}} keyspace is too low and the replica holding those credentials is unreachable.
I suspect you're using {{SimpleStrategy}}, the replica responsible for those credentials is
in DC2, and there was some inter-DC connection problem.
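
To make the check and the usual remedy concrete, here is a minimal Python sketch that only builds the relevant CQL strings. It assumes Cassandra 2.1 (as in the report), where keyspace definitions are stored in {{system.schema_keyspaces}}. The function names and the replica counts of 3 per DC are illustrative assumptions, not values taken from this cluster.

```python
def inspect_system_auth_cql():
    """CQL to read the replication settings of system_auth on Cassandra 2.x.

    Assumption: a 2.x node, where keyspace metadata lives in
    system.schema_keyspaces (3.0+ moved it to system_schema.keyspaces).
    """
    return (
        "SELECT strategy_class, strategy_options "
        "FROM system.schema_keyspaces "
        "WHERE keyspace_name = 'system_auth';"
    )


def alter_system_auth_cql(replicas_per_dc):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps a datacenter name to its replica count,
    e.g. {"DC1": 3, "DC2": 3}. The counts are the caller's choice.
    """
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    for dc, rf in replicas_per_dc.items():
        if not dc or not isinstance(rf, int) or rf < 1:
            raise ValueError("invalid entry: %r -> %r" % (dc, rf))
    # Sort by datacenter name so the generated statement is deterministic.
    options = ", ".join(
        "'%s': %d" % (dc, rf) for dc, rf in sorted(replicas_per_dc.items())
    )
    return (
        "ALTER KEYSPACE system_auth WITH replication = "
        "{'class': 'NetworkTopologyStrategy', " + options + "};"
    )


if __name__ == "__main__":
    print(inspect_system_auth_cql())
    print(alter_system_auth_cql({"DC1": 3, "DC2": 3}))
```

The printed statements can be pasted into cqlsh. After a replication change, the existing auth data still has to reach the new replicas, which is normally done by running {{nodetool repair}} on {{system_auth}} on each node.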

Can you confirm: 
1) What is the replication strategy config for {{system_auth}}?
2) Do/did you have any nodes down at the time?
3) Are your clients attempting to log in using the default superuser login (username {{cassandra}})?
Credentials for this user are read at a higher consistency level (QUORUM, rather than LOCAL_ONE
as for all other users).
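
The difference between the two consistency levels in question 3 can be sketched with a little arithmetic. This is a simplified model, not Cassandra's code: the function names are hypothetical, and the replica counts in the examples are assumptions chosen for illustration.

```python
def required_responses(consistency, total_rf):
    """Replica responses needed for a read at the given consistency level.

    total_rf is the sum of the replication factors across all datacenters.
    """
    if consistency == "QUORUM":
        return total_rf // 2 + 1
    if consistency == "LOCAL_ONE":
        return 1
    raise ValueError("unsupported consistency level: %s" % consistency)


def read_can_succeed(consistency, total_rf, reachable_total, reachable_local):
    """True if enough replicas are reachable to satisfy the read.

    reachable_total counts reachable replicas in all datacenters;
    reachable_local counts only those in the coordinator's datacenter,
    which is what LOCAL_ONE is restricted to.
    """
    needed = required_responses(consistency, total_rf)
    available = reachable_local if consistency == "LOCAL_ONE" else reachable_total
    return available >= needed


if __name__ == "__main__":
    # One replica in total, and it is unreachable: the read gets 0 responses.
    print(read_can_succeed("QUORUM", 1, 0, 0))
    # Assumed 3 replicas per DC with the remote DC cut off: QUORUM needs 4 of 6.
    print(read_can_succeed("QUORUM", 6, 3, 3))
    print(read_can_succeed("LOCAL_ONE", 6, 3, 3))
```

Under this model, a login as {{cassandra}} can still time out during an inter-DC outage even with several replicas per DC, while other users reading at LOCAL_ONE are unaffected as long as one local replica is up.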


> Cassandra is hang and cqlsh was not able to login with OperationTimeout error
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12103
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12103
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core, Local Write-Read Paths
>         Environment: centos 6.5 cassandra 2.1.9
>            Reporter: peng xiao
>            Priority: Critical
>         Attachments: system.log.2016-06-28_1257.gz
>
>
> Hi,
> We have two DCs (DC1 and DC2): DC1 has 3 nodes and DC2 has 9 nodes.
> We experienced a timeout error today: all applications connected to DC1 hung and stopped responding, and even cqlsh was not able to log into any node in DC1.
> I restarted the 3 nodes in DC1, but the problem was not resolved.
> We then switched to DC2, and the applications returned to normal.
> Could you please help take a look?
> Thanks
> There were many errors like the one below:
> ERROR [SharedPool-Worker-43] 2016-06-28 11:58:49,705 Message.java:538 - Unexpected exception during request; channel = [id: 0x87e315d6, /172.16.10.198:13604 => /172.16.11.13:9042]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
>         at org.apache.cassandra.auth.Auth.selectUser(Auth.java:276) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:86) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.service.ClientState.login(ClientState.java:206) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:82) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) [apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [apache-cassandra-2.1.9.jar:2.1.9]
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0]
>         at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.9.jar:2.1.9]
>         at java.lang.Thread.run(Thread.java:744) [na:1.8.0]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
