cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sam Tunnicliffe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12103) Cassandra is hang and cqlsh was not able to login with OperationTimeout error
Date Wed, 29 Jun 2016 07:24:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354719#comment-15354719
] 

Sam Tunnicliffe commented on CASSANDRA-12103:
---------------------------------------------

>From the attached log, it's clear that that node is not able to keep up with the workload
and has become overwhelmed. Long GC pauses are happening almost constantly with read tasks
backing up. Hints are also being written for writes to other nodes, indicating those nodes
are in a similar state. If that's typical of the other nodes in the DC I'd say that the problem
is overload, and that the auth failure is really a symptom of the fact that the cluster (or
at least the nodes in DC1) can't keep up. 


> Cassandra is hang and cqlsh was not able to login with OperationTimeout error
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12103
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12103
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core, Local Write-Read Paths
>         Environment: centos 6.5 cassandra 2.1.9
>            Reporter: peng xiao
>            Priority: Critical
>         Attachments: system.log.2016-06-28_1257.gz
>
>
> Hi,
> We have two DCs(DC1 and DC2) with DC1 3 nodes and DC2 9 nodes.
> And we experienced a Timeout error today,all applications connected to DC1 were hang
and no response,even cqlsh was not able to log into any node in DC1.
> I restarted the 3 nodes in DC1,the problem was not resolved.
> Then we switched to DC2,then applications back to normal.
> Could you please help to take a look?
> Thanks
> many errors like below:
> ERROR [SharedPool-Worker-43] 2016-06-28 11:58:49,705 Message.java:538 - Unexpected exception
during request; channel = [id: 0x87e315d6, /172.16.10.198:13604 => /172.16.11.13:9042]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation
timed out - received only 0 responses.
>         at org.apache.cassandra.auth.Auth.selectUser(Auth.java:276) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:86) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.service.ClientState.login(ClientState.java:206) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:82)
~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
[apache-cassandra-2.1.9.jar:2.1.9]
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
[netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
[netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
[netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
[netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0]
>         at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.9.jar:2.1.9]
>         at java.lang.Thread.run(Thread.java:744) [na:1.8.0]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message