cassandra-commits mailing list archives

From "sutanu das (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-13093) 2.2.8 Node goes down with MUTATION messages were dropped in last 5000 ms: 29 for internal timeout
Date Tue, 03 Jan 2017 19:21:58 GMT
sutanu das created CASSANDRA-13093:
--------------------------------------

             Summary: 2.2.8 Node goes down with MUTATION messages were dropped in last 5000 ms: 29 for internal timeout
                 Key: CASSANDRA-13093
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13093
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: 2.2.8 Cassandra on 4 Nodes with Red Hat Linux 6.2 64 Bit
            Reporter: sutanu das
            Priority: Critical


Issue: The first node of the 4-node cluster keeps aborting (JVM crashing) with the following messages:

- ReadTimeoutException: Operation timed out - received only 0 responses
- MUTATION messages were dropped in last 5000 ms: 29 for internal timeout and 0 for cross node timeout
- Spark jobs queue up when opening channels, followed by read timeouts:
	ERROR [SharedPool-Worker-207] 2017-01-03 16:39:00,493 Message.java:611 - Unexpected exception during request; channel = [id: 0xd0b0d36d, /216.12.229.180:41896 :> /172.17.30.47:9042]
	java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
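
To track how often messages are dropped over time, the "messages were dropped" lines can be pulled out of system.log with a small script. The following is a minimal sketch, not part of the original report: the function name parse_dropped is hypothetical, and the pattern assumes the exact wording shown in the log lines quoted in this report.

```python
import re

# Matches the dropped-message summary lines quoted in this report, e.g.
# "MUTATION messages were dropped in last 5000 ms: 29 for internal timeout
#  and 0 for cross node timeout". The wording is taken from the report itself.
DROPPED_RE = re.compile(
    r"(?P<verb>[A-Z_]+) messages were dropped in last (?P<window_ms>\d+) ms: "
    r"(?P<internal>\d+) for internal timeout and "
    r"(?P<cross_node>\d+) for cross node timeout"
)


def parse_dropped(text):
    """Return one dict per dropped-message summary found in text (hypothetical helper)."""
    # Collapse all whitespace runs so that lines wrapped by a mail archive
    # are rejoined before matching.
    flat = " ".join(text.split())
    return [
        {
            "verb": m["verb"],
            "window_ms": int(m["window_ms"]),
            "internal": int(m["internal"]),
            "cross_node": int(m["cross_node"]),
        }
        for m in DROPPED_RE.finditer(flat)
    ]
```

Feeding it the contents of system.log yields one record per summary line, which makes it easy to see which message types (MUTATION, RANGE_SLICE, and so on) are dropped and whether the drops are internal or cross-node.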
	
What has been done so far?

 - Host reboot of node 01
 - Multiple C* restarts
 - Increased read_request_timeout_in_ms from 10000 to 50000
 - Increased request_timeout_in_ms from 10000 to 50000
 - Changed the following:
	concurrent_reads: 128
	concurrent_writes: 128
	concurrent_counter_writes: 128

 - Upgraded to 2.2.8 - all nodes in sync on 2.2.8
 - All nodes have the same password auth scheme (node 03 was mismatched and was fixed)
	- authenticator: org.apache.cassandra.auth.PasswordAuthenticator
	- authorizer: org.apache.cassandra.auth.CassandraAuthorizer

Full exception stack:

DEBUG [SharedPool-Worker-10] 2017-01-03 16:32:43,983 StorageProxy.java:1898 - Range slice timeout; received 0 of 1 responses for range 1 of 1
INFO  [Service Thread] 2017-01-03 16:32:43,983 GCInspector.java:284 - ParNew GC in 247ms.  CMS Old Gen: 3768220776 -> 3996971216; Par Eden Space: 1718091776 -> 0;
INFO  [Service Thread] 2017-01-03 16:32:43,983 StatusLogger.java:52 - Pool Name                    Active   Pending      Completed   Blocked  All Time Blocked
DEBUG [SharedPool-Worker-26] 2017-01-03 16:32:43,984 FileCacheService.java:102 - Evicting cold readers for /cassandra/data/system_auth/roles-5bc52802de2535edaeab188eecebb090/la-51-big-Data.db
DEBUG [SharedPool-Worker-28] 2017-01-03 16:32:43,986 AbstractQueryPager.java:89 - Got empty set of rows, considering pager exhausted


INFO  [ScheduledTasks:1] 2017-01-03 16:39:00,473 MessagingService.java:946 - RANGE_SLICE messages were dropped in last 5000 ms: 2 for internal timeout and 0 for cross node timeout
INFO  [Service Thread] 2017-01-03 16:39:00,476 StatusLogger.java:106 - sales.airwave_dwell_time_det_hr                 0,0
ERROR [SharedPool-Worker-207] 2017-01-03 16:39:00,493 Message.java:611 - Unexpected exception during request; channel = [id: 0xd0b0d36d, /216.12.229.180:41896 :> /172.17.30.47:9042]
java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
        at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:497) ~[apache-cassandra-2.2.8.jar:2.2.8]
        at org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:306) ~[apache-cassandra-2.2.8.jar:2.2.8]
        at org.apache.cassandra.service.ClientState.login(ClientState.java:269) ~[apache-cassandra-2.2.8.jar:2.2.8]
        at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:79) ~[apache-cassandra-2.2.8.jar:2.2.8]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:507) [apache-cassandra-2.2.8.jar:2.2.8]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:401) [apache-cassandra-2.2.8.jar:2.2.8]
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_65]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [apache-cassandra-2.2.8.jar:2.2.8]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.2.8.jar:2.2.8]
        at java.lang.Thread.run(Unknown Source) [na:1.8.0_65]
Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
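
The GCInspector line in the log above reports the CMS old generation size before and after a ParNew collection. As a rough aid for reading such lines, the sketch below extracts the pause time and the old-gen change in bytes. It is an illustration only: the function name old_gen_growth is hypothetical, the pattern assumes the line format quoted in this report, and the numbers alone do not establish the cause of the crash.

```python
import re

# Matches the GCInspector format quoted in this report, e.g.
# "ParNew GC in 247ms.  CMS Old Gen: 3768220776 -> 3996971216; ..."
GC_RE = re.compile(
    r"(?P<collector>\w+) GC in (?P<pause_ms>\d+)ms\.\s+"
    r"CMS Old Gen: (?P<before>\d+) -> (?P<after>\d+);"
)


def old_gen_growth(line):
    """Return pause time and old-gen size change for one GCInspector line, or None."""
    # Collapse whitespace so a line wrapped by the mail archive still matches.
    m = GC_RE.search(" ".join(line.split()))
    if m is None:
        return None
    before, after = int(m["before"]), int(m["after"])
    return {
        "collector": m["collector"],
        "pause_ms": int(m["pause_ms"]),
        "old_gen_delta_bytes": after - before,
    }
```

Applied to the line logged at 16:32:43,983, this gives a 247 ms ParNew pause and an old-gen increase of 228750440 bytes (about 218 MiB) across that single collection.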





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
