cassandra-commits mailing list archives

From "Ryan King (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-800) Spurious Gossip Up/Down and IO Errors
Date Tue, 16 Feb 2010 19:49:27 GMT
Spurious Gossip Up/Down and IO Errors
-------------------------------------

                 Key: CASSANDRA-800
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-800
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.5
            Reporter: Ryan King
             Fix For: 0.5, 0.6, 0.7


We're seeing a lot of nodes flapping: Gossip repeatedly marks them dead and then up again. This may be a race condition in Gossip.

On 10.209.23.110:

WARN [MESSAGING-SERVICE-POOL:2] 2010-02-13 01:18:22,976 TcpConnection.java (line 484) Problem reading from socket connected to : java.nio.channels.SocketChannel[connected local=/10.209.23.110:7000 remote=/10.209.23.80:52720]
WARN [MESSAGING-SERVICE-POOL:1] 2010-02-13 01:18:22,976 TcpConnection.java (line 484) Problem reading from socket connected to : java.nio.channels.SocketChannel[connected local=/10.209.23.110:7000 remote=/10.209.23.80:36128]
WARN [MESSAGING-SERVICE-POOL:2] 2010-02-13 01:18:22,977 TcpConnection.java (line 485) Exception was generated at : 02/13/2010 01:18:22 on thread MESSAGING-SERVICE-POOL:2
Reached an EOL or something bizzare occured. Reading from: /10.209.23.80 BufferSizeRemaining: 16
java.io.IOException: Reached an EOL or something bizzare occured. Reading from: /10.209.23.80 BufferSizeRemaining: 16
    at org.apache.cassandra.net.io.StartState.doRead(StartState.java:44)
    at org.apache.cassandra.net.io.ProtocolState.read(ProtocolState.java:39)
    at org.apache.cassandra.net.io.TcpReader.read(TcpReader.java:95)
    at org.apache.cassandra.net.TcpConnection$ReadWorkItem.run(TcpConnection.java:445)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)


On 10.209.23.80 at about the same time:


ERROR [pool-1-thread-4751] 2010-02-13 01:17:12,261 Cassandra.java (line 1096) Internal error processing batch_insert
java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextEntry(HashMap.java:848)
    at java.util.HashMap$KeyIterator.next(HashMap.java:883)
    at java.util.AbstractCollection.addAll(AbstractCollection.java:305)
    at java.util.HashSet.<init>(HashSet.java:100)
    at org.apache.cassandra.gms.Gossiper.getLiveMembers(Gossiper.java:173)
    at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(AbstractReplicationStrategy.java:120)
    at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedEndpoints(AbstractReplicationStrategy.java:78)
    at org.apache.cassandra.service.StorageService.getHintedEndpointMap(StorageService.java:1186)
    at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:169)
    at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:466)
    at org.apache.cassandra.service.CassandraServer.batch_insert(CassandraServer.java:445)
    at org.apache.cassandra.service.Cassandra$Processor$batch_insert.process(Cassandra.java:1088)
    at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:817)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)


Just before that:

INFO [Timer-1] 2010-02-13 01:17:12,070 Gossiper.java (line 194) InetAddress /10.209.21.223 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,257 Gossiper.java (line 194) InetAddress /10.209.21.217 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,257 Gossiper.java (line 194) InetAddress /10.209.21.216 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,258 Gossiper.java (line 194) InetAddress /10.209.21.215 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,258 Gossiper.java (line 194) InetAddress /10.209.23.82 is now dead.


And just after that:

INFO [Timer-1] 2010-02-13 01:17:12,261 Gossiper.java (line 194) InetAddress /10.209.23.81 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,293 Gossiper.java (line 194) InetAddress /10.209.23.79 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,304 Gossiper.java (line 194) InetAddress /10.209.21.204 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,307 Gossiper.java (line 194) InetAddress /10.209.21.197 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,308 Gossiper.java (line 194) InetAddress /10.209.21.245 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,309 Gossiper.java (line 194) InetAddress /10.209.21.242 is now dead.
INFO [Timer-1] 2010-02-13 01:17:12,310 Gossiper.java (line 194) InetAddress /10.209.23.106 is now dead.
INFO [GMFD:1] 2010-02-13 01:17:26,780 Log4jLogger.java (line 41) 02/13/2010 01:17:26 - Remaining bytes zero. Stopping deserialization in EndPointState.
INFO [GMFD:1] 2010-02-13 01:17:26,784 Gossiper.java (line 543) InetAddress /10.209.21.204 is now UP
INFO [GMFD:1] 2010-02-13 01:17:26,785 Gossiper.java (line 543) InetAddress /10.209.23.106 is now UP
INFO [GMFD:1] 2010-02-13 01:17:26,786 Gossiper.java (line 543) InetAddress /10.209.21.197 is now UP
INFO [GMFD:1] 2010-02-13 01:17:26,800 Gossiper.java (line 543) InetAddress /10.209.21.216 is now UP
INFO [GMFD:1] 2010-02-13 01:17:41,808 Gossiper.java (line 543) InetAddress /10.209.21.217 is now UP
INFO [GMFD:1] 2010-02-13 01:17:41,823 Gossiper.java (line 543) InetAddress /10.209.21.223 is now UP
INFO [GMFD:1] 2010-02-13 01:17:41,823 Gossiper.java (line 543) InetAddress /10.209.21.215 is now UP
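
For illustration only: the ConcurrentModificationException above is thrown while Gossiper.getLiveMembers copies a HashMap-backed collection into a new HashSet, and the Timer-1 thread is marking endpoints dead at the same millisecond. The sketch below is not the Gossiper code. It assumes (this is a guess from the trace, not confirmed) that the live-endpoint set is a plain HashSet mutated without synchronization. To keep the demonstration deterministic, the mutation happens on the same thread in the middle of iteration, which trips the same fail-fast check as a concurrent writer would. The class name, method names, and the choice of ConcurrentHashMap.newKeySet() (Java 8+) as a replacement are all hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class LiveMembersSketch {

    /** Hypothetical stand-in for a getLiveMembers()-style defensive copy. */
    public static Set<String> snapshot(Set<String> liveEndpoints) {
        // The copy constructor iterates the source set, so it is only safe
        // if no other thread structurally modifies the source meanwhile.
        return new HashSet<>(liveEndpoints);
    }

    /**
     * Removes each endpoint directly from the set while iterating it,
     * imitating a writer that marks nodes dead during a reader's iteration.
     * Returns true if the iterator threw ConcurrentModificationException.
     */
    public static boolean mutationDuringIterationFails(Set<String> liveEndpoints) {
        try {
            for (String endpoint : liveEndpoints) {
                liveEndpoints.remove(endpoint); // structural change mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Fills the given set with a few addresses taken from the log excerpt. */
    public static Set<String> endpoints(Set<String> target) {
        target.add("10.209.21.223");
        target.add("10.209.21.217");
        target.add("10.209.21.216");
        return target;
    }

    public static void main(String[] args) {
        Set<String> plain = endpoints(new HashSet<>());
        Set<String> concurrent = endpoints(ConcurrentHashMap.newKeySet());

        System.out.println("snapshot size: " + snapshot(concurrent).size());
        System.out.println("HashSet throws CME: "
                + mutationDuringIterationFails(plain));
        System.out.println("concurrent set throws CME: "
                + mutationDuringIterationFails(concurrent));
    }
}
```

A concurrent set has weakly consistent iterators, so a copy taken during a burst of up/down transitions may miss or include an endpoint that changed mid-copy, but it does not throw. Whether that trade-off is acceptable for hinted-endpoint selection is a separate question that this sketch does not answer.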


We're on 298a0e66ba66c5d2a1e5d4a70f2f619ae3fbf72a from git.apache.org, which claims to be:

git-svn-id: https://svn.apache.org/repos/asf/incubator/cassandra/branches/cassandra-0.5@9035

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

