cassandra-commits mailing list archives

From "Ryan King (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-800) Spurious Gossip Up/Down and IO Errors
Date Wed, 17 Feb 2010 03:40:29 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834628#action_12834628 ]

Ryan King commented on CASSANDRA-800:
-------------------------------------

I'm skeptical about it too, but I've seen stranger effects from OOMEs. We've made some config
changes to (hopefully) reduce heap pressure. I'll let you know whether that improves the
situation or not.

> Spurious Gossip Up/Down and IO Errors
> -------------------------------------
>
>                 Key: CASSANDRA-800
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-800
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5, 0.6, 0.7
>            Reporter: Ryan King
>            Assignee: Jaakko Laine
>             Fix For: 0.5
>
>
> We're seeing a lot of nodes flapping. It appears to be a race condition in Gossip.
> On 10.209.23.110:
> WARN [MESSAGING-SERVICE-POOL:2] 2010-02-13 01:18:22,976 TcpConnection.java (line 484) Problem reading from socket connected to : java.nio.channels.SocketChannel[connected local=/10.209.23.110:7000 remote=/10.209.23.80:52720]
> WARN [MESSAGING-SERVICE-POOL:1] 2010-02-13 01:18:22,976 TcpConnection.java (line 484) Problem reading from socket connected to : java.nio.channels.SocketChannel[connected local=/10.209.23.110:7000 remote=/10.209.23.80:36128]
> WARN [MESSAGING-SERVICE-POOL:2] 2010-02-13 01:18:22,977 TcpConnection.java (line 485) Exception was generated at : 02/13/2010 01:18:22 on thread MESSAGING-SERVICE-POOL:2
> Reached an EOL or something bizzare occured. Reading from: /10.209.23.80 BufferSizeRemaining: 16
> java.io.IOException: Reached an EOL or something bizzare occured. Reading from: /10.209.23.80 BufferSizeRemaining: 16
>     at org.apache.cassandra.net.io.StartState.doRead(StartState.java:44)
>     at org.apache.cassandra.net.io.ProtocolState.read(ProtocolState.java:39)
>     at org.apache.cassandra.net.io.TcpReader.read(TcpReader.java:95)
>     at org.apache.cassandra.net.TcpConnection$ReadWorkItem.run(TcpConnection.java:445)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> on 10.209.23.80 about the same time
> ERROR [pool-1-thread-4751] 2010-02-13 01:17:12,261 Cassandra.java (line 1096) Internal error processing batch_insert
> java.util.ConcurrentModificationException
>     at java.util.HashMap$HashIterator.nextEntry(HashMap.java:848)
>     at java.util.HashMap$KeyIterator.next(HashMap.java:883)
>     at java.util.AbstractCollection.addAll(AbstractCollection.java:305)
>     at java.util.HashSet.<init>(HashSet.java:100)
>     at org.apache.cassandra.gms.Gossiper.getLiveMembers(Gossiper.java:173)
>     at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(AbstractReplicationStrategy.java:120)
>     at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedEndpoints(AbstractReplicationStrategy.java:78)
>     at org.apache.cassandra.service.StorageService.getHintedEndpointMap(StorageService.java:1186)
>     at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:169)
>     at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:466)
>     at org.apache.cassandra.service.CassandraServer.batch_insert(CassandraServer.java:445)
>     at org.apache.cassandra.service.Cassandra$Processor$batch_insert.process(Cassandra.java:1088)
>     at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:817)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> just before that:
> INFO [Timer-1] 2010-02-13 01:17:12,070 Gossiper.java (line 194) InetAddress /10.209.21.223 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,257 Gossiper.java (line 194) InetAddress /10.209.21.217 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,257 Gossiper.java (line 194) InetAddress /10.209.21.216 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,258 Gossiper.java (line 194) InetAddress /10.209.21.215 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,258 Gossiper.java (line 194) InetAddress /10.209.23.82 is now dead.
> and just after that:
> INFO [Timer-1] 2010-02-13 01:17:12,261 Gossiper.java (line 194) InetAddress /10.209.23.81 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,293 Gossiper.java (line 194) InetAddress /10.209.23.79 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,304 Gossiper.java (line 194) InetAddress /10.209.21.204 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,307 Gossiper.java (line 194) InetAddress /10.209.21.197 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,308 Gossiper.java (line 194) InetAddress /10.209.21.245 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,309 Gossiper.java (line 194) InetAddress /10.209.21.242 is now dead.
> INFO [Timer-1] 2010-02-13 01:17:12,310 Gossiper.java (line 194) InetAddress /10.209.23.106 is now dead.
> INFO [GMFD:1] 2010-02-13 01:17:26,780 Log4jLogger.java (line 41) 02/13/2010 01:17:26 - Remaining bytes zero. Stopping deserialization in EndPointState.
> INFO [GMFD:1] 2010-02-13 01:17:26,784 Gossiper.java (line 543) InetAddress /10.209.21.204 is now UP
> INFO [GMFD:1] 2010-02-13 01:17:26,785 Gossiper.java (line 543) InetAddress /10.209.23.106 is now UP
> INFO [GMFD:1] 2010-02-13 01:17:26,786 Gossiper.java (line 543) InetAddress /10.209.21.197 is now UP
> INFO [GMFD:1] 2010-02-13 01:17:26,800 Gossiper.java (line 543) InetAddress /10.209.21.216 is now UP
> INFO [GMFD:1] 2010-02-13 01:17:41,808 Gossiper.java (line 543) InetAddress /10.209.21.217 is now UP
> INFO [GMFD:1] 2010-02-13 01:17:41,823 Gossiper.java (line 543) InetAddress /10.209.21.223 is now UP
> INFO [GMFD:1] 2010-02-13 01:17:41,823 Gossiper.java (line 543) InetAddress /10.209.21.215 is now UP
> We're on 298a0e66ba66c5d2a1e5d4a70f2f619ae3fbf72a from git.apache.org, which claims to be:
> git-svn-id: https://svn.apache.org/repos/asf/incubator/cassandra/branches/cassandra-0.5@9035
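
The ConcurrentModificationException in the quoted trace is raised while Gossiper.getLiveMembers copies a set into a new HashSet, which is what one would expect if another thread (for example the Timer-1 thread marking endpoints dead) mutates that set during the copy. The sketch below illustrates that pattern and one possible way to avoid it. It is not Cassandra's code and not necessarily the fix applied for this issue: the class LiveEndpointsSketch, its methods markAlive and markDead, and the 10.0.0.x addresses are all hypothetical. It assumes Java 8 or later for ConcurrentHashMap.newKeySet(); on older JVMs, Collections.newSetFromMap over a ConcurrentHashMap would play the same role.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for live-endpoint bookkeeping; not Cassandra's actual code.
public class LiveEndpointsSketch {
    // A concurrent set: its iterators are weakly consistent and do not throw
    // ConcurrentModificationException when another thread adds or removes entries.
    // A plain HashSet here could fail during the copy in getLiveMembers().
    private final Set<String> liveEndpoints = ConcurrentHashMap.newKeySet();

    public void markAlive(String endpoint) {
        liveEndpoints.add(endpoint);
    }

    public void markDead(String endpoint) {
        liveEndpoints.remove(endpoint);
    }

    // Returns an independent snapshot, so callers can iterate it freely
    // while the failure detector keeps updating the underlying set.
    public Set<String> getLiveMembers() {
        return new HashSet<>(liveEndpoints);
    }

    public static void main(String[] args) throws InterruptedException {
        LiveEndpointsSketch sketch = new LiveEndpointsSketch();

        // Simulates endpoints flapping up and down on a background thread.
        Thread flapper = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                String endpoint = "10.0.0." + (i % 50); // made-up addresses
                sketch.markAlive(endpoint);
                sketch.markDead(endpoint);
            }
        });
        flapper.start();

        // Simulates request threads taking snapshots while the set changes.
        while (flapper.isAlive()) {
            sketch.getLiveMembers();
        }
        flapper.join();

        System.out.println("live members after flapping: " + sketch.getLiveMembers().size());
    }
}
```

The snapshot may miss or include an endpoint that changed state during the copy, which is usually acceptable for membership views that are refreshed on every request. Swapping the field for a plain HashSet makes the same program liable to fail intermittently with the exception seen in the trace.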

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

