cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds
Date Tue, 15 Sep 2015 02:36:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744717#comment-14744717
] 

Stefania commented on CASSANDRA-8072:
-------------------------------------

Without having looked in much detail at this yet, it seems very similar to CASSANDRA-10205.
It's worth trying out that patch to see if it solves this as well.

Summary of 10205: after a node is decommissioned, its data wiped, and the node restarted,
it never receives GOSSIP replies from seeds during the shadow round and fails to start. This
is because the node is not marked as dead and the socket is not closed properly. Marking the
node as dead even when the status is LEFT, closes the socket and the test of 10205 passes.
The patch is for 3.0 but the same problem occurs on 2.0+.

> Exception during startup: Unable to gossip with any seeds
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-8072
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Ryan Springer
>            Assignee: Stefania
>             Fix For: 2.1.x
>
>         Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2,
cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, casandra-system-log-with-assert-patch.log, screenshot-1.png,
trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster in either
ec2 or locally, an error occurs sometimes with one of the nodes refusing to start C*.  The
error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) Exception encountered
during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
>         at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
>         at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
>         at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java (line 1279)
Announcing shutdown
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 MessagingService.java (line
701) Waiting for messaging service to quiesce
>  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 MessagingService.java (line
941) MessagingService has terminated the accept() thread
> This errors does not always occur when provisioning a 2-node cluster, but probably around
half of the time on only one of the nodes.  I haven't been able to reproduce this error with
DSC 2.0.9, and there have been no code or definition file changes in Opscenter.
> I can reproduce locally with the above steps.  I'm happy to test any proposed fixes
since I'm the only person able to reproduce reliably so far.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message