cassandra-commits mailing list archives

From "Gary Dusbabek (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1160) race with insufficiently constructed Gossiper
Date Tue, 08 Jun 2010 19:05:14 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876766#action_12876766 ]

Gary Dusbabek commented on CASSANDRA-1160:
------------------------------------------

Intermittent test failures indicate you have a timing problem.

The assertion (the original intermittent problem) fails because a static class
member is null, which makes me think static initializers are firing in an order
we don't expect.
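
To illustrate the kind of ordering problem suspected here, the following is a
minimal Java sketch, not code from Cassandra. The names Holder, INSTANCE and
LOCAL_NAME are hypothetical. It shows how a static member can still be null
when a constructor runs from an earlier static initializer in the same class:

```java
class StaticInitOrderDemo {
    // Hypothetical class used only for illustration.
    static class Holder {
        // Static initializers run in textual order, so this constructor call
        // executes before LOCAL_NAME below has been assigned.
        static final Holder INSTANCE = new Holder();

        // Assigned via a method call so it is not a compile-time constant
        // (a constant would be inlined and hide the problem).
        static final String LOCAL_NAME = computeName();

        final String seenDuringConstruction;

        private Holder() {
            // Reads LOCAL_NAME before its initializer has run.
            seenDuringConstruction = LOCAL_NAME;
        }

        private static String computeName() {
            return "node-1";
        }
    }

    // Returns the value of LOCAL_NAME as observed inside the constructor.
    static String observedDuringConstruction() {
        return Holder.INSTANCE.seenDuringConstruction;
    }

    // Returns the value of LOCAL_NAME after class initialization completes.
    static String observedAfterInit() {
        return Holder.LOCAL_NAME;
    }

    public static void main(String[] args) {
        System.out.println(observedDuringConstruction()); // null
        System.out.println(observedAfterInit());          // node-1
    }
}
```

Whether this exact pattern is what happens in Gossiper is an assumption; the
sketch only shows that a null static member is consistent with an
initialization-order problem.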

I'd start by getting the inner classes (the verb handlers) out of Gossiper and into their
own classes.  Then there is the race between Gossiper and MessageService.  You could let the
gossiper start before the MessageService starts listening, but have the GossipTimerTask check
to make sure the MessageService is listening before it sends out any gossip requests.  This
still isn't going to protect us in the case of a node getting restarted where other nodes
already know about it and are diligently trying to contact what they thought was a dead node.
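
The listening check suggested above could look roughly like the sketch below.
It is a self-contained illustration under stated assumptions:
FakeMessageService, isListening(), and GossipTask are hypothetical stand-ins,
not the real Cassandra classes or methods.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class GuardedGossipDemo {
    // Hypothetical stand-in for the messaging layer.
    static class FakeMessageService {
        private final AtomicBoolean listening = new AtomicBoolean(false);
        private final AtomicInteger sent = new AtomicInteger();

        void listen() { listening.set(true); }
        boolean isListening() { return listening.get(); }
        void send(String message) { sent.incrementAndGet(); }
        int sentCount() { return sent.get(); }
    }

    // Hypothetical periodic task: skips a round while the service is not
    // yet listening, instead of sending gossip that cannot be answered.
    static class GossipTask implements Runnable {
        private final FakeMessageService service;

        GossipTask(FakeMessageService service) { this.service = service; }

        @Override
        public void run() {
            if (!service.isListening()) {
                return; // not ready yet; try again on the next tick
            }
            service.send("gossip-digest-syn");
        }
    }

    // Runs the task some rounds before and after listen() is called and
    // returns how many messages were actually sent.
    static int messagesSent(int roundsBefore, int roundsAfter) {
        FakeMessageService service = new FakeMessageService();
        GossipTask task = new GossipTask(service);
        for (int i = 0; i < roundsBefore; i++) task.run();
        service.listen();
        for (int i = 0; i < roundsAfter; i++) task.run();
        return service.sentCount();
    }

    public static void main(String[] args) {
        System.out.println(messagesSent(3, 2)); // 2
    }
}
```

The guard only covers outbound gossip. As noted above, it does nothing for
inbound messages from peers that already know about a restarted node.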


> race with insufficiently constructed Gossiper
> ---------------------------------------------
>
>                 Key: CASSANDRA-1160
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1160
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Matthew F. Dennis
>            Priority: Minor
>             Fix For: 0.6.3
>
>         Attachments: 0001-cassandra-0.6-1160.patch, 0002-cassandra-0.6-1160.patch
>
>
> Gossiper.start needs to be integrated into the constructor.  Currently you can have threads
> using the gossiper instance before start finishes (or even starts?), resulting in tracebacks
> like this:
> ERROR [GMFD:1] 2010-06-02 10:45:49,878 CassandraDaemon.java (line 78) Fatal exception in thread Thread[GMFD:1,5,main]
> java.lang.AssertionError
> 	at org.apache.cassandra.net.Header.<init>(Header.java:56)
> 	at org.apache.cassandra.net.Header.<init>(Header.java:74)
> 	at org.apache.cassandra.net.Message.<init>(Message.java:58)
> 	at org.apache.cassandra.gms.Gossiper.makeGossipDigestAckMessage(Gossiper.java:294)
> 	at org.apache.cassandra.gms.Gossiper$GossipDigestSynVerbHandler.doVerb(Gossiper.java:935)
> 	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)
> ERROR [GMFD:2] 2010-06-02 10:45:49,880 CassandraDaemon.java (line 78) Fatal exception in thread Thread[GMFD:2,5,main]
> java.lang.AssertionError
> 	at org.apache.cassandra.net.Header.<init>(Header.java:56)
> 	at org.apache.cassandra.net.Header.<init>(Header.java:74)
> 	at org.apache.cassandra.net.Message.<init>(Message.java:58)
> 	at org.apache.cassandra.gms.Gossiper.makeGossipDigestAckMessage(Gossiper.java:294)
> 	at org.apache.cassandra.gms.Gossiper$GossipDigestSynVerbHandler.doVerb(Gossiper.java:935)
> 	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

