cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10089) NullPointerException in Gossip handleStateNormal
Date Mon, 19 Oct 2015 05:22:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962849#comment-14962849 ]

Stefania commented on CASSANDRA-10089:
--------------------------------------

I've rebased all 3 branches and started a new set of jobs to see if we can reproduce the 2.2
problem highlighted above. I spent a couple of hours trying to reproduce it locally but
could not. We need TRACE-level logging, at least for Gossiper.

I've attached the _multiple_repair_test_ log files that are available on Jenkins. Despite
having "debug" in their names, they unfortunately do not contain debug information. It looks
like node 1 and node 3 were at more or less the same stage of setting their Gossip tokens,
which they had just randomly generated, right after starting up. However, I could not deduce
why node 3 did not send its tokens to node 1; it is really difficult to say without Gossip
trace information. From code inspection this should never happen.
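
To make the failure mode concrete, here is a minimal sketch of how a token lookup can throw when an endpoint's status is known but its tokens have not arrived yet, which is one reading consistent with the stack trace quoted in this issue. This is not Cassandra's code: the class GossipTokenSketch, the AppState enum, the Value class, the STATES map and both lookup methods are hypothetical stand-ins chosen for illustration.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

public class GossipTokenSketch {
    /** Hypothetical stand-in for the gossip application states. */
    enum AppState { STATUS, TOKENS }

    /** Hypothetical stand-in for a value carried by gossip. */
    static final class Value {
        final String value;
        Value(String value) { this.value = value; }
    }

    /** endpoint -> (application state -> value); a toy model, not the real gossip state. */
    static final Map<String, Map<AppState, Value>> STATES = new HashMap<>();

    /** Unguarded lookup: throws NullPointerException if the endpoint or the state is absent. */
    static String unguarded(String endpoint, AppState state) {
        return STATES.get(endpoint).get(state).value;
    }

    /** Guarded lookup: returns null instead of throwing when anything is missing. */
    static String guarded(String endpoint, AppState state) {
        Map<AppState, Value> states = STATES.get(endpoint);
        if (states == null) {
            return null;
        }
        Value v = states.get(state);
        return v == null ? null : v.value;
    }

    public static void main(String[] args) {
        // Assumed scenario: node3's status has been seen, but its tokens have not.
        Map<AppState, Value> node3 = new EnumMap<>(AppState.class);
        node3.put(AppState.STATUS, new Value("NORMAL"));
        STATES.put("node3", node3);

        System.out.println("guarded lookup: " + guarded("node3", AppState.TOKENS));
        try {
            unguarded("node3", AppState.TOKENS);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }
    }
}
```

A guard like this would only hide the symptom; it does not explain why the tokens were missing in the first place, which is what the Gossip trace logs are needed for.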



> NullPointerException in Gossip handleStateNormal
> ------------------------------------------------
>
>                 Key: CASSANDRA-10089
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10089
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 2.1.x, 2.2.x, 3.0.x
>
>         Attachments: node1_debug.log, node2_debug.log, node3_debug.log
>
>
> Whilst comparing dtests for CASSANDRA-9970 I found [this failing dtest|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-9970-dtest/lastCompletedBuild/testReport/consistency_test/TestConsistency/short_read_test/] in 2.2:
> {code}
> Unexpected error in node1 node log: ['ERROR [GossipStage:1] 2015-08-14 15:39:57,873 CassandraDaemon.java:183 - Exception in thread Thread[GossipStage:1,5,main] java.lang.NullPointerException: null
> \tat org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1731) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1804) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1857) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.onChange(StorageService.java:1629) ~[main/:na]
> \tat org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2312) ~[main/:na]
> \tat org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1025) ~[main/:na]
> \tat org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1106) ~[main/:na]
> \tat org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na]
> \tat org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[main/:na]
> \tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_80]
> \tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_80]
> \tat java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_80]']
> {code}
> I wasn't able to find it on unpatched branches, but it is clearly not related to CASSANDRA-9970;
> if anything, it could have been a side effect of CASSANDRA-9871.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
