Date: Mon, 19 Oct 2015 05:22:05 +0000 (UTC)
From: "Stefania (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-10089) NullPointerException in Gossip handleStateNormal

    [ https://issues.apache.org/jira/browse/CASSANDRA-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962849#comment-14962849 ]

Stefania commented on CASSANDRA-10089:
--------------------------------------

I've rebased all 3 branches and started a new set of jobs to see if we can reproduce the 2.2 problem highlighted above. I spent a couple of hours trying to reproduce it locally but I could not. We need TRACE level, at least for Gossiper.

I've attached the _multiple_repair_test_ log files that are available on Jenkins. Despite having "debug" in their names, they unfortunately do not contain debug information. It looks like node 1 and node 3 were at more or less the same stage of setting their Gossip tokens, which they had just randomly generated, right at the beginning after starting up. I could not work out, however, why node 3 did not send its tokens to node 1; it's really difficult to say without Gossip trace information. From code inspection this should never happen.

> NullPointerException in Gossip handleStateNormal
> ------------------------------------------------
>
>                 Key: CASSANDRA-10089
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10089
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 2.1.x, 2.2.x, 3.0.x
>
>         Attachments: node1_debug.log, node2_debug.log, node3_debug.log
>
>
> Whilst comparing dtests for CASSANDRA-9970 I found [this failing dtest|http://cassci.datastax.com/view/Dev/view/blerer/job/blerer-9970-dtest/lastCompletedBuild/testReport/consistency_test/TestConsistency/short_read_test/] in 2.2:
> {code}
> Unexpected error in node1 node log: ['ERROR [GossipStage:1] 2015-08-14 15:39:57,873 CassandraDaemon.java:183 - Exception in thread Thread[GossipStage:1,5,main] java.lang.NullPointerException: null
>   \tat org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1731) ~[main/:na]
>   \tat org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1804) ~[main/:na]
>   \tat org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1857) ~[main/:na]
>   \tat org.apache.cassandra.service.StorageService.onChange(StorageService.java:1629) ~[main/:na]
>   \tat org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2312) ~[main/:na]
>   \tat org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1025) ~[main/:na]
>   \tat org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1106) ~[main/:na]
>   \tat org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) ~[main/:na]
>   \tat org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[main/:na]
>   \tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_80]
>   \tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_80]
>   \tat java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_80]']
> {code}
> I wasn't able to find it on unpatched branches, but it is clearly not related to CASSANDRA-9970; if anything, it could have been a side effect of CASSANDRA-9871.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)