cassandra-commits mailing list archives

From "Russell Alexander Spitzer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6485) NPE in calculateNaturalEndpoints
Date Mon, 16 Dec 2013 16:21:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13849275#comment-13849275 ]

Russell Alexander Spitzer commented on CASSANDRA-6485:
------------------------------------------------------

The patch worked in my test.

> NPE in calculateNaturalEndpoints
> --------------------------------
>
>                 Key: CASSANDRA-6485
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6485
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Russell Alexander Spitzer
>            Assignee: Jonathan Ellis
>             Fix For: 1.2.13, 2.0.4
>
>         Attachments: 6485.txt
>
>
> I was running a test where I added a new data center to an existing cluster. 
> Test outline:
> Start 25 Node DC1
> Keyspace Setup Replication 3
> Begin insert against DC1 Using Stress
> While the inserts are occurring
> Start up 25 Node DC2
> Alter Keyspace to include Replication in 2nd DC
> Run rebuild on DC2
> Wait for stress to finish
> Run repair on Cluster
> ... Some other operations
> Although there are no issues with smaller clusters or clusters without vnodes, larger
> setups with vnodes seem to consistently see the following exception in the logs, along with
> a failed write operation for each exception. This usually happens between 1 and 8 times
> during an experiment.
> The exceptions/failures occur when DC2 is brought online but *before* any alteration
> of the keyspace. All of the exceptions happen on DC1 nodes. One of the exceptions occurred
> on a seed node, though this doesn't seem to be the case most of the time.
> While the test was running, nodetool was run every second to get cluster status. At no
> time did any nodes report themselves as down.
> {code}
> system_logs-107.21.186.208/system.log-ERROR [Thrift:1] 2013-12-13 06:19:52,647 CustomTThreadPoolServer.java (line 217) Error occurred during processing of message.
> system_logs-107.21.186.208/system.log:java.lang.NullPointerException
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:128)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2624)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:190)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:866)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:849)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:749)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3690)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3678)
> system_logs-107.21.186.208/system.log-	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> system_logs-107.21.186.208/system.log-	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> system_logs-107.21.186.208/system.log-	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:199)
> system_logs-107.21.186.208/system.log-	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> system_logs-107.21.186.208/system.log-	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> system_logs-107.21.186.208/system.log-	at java.lang.Thread.run(Thread.java:724)
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
