cassandra-commits mailing list archives

From "Ryan Daum (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-713) Stacktrace when node taken offline
Date Thu, 21 Jan 2010 01:48:54 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803120#action_12803120 ]

Ryan Daum commented on CASSANDRA-713:
-------------------------------------

Replication strategy is RackUnaware, replication factor 3, RandomPartitioner, 6 nodes.

I did restart the nodes at least partially one by one, so that may be part of the issue.

Today I needed to remove a bunch of data we are no longer using, so I used that
opportunity to bring all nodes down, delete the largest keyspace and all commitlogs (didn't
need the data in them), and bring them back up in an orderly fashion. This time the errant
node bootstrapped correctly, so I assume the problem is related to gossip holding onto a
memory of the node.

Still, this seems like a bug -- I just wish I could give you a better recipe on how to reproduce
it.

> Stacktrace when node taken offline
> ----------------------------------
>
>                 Key: CASSANDRA-713
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-713
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Ryan Daum
>            Assignee: Jaakko Laine
>             Fix For: 0.5
>
>
> I took a node offline last week and then attempted to re-bootstrap its token range with
a new Cassandra install on the same IP. I made gossip forget about the node by restarting
all other instances, then brought up the new node. It said it was bootstrapping, but it never
finished bootstrapping after several days. The node never showed up in the ring, but when
I take it offline, I get the following exception continually from all other nodes in the cluster:
> ERROR [pool-1-thread-8] 2010-01-18 21:01:32,405 Cassandra.java (line 1096) Internal error
processing batch_insert
> java.lang.NullPointerException
>         at org.apache.cassandra.dht.BigIntegerToken.compareTo(BigIntegerToken.java:38)
>         at org.apache.cassandra.dht.BigIntegerToken.compareTo(BigIntegerToken.java:23)
>         at java.util.Collections.indexedBinarySearch(Collections.java:215)
>         at java.util.Collections.binarySearch(Collections.java:201)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(AbstractReplicationStrategy.java:130)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedEndpoints(AbstractReplicationStrategy.java:76)
>         at org.apache.cassandra.service.StorageService.getHintedEndpointMap(StorageService.java:1183)
>         at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:169)
>         at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:466)
>         at org.apache.cassandra.service.CassandraServer.batch_insert(CassandraServer.java:445)
>         at org.apache.cassandra.service.Cassandra$Processor$batch_insert.process(Cassandra.java:1088)
>         at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:817)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> In addition, I get frequent UnavailableExceptions on the other nodes.
> I cannot remove the token range for this node because it never officially joined the
ring.
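
The trace above shows the NullPointerException surfacing inside a token's compareTo while
Collections.binarySearch walks a sorted token list. One plausible reading, which the report
does not confirm, is that the search key (or a token in the list) was null for the node that
never joined the ring. The sketch below reproduces only that general failure mode. The class
NullTokenSearchSketch, its nested Token type, and the helpers sampleRing and searchThrowsNpe
are hypothetical stand-ins and are not Cassandra's BigIntegerToken or
AbstractReplicationStrategy code.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NullTokenSearchSketch {

    // Hypothetical stand-in for a ring token; NOT Cassandra's BigIntegerToken.
    static final class Token implements Comparable<Token> {
        final BigInteger value;

        Token(BigInteger value) {
            this.value = value;
        }

        @Override
        public int compareTo(Token other) {
            // Dereferences 'other' without a null check, so a null argument
            // raises NullPointerException from inside compareTo.
            return value.compareTo(other.value);
        }
    }

    // Builds a small, already sorted list of tokens (illustrative values only).
    static List<Token> sampleRing() {
        List<Token> ring = new ArrayList<>();
        ring.add(new Token(BigInteger.valueOf(10)));
        ring.add(new Token(BigInteger.valueOf(20)));
        ring.add(new Token(BigInteger.valueOf(30)));
        return ring;
    }

    // Returns true if the binary search fails with a NullPointerException.
    static boolean searchThrowsNpe(List<Token> sortedTokens, Token key) {
        try {
            Collections.binarySearch(sortedTokens, key);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Token> ring = sampleRing();
        // Searching for a token that is present in the list completes normally.
        System.out.println("known token throws NPE: " + searchThrowsNpe(ring, ring.get(1)));
        // A null key, standing in for a node with no token, fails in compareTo.
        System.out.println("null token throws NPE:  " + searchThrowsNpe(ring, null));
    }
}
```

In this sketch the exception is thrown on the first comparison, so every lookup with the
missing token fails, which would be consistent with the error repeating on each batch_insert.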

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

