cassandra-commits mailing list archives

From "Jeff Lerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-713) Stacktrace when node taken offline
Date Wed, 23 Jun 2010 23:56:50 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881966#action_12881966 ]

Jeff Lerman commented on CASSANDRA-713:
---------------------------------------

Hi all,

I just had this happen in Cassandra 0.6.1. We're only running two nodes as of now, and our
second one was barely accepting any requests; for the most part it was only being replicated to.
The load went up to 9 consistently, so we investigated and noticed that its "Load" in nodetool
was 2x as large as our other instance's. I cleared out the data and commitlogs, set
autobootstrap to true, and put it back in.

This is where our case gets funky... we noticed the other instance's load going up a lot and
saw that the one I had just re-added was not doing much. After a while of contemplating, I took
down the second one again. Minutes later I found an open case about anticompaction happening
before full bootstrapping occurs. I found the data/stream dir on the working instance and
saw that it was complete... but I had already taken down the second one! So I deleted the
stream dir to save space and figured I'd start the process again tomorrow.

A few hours later I am getting these Internal errors on writes:


ERROR [pool-1-thread-287117] 2010-06-23 19:16:51,754 Cassandra.java (line 1492) Internal error processing insert
java.lang.NullPointerException

Cassandra is still running, so I could send it a SIGQUIT to get a thread dump if anyone is interested in this mystery.

Thanks,

Jeff

> Stacktrace when node taken offline
> ----------------------------------
>
>                 Key: CASSANDRA-713
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-713
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Ryan Daum
>            Assignee: Jaakko Laine
>             Fix For: 0.5
>
>
> I took a node offline last week and then attempted to re-bootstrap its token range with
> a new cassandra install on the same IP. I made gossip forget about the node by restarting
> all other instances, then brought up the new node. It said it was bootstrapping, but it never
> finished bootstrapping after several days. The node never showed up in the ring, but when
> I take it offline, I get the following exception continually from all other nodes in the cluster:
> ERROR [pool-1-thread-8] 2010-01-18 21:01:32,405 Cassandra.java (line 1096) Internal error processing batch_insert
> java.lang.NullPointerException
>         at org.apache.cassandra.dht.BigIntegerToken.compareTo(BigIntegerToken.java:38)
>         at org.apache.cassandra.dht.BigIntegerToken.compareTo(BigIntegerToken.java:23)
>         at java.util.Collections.indexedBinarySearch(Collections.java:215)
>         at java.util.Collections.binarySearch(Collections.java:201)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedMapForEndpoints(AbstractReplicationStrategy.java:130)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getHintedEndpoints(AbstractReplicationStrategy.java:76)
>         at org.apache.cassandra.service.StorageService.getHintedEndpointMap(StorageService.java:1183)
>         at org.apache.cassandra.service.StorageProxy.insertBlocking(StorageProxy.java:169)
>         at org.apache.cassandra.service.CassandraServer.doInsert(CassandraServer.java:466)
>         at org.apache.cassandra.service.CassandraServer.batch_insert(CassandraServer.java:445)
>         at org.apache.cassandra.service.Cassandra$Processor$batch_insert.process(Cassandra.java:1088)
>         at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:817)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> In addition, I get frequent UnavailableExceptions on the other nodes.
> I cannot remove the token range for this node because it never officially joined the ring.
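
For readers following the quoted trace: the NullPointerException surfaces inside a
compareTo call made by Collections.binarySearch. The sketch below shows one plausible
way that combination can fail, namely searching a sorted token list with a null key.
It is only an illustration under stated assumptions: SketchToken, sampleRing, and
searchThrowsNpe are hypothetical names, not Cassandra's classes, and the report above
does not confirm that a null key was the actual cause.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NullTokenSearchSketch {

    /** Hypothetical stand-in for a ring token; NOT Cassandra's BigIntegerToken. */
    static final class SketchToken implements Comparable<SketchToken> {
        final BigInteger value;

        SketchToken(BigInteger value) {
            this.value = value;
        }

        @Override
        public int compareTo(SketchToken other) {
            // Throws NullPointerException if other is null (other.value)
            // or if either wrapped value is null.
            return value.compareTo(other.value);
        }
    }

    /** A small sorted list standing in for the tokens of a ring (assumed values). */
    static List<SketchToken> sampleRing() {
        return Arrays.asList(
                new SketchToken(BigInteger.valueOf(10)),
                new SketchToken(BigInteger.valueOf(20)),
                new SketchToken(BigInteger.valueOf(30)));
    }

    /** Returns true if the binary search fails with a NullPointerException. */
    static boolean searchThrowsNpe(List<SketchToken> ring, SketchToken key) {
        try {
            // binarySearch calls element.compareTo(key) on list elements.
            Collections.binarySearch(ring, key);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<SketchToken> ring = sampleRing();
        SketchToken known = new SketchToken(BigInteger.valueOf(20));
        System.out.println("known token throws NPE: " + searchThrowsNpe(ring, known));
        // Assumed scenario: a token lookup returned null for a node that never joined.
        System.out.println("null token throws NPE: " + searchThrowsNpe(ring, null));
    }
}
```

If something like this is what happened, a null check before the search (or not
searching for endpoints without a known token) would avoid the exception, though
whether that matches the real code path is an open question here.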

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

