incubator-cassandra-user mailing list archives

From Jean-Armel Luce <jaluc...@gmail.com>
Subject Re: Failed migration from 1.1.6 to 1.2.2
Date Thu, 14 Mar 2013 11:57:26 GMT
Hi Alain,

Maybe it is due to https://issues.apache.org/jira/browse/CASSANDRA-5299

A patch is provided with this ticket.

Regards.

Jean Armel

2013/3/14 Alain RODRIGUEZ <arodrime@gmail.com>

> Hi
>
> We just tried to migrate our production cluster from C* 1.1.6 to 1.2.2.
>
> This has been a disaster. I just switched one node to 1.2.2, updated its
> configuration (cassandra.yaml / cassandra-env.sh) and restarted it.
>
> This resulted in errors on all 5 remaining 1.1.6 nodes:
>
> ERROR [RequestResponseStage:2] 2013-03-14 09:53:25,750 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[RequestResponseStage:2,5,main]
> java.io.IOError: java.io.EOFException
>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
>         at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:155)
>         at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:100)
>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:81)
>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)
>         ... 6 more
>
> I got this error many times, and the entire cluster was unreachable by our 4
> clients (phpCassa, Hector, Cassie, Helenus).
>
> I decommissioned the 1.2.2 node to get our cluster answering queries. It
> worked.
>
> Then I tried to replace this node with a new C* 1.1.6 one, using the same
> token as the decommissioned node. The node joined the ring and switched to
> normal status before receiving any data.
>
> On all the other nodes I got:
>
> ERROR [MutationStage:8] 2013-03-14 10:21:01,288 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[MutationStage:8,5,main]
> java.lang.AssertionError
>         at org.apache.cassandra.locator.TokenMetadata.getToken(TokenMetadata.java:304)
>         at org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:371)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
> So I decommissioned this new 1.1.6 node, and we are now running with 5
> servers, not balanced along the ring, with no possibility of adding
> nodes or upgrading the C* version.
>
> We are quite desperate over here.
>
> If anyone has any idea what could have happened and how to stabilize the
> cluster, it would be much appreciated.
>
> It's quite an emergency since we can't add nodes and are under heavy load.
>
>
