incubator-cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Failed migration from 1.1.6 to 1.2.2
Date Thu, 14 Mar 2013 12:03:04 GMT
Thanks for the pointer, but I don't think this is the source of our problem,
since we use a single data center and Ec2Snitch.



2013/3/14 Jean-Armel Luce <jaluce06@gmail.com>

> Hi Alain,
>
> Maybe it is due to https://issues.apache.org/jira/browse/CASSANDRA-5299
>
> A patch is provided with this ticket.
>
> Regards.
>
> Jean Armel
>
>
> 2013/3/14 Alain RODRIGUEZ <arodrime@gmail.com>
>
>> Hi
>>
>> We just tried to migrate our production cluster from C* 1.1.6 to 1.2.2.
>>
>> This has been a disaster. I just switched one node to 1.2.2, updated its
>> configuration (cassandra.yaml / cassandra-env.sh) and restarted it.
>>
>> This resulted in errors on all 5 remaining 1.1.6 nodes:
>>
>> ERROR [RequestResponseStage:2] 2013-03-14 09:53:25,750 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[RequestResponseStage:2,5,main]
>> java.io.IOError: java.io.EOFException
>>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
>>         at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:155)
>>         at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
>>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.EOFException
>>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:100)
>>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:81)
>>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)
>>         ... 6 more
>>
>> This error occurred many times, and the entire cluster became unreachable
>> from our 4 clients (phpCassa, Hector, Cassie, Helenus).
>>
>> I decommissioned the 1.2.2 node to get our cluster answering queries. It
>> worked.
>>
>> Then I tried to replace this node with a new C* 1.1.6 one, using the same
>> token as the decommissioned node. The new node joined the ring and
>> switched to normal status before receiving any data.
>>
>> On all the other nodes I got:
>>
>> ERROR [MutationStage:8] 2013-03-14 10:21:01,288 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[MutationStage:8,5,main]
>> java.lang.AssertionError
>>         at org.apache.cassandra.locator.TokenMetadata.getToken(TokenMetadata.java:304)
>>         at org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:371)
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:662)
>>
>> So I decommissioned this new 1.1.6 node, and we are now running with 5
>> servers that are not balanced along the ring, unable to add nodes or
>> upgrade the C* version.
>>
>> We are quite desperate over here.
>>
>> If anyone has any idea of what could have happened and how to stabilize
>> the cluster, it would be much appreciated.
>>
>> It's quite an emergency since we can't add nodes and are under heavy load.
>>
>>
>
