From: Alain RODRIGUEZ
Date: Thu, 14 Mar 2013 16:09:54 +0100
Subject: Re: Failed migration from 1.1.6 to 1.2.2
To: user@cassandra.apache.org

@Dean

"It is expensive?"

I was talking about a full-time QA environment equal or similar to a prod env. I didn't think about using a temporary QA environment, and you are right, I should have.

"And sorry for not providing the detail on the rolling restart not working... my bad"

No problem, my point was just to remind you that other members of the community can use this kind of information.

"but also I think people on the list assume you are going to do some basic testing if at least to get comfortable with the process"

I did, but on a local machine. That's the hardware I had, so I just tested it on one machine and made sure the clients were compatible... But I wasn't aware of ccm. I will use it next time for sure :-).

@Michal

Thanks for the pointer to ccm.

"on my workstation with a < 0.01% sample of production"

Is there a simple way of getting that?

@all

Any idea why my node is not restarting now?

Same result with or without -Dcassandra.load_ring_state=false.
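For reference, a minimal sketch of how that flag is usually passed, assuming the stock cassandra-env.sh, which builds the JVM arguments by appending to a JVM_OPTS variable. Where exactly the line goes in the file is an assumption; adapt it to your own install.

```shell
# Assumption: this line sits next to the other JVM_OPTS lines in cassandra-env.sh.
# It asks the node to ignore its saved ring state at startup and relearn it via gossip.
JVM_OPTS="${JVM_OPTS:-} -Dcassandra.load_ring_state=false"
```

The option only takes effect on the next start of the node, and it is usually removed again once the node is back up so that later restarts reuse the saved ring state.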
Last log lines before the C* process ends:

INFO [SSTableBatchOpen:1] 2013-03-14 14:36:09,813 SSTableReader.java (line 169) Opening /raid0/cassandra/data/system/LocationInfo/system-LocationInfo-hf-70 (621 bytes)
INFO [SSTableBatchOpen:1] 2013-03-14 14:36:09,819 SSTableReader.java (line 169) Opening /raid0/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hf-465 (66 bytes)

Should I rm /raid0/cassandra/data/system/HintsColumnFamily/* ?

2013/3/14 Hiller, Dean

> It is expensive? ... personally, sorry, I don't really buy that since I spent
> less than 400 bucks on 100 servers at Amazon to play with for 1 or 2 hours,
> or maybe it was 8 hours... I can't remember, AND you can use small instances
> for a test like this. You can write EC2 scripts to start up a QA system for
> your needs very easily. Now, if your company is not allowing Amazon, that
> is a different story and it is expensive. We have the same issue as
> you... lack of time, though we did get some VMs and put roughly 10MB in each
> to test out an upgrade.
>
> So a basic QA test, equipment-wise, would cost only about 50 bucks and be
> well worth the testing... the time effort would cost a bit more, but usually
> companies are already paying the salaries and that was already budgeted for.
>
> And sorry for not providing the detail on the rolling restart not
> working... my bad, but also I think people on the list assume you are going
> to do some basic testing, if at least to get comfortable with the process.
>
> Dean
>
> From: Alain RODRIGUEZ
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Thursday, March 14, 2013 7:41 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Failed migration from 1.1.6 to 1.2.2
>
> @Aaron
>
> "You can try to reset the cluster ring state by doing a rolling restart
> passing -Dcassandra.load_ring_state=false as a JVM param in
> cassandra-env.sh"
>
> Now my node can't restart properly.
> It stops during startup, and the last logged message is:
>
> INFO [SSTableBatchOpen:1] 2013-03-14 14:36:09,813 SSTableReader.java (line 169) Opening /raid0/cassandra/data/system/LocationInfo/system-LocationInfo-hf-70 (621 bytes)
> INFO [SSTableBatchOpen:1] 2013-03-14 14:36:09,819 SSTableReader.java (line 169) Opening /raid0/cassandra/data/system/HintsColumnFamily/system-HintsColumnFamily-hf-465 (66 bytes)
>
> Should I rm /raid0/cassandra/data/system/HintsColumnFamily/* ?
>
> @Dean
>
> "You should really be testing this stuff in QA"
>
> We have no such environment. It is expensive; we can't afford this for now.
>
> "We had the exact same issue from 1.1.4 to 1.2.2."
>
> Well, I think you could have warned us. I thought it was safe to upgrade
> because I saw that you and 2 more people did it with no major issues...
>
>
> 2013/3/14 Hiller, Dean
>
> You should really be testing this stuff in QA. We had the exact same
> issue from 1.1.4 to 1.2.2. In QA, we decided we could take an outage, so we
> tested taking every node down, upgrading every node and bringing the
> cluster back online. This worked perfectly, so we rolled it into
> production... production took 45 minutes to start for us (especially one node
> under pressure)... that was only initially though... now everything seems fine.
> Another option in QA was that we could have tested upgrading to 1.1.9 first,
> then to 1.2.2. I have no idea if it will work, but I am sure they test
> closer release scenarios on upgrading more so than the big-jump releases.
>
> Aaron, it would be really neat if some releases were tagged with LT (long
> term) or something, so upgrades are tested from LT to LT releases and we know
> we can always safely first upgrade to an LT release and then upgrade to
> another LT release from that one... just a thought. This would also get more
> people using/testing the same upgrade paths, which would help everyone.
>
> Dean
>
> From: Alain RODRIGUEZ
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Thursday, March 14, 2013 5:31 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Failed migration from 1.1.6 to 1.2.2
>
> We have it set to 0.0.0.0 but anyway, as I said before, I don't think our
> problem comes from this bug.
>
>
> 2013/3/14 Michal Michalski
>
> It will happen if your rpc_address is set to 0.0.0.0.
>
> Oops, that's not what I meant ;-)
> It will happen if your rpc_address is set to an IP that is not defined in
> your cluster's config (e.g. in cassandra-topology.properties for
> PropertyFileSnitch).
>
>
> M.
>
>
> M.
>
> On 14.03.2013 at 13:03, Alain RODRIGUEZ wrote:
> Thanks for this pointer, but I don't think this is the source of our
> problem, since we use 1 data center and Ec2Snitch.
>
>
>
> 2013/3/14 Jean-Armel Luce
>
> Hi Alain,
>
> Maybe it is due to https://issues.apache.org/jira/browse/CASSANDRA-5299
>
> A patch is provided with this ticket.
>
> Regards.
>
> Jean Armel
>
>
> 2013/3/14 Alain RODRIGUEZ
>
> Hi
>
> We just tried to migrate our production cluster from C* 1.1.6 to 1.2.2.
>
> This has been a disaster. I just switched one node to 1.2.2, updated its
> configuration (cassandra.yaml / cassandra-env.sh) and restarted it.
>
> It resulted in errors on all the 5 remaining 1.1.6 nodes:
>
> ERROR [RequestResponseStage:2] 2013-03-14 09:53:25,750 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[RequestResponseStage:2,5,main]
> java.io.IOError: java.io.EOFException
>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
>         at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:155)
>         at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:100)
>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:81)
>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)
>         ... 6 more
>
> I got this many times, and my entire cluster wasn't reachable by any of our
> 4 clients (phpCassa, Hector, Cassie, Helenus).
>
> I decommissioned the 1.2.2 node to get our cluster answering queries again.
> It worked.
>
> Then I tried to replace this node with a new C* 1.1.6 one with the same
> token as the decommissioned node. The node joined the ring and switched to
> normal status before getting any data.
>
> On all the other nodes I had:
>
> ERROR [MutationStage:8] 2013-03-14 10:21:01,288 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[MutationStage:8,5,main]
> java.lang.AssertionError
>         at org.apache.cassandra.locator.TokenMetadata.getToken(TokenMetadata.java:304)
>         at org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:371)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
> So I decommissioned this new 1.1.6 node, and we are now running with 5
> servers, not balanced along the ring, without any possibility of adding
> nodes or upgrading the C* version.
>
> We are quite desperate over here.
>
> If someone has any idea of what could have happened and how to stabilize
> the cluster, it would be very much appreciated.
>
> It's quite an emergency since we can't add nodes and are under heavy
> load.