cassandra-user mailing list archives

From Héctor Izquierdo Seliva <izquie...@strands.com>
Subject Re: Repair doesn't work after upgrading to 0.8.1
Date Tue, 05 Jul 2011 08:00:04 GMT
Hi All, sorry for taking so long to answer. I was away from the
internet.

> >> Héctor, when you say "I have upgraded all my cluster to 0.8.1", from
> >> which version was that: 0.7.something or 0.8.0?

0.7.6-2 to 0.8.1

> > This is the same behavior I reported in 2768 as Aaron referenced ...
> > What was suggested for us was to do the following:
> >
> > - Shut down the entire ring
> > - When you bring up each node, do a nodetool repair
> >

That's exactly what I ended up doing. Repair now works. I tried to do a
rolling restart with 2818 applied, but it did not work.
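
For anyone who has to do the same, the full-ring procedure can be sketched
as a dry-run plan. This is only an illustration: the host names and the
stop/start placeholders are made up (they depend on how Cassandra is
installed), and nothing is executed. The only real command in it is
`nodetool -h <host> repair`.

```python
def full_restart_plan(hosts, stop_cmd="<stop cassandra>", start_cmd="<start cassandra>"):
    """Return the ordered (host, command) steps; nothing is executed.

    stop_cmd and start_cmd are placeholders, not real commands.
    """
    plan = []
    # 1. Shut down the entire ring before starting anything again.
    for host in hosts:
        plan.append((host, stop_cmd))
    # 2. Bring the nodes back one at a time, repairing each as it comes up.
    for host in hosts:
        plan.append((host, start_cmd))
        plan.append((host, f"nodetool -h {host} repair"))
    return plan


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for host, cmd in full_restart_plan(["node1", "node2", "node3"]):
        print(f"{host}: {cmd}")
```

The point of the ordering is that every node is stopped before any node is
started again, which is what distinguishes this from the rolling restart
that did not work for me.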

> > However, in the issue reported, it was unable to be reproduced ... I'd
> > be curious to know how Hector's keyspace is defined.  Ours at the time
> > was RF=3 and using Ec2 snitch...

Nothing special: default snitch, RF=3.

I think this should be prioritized, as having to restart the whole
cluster is a bit extreme. We don't have separate DCs, so I had to
incur downtime, which costs money and a little bit of grief.


On Fri, 01-07-2011 at 10:16 +0200, Sylvain Lebresne wrote:
> To make it clear what the problem is, this is not a repair problem. This is
> a gossip problem. Gossip is reporting that the remote node is a 0.7 node
> and repair is just saying "I cannot use that node because repair has changed
> and the 0.7 node will not know how to answer me correctly", which is the
> correct behavior if the node happens to be a 0.7 node.
> 
> Hence, I'm kind of baffled that dropping a keyspace and recreating it fixed
> anything. Unless as part of "removed the keyspace", you've deleted the
> system tables, in which case that could have triggered something.
> 
> --
> Sylvain
> 
> On Fri, Jul 1, 2011 at 9:33 AM, Sasha Dolgy <sdolgy@gmail.com> wrote:
> > This is the same behavior I reported in 2768 as Aaron referenced ...
> > What was suggested for us was to do the following:
> >
> > - Shut down the entire ring
> > - When you bring up each node, do a nodetool repair
> >
> > That didn't immediately resolve the problems.  In the end, I backed up
> > all the data, removed the keyspace and created a new one.  That seemed
> > to have solved our problems.  That was from 0.7.6-2 to 0.8.0
> >
> > However, in the issue reported, it was unable to be reproduced ... I'd
> > be curious to know how Hector's keyspace is defined.  Ours at the time
> > was RF=3 and using Ec2 snitch...
> >
> > -sd
> >
> > On Fri, Jul 1, 2011 at 9:22 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
> >> Héctor, when you say "I have upgraded all my cluster to 0.8.1", from
> >> which version was
> >> that: 0.7.something or 0.8.0 ?
> >>
> >> If this was 0.8.0, did you run successful repair on 0.8.0 previous to
> >> the upgrade ?
> >


