incubator-cassandra-user mailing list archives

From Charles Brophy <cbro...@zulily.com>
Subject Re: Invalid Counter Shard errors?
Date Fri, 07 Sep 2012 16:40:01 GMT
I have a very reliable repro case on our cluster involving nodetool repair.
I posted a summary in a comment on the issue. Let me know if more details
are needed.

Charles

On Fri, Sep 7, 2012 at 8:35 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:

> > Is there a way to fix this error ? What is its impact on my data ?
>
> The fact that the message shows up means that Cassandra has attempted to
> "repair" the problem, so there isn't much to do. However, the fact that
> you get the messages in the first place means that there is a bug
> somewhere that generates them.
> Now, as Peter said, we don't know what bug is generating this
> problem.
>
> > What is its impact on my data ?
>
> The problem is that, as Peter said, we actually don't know what is
> causing that problem. What the message says, though, is that two
> different values have been found for a given counter (it's two
> different values for a sub-part of the counter, but that's a technical
> detail). What the code does to "repair" in that case is to pick
> the higher of the two values it has. But honestly that choice is
> arbitrary: there's a 50/50 chance that it will pick the right value.
>
> The main problem is that I have no clue how to reproduce this easily,
> which makes it really hard to track down. If someone finds a way to
> reproduce it, please do share by all means (typically on
> https://issues.apache.org/jira/browse/CASSANDRA-4417). What
> I can suggest is that if you have a log with multiple instances of
> said log message, you attach it to the ticket. I can have a look to
> see if there is some pattern between the different occurrences that
> suggests a reason why this happens. But to be honest I have some doubts
> that it will help much short of having a way to reproduce.
>
> I will also note that we did fix a bug that was affecting counters
> in 1.1.3 (https://issues.apache.org/jira/browse/CASSANDRA-4436). I
> don't really think this could be the cause of what you are seeing, but
> there is a slim chance that I'm wrong on that. So it's probably worth
> upgrading to be sure.
>
> --
> Sylvain
>
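
For readers following the thread, the "repair" behavior described in the quoted message (two different values found for the same sub-part of a counter, and the higher one kept) can be sketched roughly as below. This is only an illustration under stated assumptions, not Cassandra's actual code: the function name, the `shard_id` parameter, and the log wording are all hypothetical.

```python
import logging

logger = logging.getLogger("counter_sketch")  # hypothetical logger name


def reconcile_conflicting_values(shard_id: str, first: int, second: int) -> int:
    """Pick one value when two copies of the same counter sub-part disagree.

    Hypothetical helper: mirrors the rule described in the thread
    (keep the higher of the two values), nothing more.
    """
    if first == second:
        # No conflict, so nothing to report.
        return first

    chosen = max(first, second)
    # The real message text differs; this wording is an assumption.
    logger.error(
        "conflicting values for counter sub-part %s: %d vs %d; keeping %d",
        shard_id, first, second, chosen,
    )
    return chosen


if __name__ == "__main__":
    logging.basicConfig(level=logging.ERROR)
    # If the lower value was in fact the correct one, the counter now
    # over-counts, which is why the choice is described as arbitrary.
    print(reconcile_conflicting_values("example-shard", 3, 5))
```

The rule is deterministic, so replicas that see the same two values keep the same one, but as the quoted message points out, nothing guarantees the value kept is the correct one.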
